Boris Hejblum
@borishejblum.science
INSERM researcher in biostatistics & #rstats enthusiast - views are my own
https://borishejblum.science
Reposted by Boris Hejblum
I wrote about what it means to keep writing when more language feels like the last thing we need--as a computer scientist, but also as a writer.
Softly, effectively, in the age of AI
On dolphins and disobedience
open.substack.com
February 5, 2026 at 12:39 AM
Reposted by Boris Hejblum
"Ten simple rules for teaching data science": arxiv.org/abs/2602.02874

A new preprint by @minecr.bsky.social and myself. We'd love any feedback!
Ten simple rules for teaching data science
Teaching data science presents unique challenges and opportunities that cannot be fully addressed by simply borrowing pedagogical strategies from its parent disciplines of statistics and computer scie...
arxiv.org
February 4, 2026 at 4:39 PM
Reposted by Boris Hejblum
I just finished a three-year term as an editor at an international relations journal. I began at the start of the LLM era but ended right in the middle of it. Our volume of submissions tripled and our desk reject rate rose to 75%. I have some thoughts.
open.substack.com/pub/hegemon/...
The Age of Academic Slop is Upon Us
what happens when AI automates "normal science"?
open.substack.com
January 13, 2026 at 3:38 PM
Reposted by Boris Hejblum
Want to get the data out of a PDF figure? As in, the actual data – not a rough trace-along-the-lines version?

I made an app you might like: adamkucharski.github.io/pdf2plot/

It all started a few years ago... 🧵
January 13, 2026 at 9:12 PM
Reposted by Boris Hejblum
Introducing bluffbench, a new tool to evaluate how well LLMs actually see data plots.

When we trick LLMs with secret #RStats transformations, they can miss the visual contradiction.

bluffbench helps us measure this "blind spot" in AI coding agents. Learn more: posit.co/blog/introdu...
When plotting, LLMs see what they expect to see - Posit
Data science agents need to accurately read plots even when the content contradicts their expectations. Our testing shows today's LLMs still struggle here.
posit.co
November 19, 2025 at 4:55 PM
Reposted by Boris Hejblum
Will definitely include this example in my next talk on how to name files!

✅ Full marks for "make it easy to guess what the heck something is, based on its name".
Genuinely delighted to download a PhD thesis from a university repository where the author has neglected to remove the words "BITCH THIS IS YOUR THESIS" from the filename.
October 31, 2025 at 8:46 PM
Reposted by Boris Hejblum
I loved discussing "Positron for RStudio Users: A Gentle Introduction" with @simisani.bsky.social & R-Ladies Gaborone!

Check out the recording and materials:

📹 www.youtube.com/watch?v=2fOQ...
📝 ivelasq.rbind.io/talk/positro...

I hope this intro is better than the one between my three cats 😹
October 27, 2025 at 10:20 PM
Reposted by Boris Hejblum
"What Makes a Good Data Visualization?" 🤔

Check @christinezhang.bsky.social's 🤩 excellent 💯 guest lecture for 140.776 Statistical Computing at @johnshopkinssph.bsky.social

📽️: youtu.be/SeLucCb05Dk

Slides: speakerdeck.com/lcolladotor/...

#RStats @jhubiostat.bsky.social @lieberinstitute.bsky.social
[2025-10-16] What Makes a Good Data Visualization?
YouTube video by Leonardo Collado Torres
youtu.be
October 24, 2025 at 7:22 PM
Reposted by Boris Hejblum
If you haven't got a website and want to give #quarto a go, the first video in my "Quarto Websites" series with @emilhvitfeldt.bsky.social could be a good place to start:
www.youtube.com/watch?v=l7r2...
More resources in 🧵
Quarto Websites 1: Build your homepage | Charlotte Wickham & Emil Hvitfeldt | Posit
YouTube video by Posit PBC
www.youtube.com
October 20, 2025 at 9:51 PM
Reposted by Boris Hejblum
The best I've managed is to use citations as usual, then edit a CSL file to make the style you would normally see in the "References" section, the style you get inline.

Code example: github.com/cwickham/ref...
Preview: cwickham.github.io/references/
October 23, 2025 at 3:28 AM
Reposted by Boris Hejblum
We need to have a conversation about random seeds. Don't use 42.
blog.genesmindsmachines.com/p/if-your-ra...
If your random seed is 42 I will come to your office and set your computer on fire🔥
Figuratively. More likely you'll get a stern talking to.
blog.genesmindsmachines.com
October 22, 2025 at 12:49 PM
Reposted by Boris Hejblum
The linked course is an incredibly well-written, clear explanation of how LLMs work and outlines really thoughtfully what they can, and can't, do. Recommended read for absolutely everyone out there. #MicroSky #MicrobiomeSky 💻🧬
So why do I let AI in my house? Don’t I care about the environmental consequences? What about the theft of intellectual property? Don’t I understand that these are bullshit machines?

Yes, yes, yes, and yes.

It’s what I do all day long.
Modern-Day Oracles or Bullshit Machines: Introduction
A free online humanities course about how to learn and work and thrive in an AI world.
thebullshitmachines.com
October 19, 2025 at 9:01 AM
Reposted by Boris Hejblum
can't wait to do my vibe research and vibe writing and vibe analysis and eventually get vibe promoted
October 1, 2025 at 5:00 PM
Reposted by Boris Hejblum
🎨 Theming got a huge overhaul with the latest #ggplot2 release. In honour of that @teunbrand.bsky.social has written a comprehensive deep-dive into styling your plots, covering both old and new functionality. Grab a coffee and dive in!

#rstats
ggplot2 styling
This post discusses one function in ggplot2: `theme()`. Find out about the glamour of graphics in this deep-dive article.
www.tidyverse.org
October 1, 2025 at 8:10 AM
Reposted by Boris Hejblum
{tinytable} 0.14.0 for #RStats makes it super easy to draw tables in html, tex, docx, typ, md & png.

There are only a few functions to learn, but don't be fooled! Small 📦s can still be powerful.

Check out the new gallery page for fun case studies.

vincentarelbundock.github.io/tinytable/vi...
September 29, 2025 at 12:44 PM
Reposted by Boris Hejblum
"The real reason for all this cleaning work is that the signal-to-noise ratio in raw data is too poor for the purpose we intend. We need to improve data quality to amp the signal for our tools to find."

3/9
September 28, 2025 at 4:59 AM
Reposted by Boris Hejblum
I'm excited to announce a new resource about making slides with quarto and revealjs. This book is the combination of all the work I have done in this area, reordered and polished up.

There isn't a lot of new information yet, but this format allows me to add more easily

slidecrafting-book.com
#quarto
September 24, 2025 at 4:12 PM
Reposted by Boris Hejblum
I am super hyped to finally share the first release of plumber2 with all of you. This has been the center of my attention for a big part of 2025 and I hope you'll find it a worthy update to the venerable plumber package.

The blog post will tell you more

#rstats
plumber2 0.1.0
plumber2, a complete rewrite of plumber, has landed on CRAN, providing a modern, future proof solution for creating web servers in R. Read all about the new features here.
www.tidyverse.org
September 24, 2025 at 6:52 AM
Reposted by Boris Hejblum
Really insightful post from Julie Tibshirani (spotted on LinkedIn, can't find on Bsky) reflecting on #rstats 's unique governance structure and what can be learned for other languages

jtibs.substack.com/p/if-all-the...
If all the world were a monorepo
The R ecosystem and the case for extreme empathy in software maintenance
jtibs.substack.com
September 14, 2025 at 11:29 PM
Reposted by Boris Hejblum
I am beyond excited to announce that ggplot2 4.0.0 has just landed on CRAN.

It's not every day we have a new major #ggplot2 release but it is a fitting 18 year birthday present for the package.

Get an overview of the release in this blog post and be on the lookout for more in-depth posts #rstats
ggplot2 4.0.0
A new major version of ggplot2 has been released on CRAN. Find out what is new here.
www.tidyverse.org
September 11, 2025 at 11:20 AM
Reposted by Boris Hejblum
People keep telling me they can't afford to attend #PositConf2025 next week, so I need to make a public service announcement: attending virtually is very affordable (or free!) & is amazing! Get registered & hang out with me on the Discord server!! #databs #rstats #python
September 9, 2025 at 7:32 PM
Reposted by Boris Hejblum
Glad to share our new open-access publication in Statistics in Medicine of a non-parametric method for the identification of high-dimensional surrogate markers written with Layla Parast, Rodolphe Thiébaut, and @borishejblum.science (doi.org/10.1002/sim....).
RISE: Two‐Stage Rank‐Based Identification of High‐Dimensional Surrogate Markers Applied to Vaccinology
In vaccine trials with long-term participant follow-up, it is of great importance to identify surrogate markers that accurately infer long-term immune responses. These markers offer practical advanta....
doi.org
September 8, 2025 at 9:34 AM
Reposted by Boris Hejblum
Introducing Databot: an AI assistant for exploratory data analysis in #Python and #RStats!

A research preview in Positron, Databot is a tireless pair programmer to help you explore data.

Learn more about this tool and our philosophy behind it:

🤖 posit.co/blog/introdu...
⚠️ posit.co/blog/databot...
August 29, 2025 at 2:03 PM
Reposted by Boris Hejblum
New #QuartoPub extension! Quarto output styling adds CSS rules for computational output, warnings, and messages for #rstats, #python, #julialang, and #observablejs - use the default styles, minimal styles, or add your own custom CSS andrewheiss.github.io/quarto-outpu...
August 25, 2025 at 6:02 PM