Claus Wilke
@clauswilke.com
Computational biologist, data scientist, digital artist | he, him | http://clauswilke.com/ | Opinions are my own and do not represent UT Austin.
The statistical test you would run in this situation to ask whether there was a difference in outcomes is called a chi-square test. And it will tell you in no uncertain terms that you can absolutely not distinguish between 17 and 21 deaths among 22,000 people. It's noise. P = 0.6267.
November 24, 2025 at 3:05 PM
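A minimal sketch of the test described in the post above, using only the Python standard library. The post does not spell out the group sizes, so the sketch assumes two groups of 22,000 people each with 17 and 21 deaths; the function name yates_chi_square_2x2 is hypothetical, and the continuity correction is an assumption about how the quoted P value was obtained.

```python
import math


def yates_chi_square_2x2(a, b, c, d):
    """Chi-square test with Yates' continuity correction for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    # Continuity-corrected numerator; clamp at zero so the correction cannot overshoot.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    stat = n * diff**2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# ASSUMPTION: two groups of 22,000 people each, with 17 and 21 deaths.
group_size = 22_000
deaths_a, deaths_b = 17, 21
stat, p = yates_chi_square_2x2(
    deaths_a, group_size - deaths_a,
    deaths_b, group_size - deaths_b,
)
print(f"chi-square = {stat:.3f}, p = {p:.3f}")
```

Under these assumed group sizes the P value comes out near 0.63, in line with the P = 0.6267 quoted in the post; the small difference would be expected if the real groups were not exactly 22,000 each.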
Preliminary NTSB report, UPS Flight 2976. Left engine separated due to fatigue cracks.
www.ntsb.gov/investigatio...
November 20, 2025 at 6:10 PM
This seems important. Current AI models can't read graphs. They "see" what they expect to see, even if the data shows something else.
Introducing bluffbench, a new tool to evaluate how well LLMs actually see data plots.

When we trick LLMs with secret #RStats transformations, they can miss the visual contradiction.

bluffbench helps us measure this "blind spot" in AI coding agents. Learn more: posit.co/blog/introdu...
When plotting, LLMs see what they expect to see - Posit
Data science agents need to accurately read plots even when the content contradicts their expectations. Our testing shows today's LLMs still struggle here.
posit.co
November 19, 2025 at 4:58 PM
Reposted by Claus Wilke
As much as I'm loath to enter into "language wars" style commentary, I think I agree with @clauswilke.com here. The lack of NSE and native missing values is really bad for Python as a data science language (as, in fairness, is the unhinged OOP situation in R)
Python is not a great language for data science. Part 2: Language features
It may be a good language for data science, but it’s not a great one.
blog.genesmindsmachines.com
November 18, 2025 at 8:25 PM
Part 2 of my deep dive into Python as a language for data science.

blog.genesmindsmachines.com/p/python-is-...
Python is not a great language for data science. Part 2: Language features
It may be a good language for data science, but it’s not a great one.
blog.genesmindsmachines.com
November 17, 2025 at 9:00 PM
Could somebody remind me what, if anything, of consequence Anthropic has done in the biology space?
Anthropic CEO Dario Amodei thinks AI could help find cures for most cancers, prevent Alzheimer’s, and even double the human lifespan. cbsn.ws/4oRZ8Nm
November 17, 2025 at 2:01 AM
Happy to announce that the sicegar package for sigmoidal curve fitting, which my lab developed many years ago, has a new maintainer and a new release with bug fixes and new features.
cran.r-project.org/web/packages...
sicegar: Analysis of Single-Cell Viral Growth Curves
Aims to quantify time intensity data by using sigmoidal and double sigmoidal curves. It fits straight lines, sigmoidal, and double sigmoidal curves on to time vs intensity data. Then all the fits are ...
cran.r-project.org
November 16, 2025 at 9:34 PM
Reposted by Claus Wilke
My department at UT Austin is looking to hire an Assistant Professor in Evolutionary Biology, broadly defined. Feel free to reach out to me with any questions you may have.
apply.interfolio.com/177547
Apply - Interfolio
apply.interfolio.com
November 13, 2025 at 3:23 PM
Reposted by Claus Wilke
I, for one, welcome our new trash panda overlords.

But for real, fascinating science on how we might be seeing the very early stages of domestication in action in wild animals. 🧪

By @marinacoladas.bsky.social for @sciam.bsky.social
City Raccoons Are Evolving to Look More Like Pets
City-dwelling raccoons seem to be evolving a shorter snout—a telltale feature of our pets and other domesticated animals
www.scientificamerican.com
November 14, 2025 at 2:27 PM
Reposted by Claus Wilke
We are hiring in Evolutionary Biology — apply to join our department!
My department at UT Austin is looking to hire an Assistant Professor in Evolutionary Biology, broadly defined. Feel free to reach out to me with any questions you may have.
apply.interfolio.com/177547
Apply - Interfolio
apply.interfolio.com
November 13, 2025 at 5:41 PM
I've done a lot of work in Python this fall, and it hasn't endeared me to the language at all. Why does stuff have to be so complicated when you're doing it in Python?
blog.genesmindsmachines.com/p/python-is-...
Python is not a great language for data science. Part 1: The experience
It may be a good language for data science, but it’s not a great one.
blog.genesmindsmachines.com
November 13, 2025 at 4:16 PM
I have realized over the last few days that I'm not executive-tier material. We all have to be honest about our limitations.
Every office has an executive tier whose emails are like:

saw on news , , can we replce complianc dept with web 3

and a worker tier whose emails are like:

Dear Jim,
First of all, I *love* this idea! Unfortunately, I spoke with Legal and identified a few issues with this approach. For starters…
Well, it's a little clearer now why billionaires are so invested in technology that produces better written emails.
November 13, 2025 at 3:26 PM
Also important to remember that while Nowak doesn't have a social media presence, he's one of the most famous scientists alive. He has more Science and Nature papers than most people have as their career total, across all publication venues. He has an h-index of 180 and ~170k citations.
wild to me that whenever Epstein is in the news, bsky science folk are up in arms about a couple of famous academics who barely knew the guy rather than Martin Nowak, who took millions of Epstein dollars to run his lab, hosted Epstein there often, and somehow is still teaching at Harvard
November 13, 2025 at 5:02 AM
Reposted by Claus Wilke
I see @hadley.nz write "I was lucky to have deep conversations about relational database design and Codd’s third normal form much earlier in life than usual" and wonder at what age most people have these conversations. I'm in my fifties and haven't had them yet.
hadley.github.io/25-tidyverse...
A personal history of the tidyverse
hadley.github.io
November 12, 2025 at 2:23 AM
Interesting thread. Read the whole thing, till the end. Yes, all 41 posts.
Bluetorial-Jim Watson

I met Jim Watson a few times but did not know him well. However, I was greatly influenced by his book “The Double Helix”. He was a complicated human being with some very, very bad features, but some good contributions.

What follows is my personal perspective.

1/41
ALT: a cartoon says hey everybody an old man 's talking while bart simpson looks on
media.tenor.com
November 8, 2025 at 4:06 PM
Reposted by Claus Wilke
I think I understand how it can be that LLMs are both exceptionally good and quite terrible at programming. It's because there are two entirely different skillsets that we both call "good at programming." LLMs have only one of them.
blog.genesmindsmachines.com/p/llms-excel...
LLMs excel at programming—how can they be so bad at it?
My explanation for the mystery of why LLMs can be both exceptionally good and quite terrible at programming.
blog.genesmindsmachines.com
November 6, 2025 at 3:43 PM
Reposted by Claus Wilke
This!

I am continuously surprised to hear fellow colleagues say that LLMs are trash/useless for programming, while they have radically changed the way I work over the last years. Tasks that used to take a day can now be done in minutes while I fetch a drink.
I think I understand how it can be that LLMs are both exceptionally good and quite terrible at programming. It's because there are two entirely different skillsets that we both call "good at programming." LLMs have only one of them.
blog.genesmindsmachines.com/p/llms-excel...
LLMs excel at programming—how can they be so bad at it?
My explanation for the mystery of why LLMs can be both exceptionally good and quite terrible at programming.
blog.genesmindsmachines.com
November 7, 2025 at 10:21 AM
If an engine fell off, it's most likely a maintenance problem. Look up American Airlines Flight 191, which was a DC-10 (an aircraft closely related to the MD-11).
November 5, 2025 at 9:14 PM
The 21st Century version of corvée work for academics is reviewing papers or grants, writing letters of recommendation, giving public outreach lectures.
Academics in Assyria in the 7th c BC complain that admin is preventing them from doing research and teaching
November 3, 2025 at 6:29 PM
If LLMs were actually intelligent—and not just stochastic parrots—they could read all the literature on random number generation, synthesize it, and conclude that always setting the seed to the same number is probably not a good idea. Alas, they don't have that kind of awareness.
Newest LLM tell/quirk in coding assignments this semester: instead of generating code based on the CSVs that I provide, LLMs have been inventing datasets with rnorm() and sample() (and an obligatory set.seed(42)) and then making plots with the fake data.

I'm so tired.
November 2, 2025 at 11:10 PM
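One possible alternative to a hard-coded seed, sketched with the Python standard library: draw a fresh seed from operating-system entropy, record it, and reuse the recorded value only when a run has to be reproduced. The helper name make_rng is hypothetical, and this is a sketch of one option, not a recommendation taken from the posts above.

```python
import random
import secrets


def make_rng(seed=None):
    """Return (rng, seed); draw a fresh 64-bit seed from OS entropy unless one is given."""
    if seed is None:
        seed = secrets.randbits(64)
    return random.Random(seed), seed


# Normal run: a new seed every time, logged so the run can be replayed later.
rng, seed = make_rng()
print(f"seed used for this run: {seed}")
sample = [rng.gauss(0.0, 1.0) for _ in range(5)]

# Replay: passing the recorded seed reproduces the same draws.
replay_rng, _ = make_rng(seed)
replay = [replay_rng.gauss(0.0, 1.0) for _ in range(5)]
print(sample == replay)
```

Logging the seed keeps each run reproducible without making every run start from the same random stream.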