Claus Wilke
@clauswilke.com
Computational biologist, data scientist, digital artist | he, him | http://clauswilke.com/ | Opinions are my own and do not represent UT Austin.
The statistical test you would run in this situation to ask whether there was a difference in outcomes is called a chi-square test. And it will tell you in no uncertain terms that you absolutely cannot distinguish between 17 and 21 deaths among 22,000 people. It's noise. P = 0.6267.
November 24, 2025 at 3:05 PM
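A minimal sketch of the chi-square test mentioned in the post above, using only the Python standard library. The post does not give the group sizes, so the code assumes two groups of 22,000 people each; that split, the function name `chi_square_2x2`, and the use of Yates' continuity correction are assumptions for illustration, and the exact p-value depends on the real group sizes.

```python
import math

def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square test of independence for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value). With one degree of freedom, the upper
    tail of the chi-square distribution equals erfc(sqrt(x / 2)).
    """
    observed = ((a, b), (c, d))
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    n = a + b + c + d

    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed[i][j] - expected)
            if yates:
                # Yates' continuity correction, never below zero
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / expected

    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# ASSUMPTION: 22,000 people per group (the post does not state the split).
group_size = 22_000
stat, p = chi_square_2x2(17, group_size - 17, 21, group_size - 21)
print(f"chi-square = {stat:.3f}, p = {p:.4f}")
```

Under these assumptions the p-value comes out a little above 0.6, far from any conventional significance threshold, which matches the post's point that the difference is indistinguishable from noise.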
Preliminary NTSB report, UPS Flight 2976. Left engine separated due to fatigue cracks.
www.ntsb.gov/investigatio...
November 20, 2025 at 6:10 PM
My goal is not to convince you to use R. My goal is to highlight the issues I have with Python, to express a vision for something better in the future.
November 14, 2025 at 4:08 AM
Nobody knows what data science is. ;-)

This is my definition:
November 13, 2025 at 5:34 PM
Everybody who was familiar with Nowak's research knew. Epstein shows up in the acknowledgments of every Nowak paper from the right time frame. For example here:
www.science.org/doi/full/10....
I remember reading this when papers such as this one were published and wondering who J. Epstein was.
November 13, 2025 at 4:57 AM
I just read the definition of Codd's third normal form and I don't understand a word. 🤣

en.wikipedia.org/wiki/Third_n...
November 12, 2025 at 3:51 AM
I mean I had this exact example in my original piece. 🤷‍♂️
October 23, 2025 at 8:21 PM
So just to be clear: You're in favor of people routinely running this code from the scikit-learn documentation, exactly as written? Would you also support changing the default random state to 42 so running this code gets a little simpler? If not, why not?
October 23, 2025 at 1:38 AM
Just so we're clear about the scale of the problem:
Half a million occurrences of `random_state=42` on GitHub, and almost another 200k occurrences of `random.seed(42)`. (And that's just Python. There are also ~20k cases in R.) 1/2
October 22, 2025 at 9:24 PM
In my experience with blogging, there are often long periods where nobody cares about my posts, and then suddenly a post goes viral and I get a massive inflow of new subscribers.

To everybody who just subscribed or started following: Most of my posts are boring. Sorry. I’m doing the best I can. 😉
October 10, 2025 at 6:36 PM
This is not correct. If you have ever had an R01 in the past you can no longer apply as New Investigator.
grants.nih.gov/policy-and-c...
September 23, 2025 at 6:13 PM
Not quite what you’re expecting to see in your front yard when you’re living in the middle of a fairly large city.
September 4, 2025 at 5:13 PM
This image in Paul Krugman's post this morning caught my eye. I've always thought of Los Angeles as the epitome of sprawl and Chicago as one of the more urban US cities, but apparently Chicago has a lower population density than Los Angeles.

Source:
paulkrugman.substack.com/p/yes-americ...
September 3, 2025 at 5:57 PM
I'm always amused by how committed Bluesky is to not rounding.
August 18, 2025 at 10:31 AM
3D printing of houses in Austin, TX.
August 4, 2025 at 7:23 PM
I'm glad the AI thinks I'm a "prominent figure" but obviously it hasn't read my blog. I have written about writer's block several times. (And did this search to try to unearth one of these old posts. 🤷‍♂️)
July 30, 2025 at 4:16 PM
Exactly. I'm sure you have aged a lot since this was taken and no longer look like this. You probably have a beard now.
July 29, 2025 at 11:57 PM
AI slows down experienced developers (against their expectations) in a randomized controlled trial of AI coding effectiveness.

metr.org/Early_2025_A...
July 10, 2025 at 7:10 PM
The palette is too bright in the middle. For sequential palettes, you want luminance to be monotonic.
July 10, 2025 at 2:39 AM
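A rough way to check the monotonic-luminance rule from the post above: compute the relative luminance of each color in a palette and test whether the values only increase or only decrease. The sketch uses the standard sRGB relative luminance formula as a stand-in for perceived lightness. The two palettes and the helper names are hypothetical examples, not the palette discussed in the post.

```python
def relative_luminance(hex_color):
    """Relative luminance (0 = black, 1 = white) of an sRGB hex color."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma encoding to get linear-light values
    linear = [
        c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def is_monotonic(values):
    """True if values are strictly increasing or strictly decreasing."""
    pairs = list(zip(values, values[1:]))
    return all(x < y for x, y in pairs) or all(x > y for x, y in pairs)

# HYPOTHETICAL palettes, chosen only to illustrate the check.
dark_to_light = ["#08306b", "#2171b5", "#6baed6", "#c6dbef", "#f7fbff"]
bright_middle = ["#000080", "#ffff00", "#800000"]

for name, palette in [("dark_to_light", dark_to_light),
                      ("bright_middle", bright_middle)]:
    lum = [relative_luminance(c) for c in palette]
    print(name, [round(v, 3) for v in lum], "monotonic:", is_monotonic(lum))
```

The second palette fails the check because its middle color is far brighter than either end, which is the kind of problem the post describes.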
I also asked ChatGPT to draw Alaska in the correct size, and it gave me this. This is actually pretty good (except for the broken labels, of course).
July 4, 2025 at 5:30 PM
Map of the US, as envisioned by ChatGPT. Happy 4th of July to all of you living in Torid, Ndiaha, or Persnit Walal!
July 4, 2025 at 5:20 PM
And the resulting visualizations. I wanted an example where ordering by similarity generates clearly visible groups. This dataset works great for that. I'm now doing a selection of genes that have the strongest correlations, to strengthen the effect.
wilkelab.org/SDS366/slide...
July 3, 2025 at 9:30 PM
I'm talking about this example.
July 2, 2025 at 3:29 PM
Now published: Systematic comparison of protein language models for transfer learning.

Key points:
- You don't need gigantic models. The two smaller ESM C variants work great.
- There is huge variability in performance across datasets. We have no idea why.

www.nature.com/articles/s41...
July 1, 2025 at 11:34 PM
How to schedule a committee meeting. What the AI thinks I wrote, and what I actually wrote.
June 26, 2025 at 5:44 AM