Ragnar {Groot Koerkamp}
curiouscoding.nl
@curiouscoding.nl
PhD on high-throughput bioinformatics @ ETH Zurich;
IMO, ICPC, Xoogler, Rust, road-cycling, hiking, wild camping, photography
Reposted by Ragnar {Groot Koerkamp}
mim: A lightweight auxiliary index to enable fast, parallel, gzipped FASTQ parsing https://www.biorxiv.org/content/10.1101/2025.11.24.690271v1
November 27, 2025 at 5:46 PM
Reposted by Ragnar {Groot Koerkamp}
Okay, #SeqBim is over, let's get crackin' and speak about our recent preprint (joint work with @imartayan.bsky.social, Lucas Robidou, @camillemrcht.bsky.social and @npmalfoy.bsky.social)

1/
November 27, 2025 at 10:18 AM
Reposted by Ragnar {Groot Koerkamp}
We are excited that our paper "Cleanifier: Contamination removal from microbial sequences using spaced seeds of a human pangenome index" is now published at Bioinformatics (doi.org/10.1093/bioi...).

You can find it at gitlab (gitlab.com/rahmannlab/c...) or install it via PyPI or Bioconda.
Cleanifier: Contamination removal from microbial sequences using spaced seeds of a human pangenome index
Abstract. Motivation: The first step when working with DNA data of human-derived microbiomes is to remove human contamination for two reasons. First, many co
doi.org
November 27, 2025 at 11:27 AM
Reposted by Ragnar {Groot Koerkamp}
Optimized k-mer search across millions of bacterial genomes on laptops https://www.biorxiv.org/content/10.1101/2025.11.23.690050v1
November 26, 2025 at 4:47 PM
Reposted by Ragnar {Groot Koerkamp}
I for one welcome the new laptop-benchmarked search tools. Stuff like this enables previously difficult or impossible study under independent research contexts. For example, my warehouse lab's grid's so terrible running all threads on the main workstation trips the circuit breaker.

🧬💻
This is AMAZING! And it's so fantastic to see Fulgor helping out in these large-scale search tasks cc @jermp.bsky.social @ale-campa.bsky.social !
Optimized k-mer search across millions of bacterial genomes on laptops https://www.biorxiv.org/content/10.1101/2025.11.23.690050v1
November 26, 2025 at 6:07 PM
Reposted by Ragnar {Groot Koerkamp}
see this is what I meant re: facet's direction. if I'm not trying to make the tiniest lib ever I'm free to make it have the best DX/UX
November 26, 2025 at 11:27 PM
Reposted by Ragnar {Groot Koerkamp}
Ok — here's the current version of the mim preprint: bit.ly/4pygyyc! Hopefully it will make it to @biorxivpreprint.bsky.social soon, but who knows with the holiday week. Anyway, by the time it hits, we'll probably already have interesting updates to share. Happy to answer questions :).
November 26, 2025 at 9:34 PM
The Cloudflare front for biorxiv is just completely broken for me at this point. Can't open anything on mobile anymore.
November 27, 2025 at 7:53 AM
Reposted by Ragnar {Groot Koerkamp}
Minimizer Density revisited: Models and Multiminimizers https://www.biorxiv.org/content/10.1101/2025.11.21.689688v1
November 22, 2025 at 2:47 AM
Reposted by Ragnar {Groot Koerkamp}
You know, we’ve got a saying
November 26, 2025 at 8:25 AM
I'll be in Cambridge first half of next week and Oxford second half!
Hit me up if you wanna hang out :)
November 25, 2025 at 4:33 PM
Reposted by Ragnar {Groot Koerkamp}
And when you're trying to write an elite Rust implementation of something, who do you want on board? That's right! I was lucky enough to convince @curiouscoding.nl to join the mim team!
This can remove a key bottleneck in high throughput sequencing analysis tasks that are decompression & parsing limited. We implemented a mim index enabled parser in C++ & are working on one in Rust (and then Python bindings for that). We hope this makes using .fastq.gz a little less horrible! 2/2
November 25, 2025 at 2:41 PM
OH. MY. GOD. TYPST IS SO SMOOOOOTHHH.

100ms recompiles on save, by the time my eyes move from Emacs to pdf viewer it's already updated. I'm literally too slow to even see it update.

Also, this is why escape now maps to save. Saving allll the time. And Lshift-># (alongside much older Rshift->$).
Unrelated to mim itself, but this is also the first preprint I've prepared in @typst.app rather than LaTeX. It was soooo much nicer. Folks; what are we doing? Why don't our journals accept manuscript sources in Typst!
Ok; mim (github.com/COMBINE-lab/...) preprint submitted! Excited for folks to see it and share thoughts. The key takeaway; mim allows the quick, one-time, building of a small auxiliary index that then allows scaling gzipped FASTQ parsing linearly in # of threads. 1/2
November 25, 2025 at 2:43 PM
Reposted by Ragnar {Groot Koerkamp}
Ok; mim (github.com/COMBINE-lab/...) preprint submitted! Excited for folks to see it and share thoughts. The key takeaway; mim allows the quick, one-time, building of a small auxiliary index that then allows scaling gzipped FASTQ parsing linearly in # of threads. 1/2
GitHub - COMBINE-lab/mim: A small, auxiliary index to massively improve parallel fastq parsing
A small, auxiliary index to massively improve parallel fastq parsing - COMBINE-lab/mim
github.com
November 25, 2025 at 2:13 PM
Reposted by Ragnar {Groot Koerkamp}
"the camera is 400 megapixels" every photograph takes up 300 MB just to take 8 out of focus pics at a time of a dog being cute and the one posted will be seen 3 inches wide on phone screens or facebook feed by people who know the dog and say "aw look at sparky" you're trying to selling cloud storage
November 24, 2025 at 6:34 PM
Once upon a time there was a dude. He thought 3ns is slow, since time=$$$. After a year of hard work, he got it down to 2ns. Then he tried it on a .gz file, and realized it takes 25ns. All his effort was in vain.

Today, we remember the friends [assembly instructions] he made along the way.
Great keynote talk on the fundamentals of storytelling by @antonyjohnston.bsky.social at the ever-brilliant AdventureX
November 24, 2025 at 5:17 PM
- CEO [2 person company]
- manager [10 reports]
- prof [25 people lab]
November 24, 2025 at 5:09 PM
Reposted by Ragnar {Groot Koerkamp}
using AI for code is great to get you 50% of the way there, especially if you were 80% there already
November 24, 2025 at 3:50 PM
Reposted by Ragnar {Groot Koerkamp}
Spicy take of the day: YAML is not *that* bad.

(If you're the one writing it)
(And the one deserializing it)
(And you can use json-schema for validation in your editor)
November 24, 2025 at 9:16 AM
Reposted by Ragnar {Groot Koerkamp}
Sure, yeah, of course. I still don't understand why clicking, like switching from issues to merge requests on a completely empty instance should take 1.5 seconds and that much RAM. I, you know, I feel like the tech stack being used has more to do with that than the feature set!
November 24, 2025 at 10:36 AM
Reposted by Ragnar {Groot Koerkamp}
Oh my God, GitLab is running on a beefy server literally next door and it feels so sluggish. Oh my God, I don't think I can use this.
November 24, 2025 at 10:19 AM
Reposted by Ragnar {Groot Koerkamp}
In more than a decade of writing Rust, I have never once in my life needed back pointers for anything.
November 24, 2025 at 12:45 AM
even the default slack notification sound just screams sluggishness

why give 3 beeps when one does? you had my attention at the first and are now stealing it for another half second
November 23, 2025 at 8:46 PM
Binding escape (=caps key) in normal-mode to save 🤯
November 22, 2025 at 7:56 PM