Brian Naughton
@btnaughton.bsky.social
genetics/data/programming. ex-Hexagon, ex-Stanford, ex-23andMe, ex-TCD. http://blog.booleanbiotech.com 🇮🇪
Reposted by Brian Naughton
nboyd.bsky.social
Pretty interesting that AFAICT the filtering was done after the fact (so, library 1 had no filtering). This could make it an excellent dataset for training/testing filters/rankers. Too bad it looks like the dataset is not public
btnaughton.bsky.social
Another promising VHH model just dropped today from Manifold Bio. This one builds on BindCraft and ColabDesign. MIT license!

www.biorxiv.org/content/10.1...

The success rates appear to be lower than other tools, but this is highly target-dependent.
btnaughton.bsky.social
New blogpost on the latest in AI antibody design.

Including some code to easily run Germinal and IgGM on Modal!

blog.booleanbiotech.com/ai-antibody-...
btnaughton.bsky.social
It has been interesting to use gitingest to paste entire codebases for new tools into Gemini and ask it to look for severe bugs (logic errors, incorrect variables...).

I think it found at least one this morning. I have to double-check before filing but it's a real bug from what I can tell...
Reposted by Brian Naughton
arcinstitute.org
In another preprint from the @brianhie.bsky.social Lab and @synbiogaolab.bsky.social, they introduce Germinal, a generative AI system for de novo antibody design.

Germinal produces functional nanobodies in just dozens of tests, making custom antibody design more accessible than ever before.
btnaughton.bsky.social
A REALLY nice use of nanobanana (Gemini) is cleaning up blurry old images. Of course it depends what you ask for, but it is amazing at keeping the text and content the same.
btnaughton.bsky.social
We decided to try to 3D print custom chess pieces.

The workflow of asking Gemini to iterate on a design image, then uploading that image to adam.new, worked amazingly well. 5 minutes' work!

Left is the Gemini image and right is the mesh.
btnaughton.bsky.social
I was reminded about Blackett's War by a mention on the @patio11 podcast. Terrific book about operations research / probability theory applied to real world problems.

Lots of anecdotes like this...
Reposted by Brian Naughton
nboyd.bsky.social
Recently tested some de-novo minibinders against two targets (thanks Adaptyv!) designed using our open-source design library, `mosaic`; our best method got hit rates of 7/10 and 8/10 and affinities as low as single-digit nanomolar. Wrote up some thoughts here: blog.escalante.bio/minibinder-d...
btnaughton.bsky.social
Ibex from Prescient Design arxiv.org/abs/2507.09054 github.com/prescient-de...

- antibody folding with performance like Boltz/Chai but up to 100X faster
- does not fold the complex, just the antibody
- predicts holo and apo forms
- sadly, model weights are not freely available
Reposted by Brian Naughton
maxfus.bsky.social
Now that OpenCRISPR is in Nature and has rekindled the 'what's-a-novel-sequence' debate, I'm happy to share an app to check this, which I built for fun some time ago.

fuerstlab.shinyapps.io/SeqNovelty/

quick 🧵
btnaughton.bsky.social
After many months heads down, we at Decade are growing! We are hiring a protein biochemist to help us radically improve cancer treatment.

If you like being early and making an impact, we are interested to hear from you! Details here: decade.bio/careers
btnaughton.bsky.social
I asked Claude to research binder design metrics and it gave me my own blogpost back!

Time to start writing for the AIs? marginalrevolution.com/marginalrevo...
btnaughton.bsky.social
Vibe coded a run progress simulator with Gemini Canvas.

It worked great, until the context rot doom loop set in, and now it's difficult to fix the remaining minor bugs.

Still very impressive!

hgbrian.github.io/run_progress...
btnaughton.bsky.social
It's not too hard, though benchmark.py has too much (AI) code

- add an entry to the yaml file
- add image1.txt through image6.txt to that output dir
- run benchmark.py

I would still check manually. There are cases where more than one answer is acceptable; the header especially can be ambiguous.
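
To illustrate the "more than one acceptable answer" point, here is a minimal sketch of a scorer that accepts any of several expected transcriptions. This is not the code in benchmark.py; the function names and the whitespace-insensitive comparison are assumptions made for illustration.

```python
def normalize(text: str) -> str:
    """Drop all whitespace so line wrapping and spacing do not count as errors."""
    return "".join(text.split())


def matches_any(ocr_output: str, acceptable: list[str]) -> bool:
    """Return True if the OCR output equals any acceptable answer after normalizing."""
    return normalize(ocr_output) in {normalize(answer) for answer in acceptable}


if __name__ == "__main__":
    # Hypothetical example: the header could reasonably be read two ways.
    acceptable = [">seq1 heavy chain\nMKTAYIAKQR", ">seq1 heavychain\nMKTAYIAKQR"]
    print(matches_any(">seq1 heavy chain\nMKTAY IAKQR", acceptable))
```

A pass here only means the output agrees with one of the listed answers, so a manual check is still worthwhile when the list itself might be incomplete.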
btnaughton.bsky.social
I spent way too long on this but I made a small benchmark for OCR of biological sequences.

It's pretty incredible how poorly everything I tried works. Maybe posting a benchmark will lead to finding something out there that works!

github.com/hgbrian/bio_...
btnaughton.bsky.social
BindEnergyCraft claims to improve BindCraft performance using an energy-based objective.

"Code will be released soon."
btnaughton.bsky.social
Clever mosquito detector, draws a circle around the mosquito so you can squash it. $200 bzigo.com
btnaughton.bsky.social
In case anyone wants to know, the winner was Google Translate. For the single error I found (~1/1000 aas), crucially, the font got messed up (serif -> sans serif), so there was a clue that the OCR should not be trusted for that sequence.
btnaughton.bsky.social
Mistral OCR and Google Translate both work much better than Gemini/Claude/GPT (recommended by @josiezayner @draparente on X)

However, in both cases I stopped checking at the first error, after ~500 aas. Mistral was G->C ("transversion"), GTranslate was SSGGG -> SSSGG ("transcription slippage"!)
btnaughton.bsky.social
It does help but it does not solve the problem, unfortunately. The error rate is a bit too high and the errors are correlated across models.
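
A rough sketch of why correlated errors matter when combining models: a per-position majority vote over several OCR outputs only fixes an error if most models get that position right. The function name is hypothetical, and the sketch assumes the outputs are already the same length (no insertions or deletions).

```python
from collections import Counter


def consensus(reads: list[str]) -> tuple[str, list[int]]:
    """Majority-vote each position; also return the positions where models disagree."""
    if not reads or len({len(read) for read in reads}) != 1:
        raise ValueError("need at least one read, and all reads must be the same length")
    voted = []
    disputed = []
    for position, column in enumerate(zip(*reads)):
        residue, count = Counter(column).most_common(1)[0]
        voted.append(residue)
        if count < len(reads):
            disputed.append(position)
    return "".join(voted), disputed


if __name__ == "__main__":
    # Two of three models make the same SSGGG -> SSSGG mistake,
    # so the vote confidently returns the wrong sequence.
    print(consensus(["SSGGG", "SSSGG", "SSSGG"]))
```

The disputed positions are still useful as a list of places to inspect by hand, even when the vote itself cannot be trusted.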
btnaughton.bsky.social
I have been copying some amino acid sequences out of PDFs and so far nothing (Gemini, Claude, ChatGPT) works at >99% accuracy. I have to manually inspect everything!

Even when you can highlight the text in the PDF, it is very often wrong! Anyone know anything that works?
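
For anyone who does have a trusted reference sequence to compare against, here is a minimal sketch of how the accuracy could be measured with Python's standard difflib. The function names are my own, and restricting to the 20 standard amino acid letters is an assumption (it would flag X, B, Z, U, or * even where those are legitimate).

```python
from difflib import SequenceMatcher

# Assumption: only the 20 standard one-letter amino acid codes are expected.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def residue_accuracy(reference: str, ocr_text: str) -> float:
    """Fraction of reference residues recovered in the OCR text, ignoring whitespace."""
    if not reference:
        raise ValueError("reference sequence must not be empty")
    cleaned = "".join(ocr_text.split())
    matcher = SequenceMatcher(None, reference, cleaned, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(reference)


def unexpected_characters(ocr_text: str) -> list[str]:
    """Characters in the OCR text that are not standard amino acid letters."""
    return sorted(set("".join(ocr_text.split())) - AMINO_ACIDS)


if __name__ == "__main__":
    print(residue_accuracy("MKTAYIAKQR", "MKTAYIAKQK"))
    print(unexpected_characters("MKTAY1AKQR"))
```

The alphabet check catches errors like a digit 1 in place of I without needing a reference, but it cannot catch a substitution of one valid residue for another.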