BenjMurrell
@benjmurrell.bsky.social
🇸🇪🇿🇦 Researcher at Karolinska. Comp bio. Phylogenetics. Deep learning. All in Julia. Virology. Immunology. https://scholar.google.com/citations?user=I80vy5cAAAAJ
benjmurrell.bsky.social
One day remaining before this closes!
bsky.app/profile/benj...
benjmurrell.bsky.social
My lab, at Karolinska, in Stockholm, is looking for a PhD student with a computational/quantitative background to work on probabilistic/generative models of proteins (structure and sequence). The research will involve methods development, and applications in vaccine design.
benjmurrell.bsky.social
What does it look like in the log domain?
benjmurrell.bsky.social
Shoutout to the @julialang.org community, where we rely on Flux.jl, CUDA.jl, Zygote.jl, Manifolds.jl, and others as well as @makie.org for the visualizations!
benjmurrell.bsky.social
We will have a preprint up about this soon. This is just a "base model" so you can't really ask it to do anything specific (but stay tuned). We've put the code and weights up, so give it a spin:
github.com/MurrellGroup...

If you have a GPU, it is pretty fast.
benjmurrell.bsky.social
We can mask that head's ability to attend to non-self AAs, and this prevents the model from generating symmetric backbones, without damaging the in silico refoldability of the structures themselves.
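A minimal sketch of what "masking a head's ability to attend to non-self AAs" could look like, in plain Python rather than the Julia used in the actual model. The function names and the toy score matrix are hypothetical; the idea is only that every off-diagonal attention score for one head is set to minus infinity before the softmax, so each residue can attend to nothing but itself.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(scores, self_only=False):
    """Row-wise softmax over raw attention scores for a single head.

    With self_only=True, all off-diagonal scores are replaced by -inf
    before the softmax, so residue i can only attend to residue i.
    """
    n = len(scores)
    weights = []
    for i in range(n):
        row = [
            scores[i][j] if (not self_only or i == j) else -math.inf
            for j in range(n)
        ]
        weights.append(softmax(row))
    return weights

# Toy example (made-up numbers): two chains of two residues each,
# where residue 0 "matches" residue 2 and residue 1 "matches" residue 3.
toy_scores = [
    [1.0, 0.0, 5.0, 0.0],
    [0.0, 1.0, 0.0, 5.0],
    [5.0, 0.0, 1.0, 0.0],
    [0.0, 5.0, 0.0, 1.0],
]

unmasked = attention_weights(toy_scores)
masked = attention_weights(toy_scores, self_only=True)
```

In the unmasked case, residue 0 puts most of its weight on residue 2 (its copy in the other chain); with the mask, all of its weight falls on itself, so the head can no longer pass information between matching residues.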
benjmurrell.bsky.social
In the LLM context, "induction heads" allow copying via recognition of repeating tokens. This is a bit like that, but SE(3) and non-AR? Tangentially, is anyone aware of what induction heads look like in text diffusion models?
benjmurrell.bsky.social
We dug in a bit, and it turns out that, among all of the layers in this network, there is a single (!!) attention head that attends to the matching residue of any symmetric copies of a chain (and also to within-chain structural repeats).
benjmurrell.bsky.social
Why? Most other methods apparently train on single chains (hat tip: @sokrypton.org), and then use e.g. AA position offsets (sometimes with multi-chain finetuning?) to handle multiple chains. Chroma trains on multi-chain structures, but maybe symmetry is less visible to random-edge GNNs?
benjmurrell.bsky.social
RFdiffusion, Chroma, and others also generate symmetric structures. But, as far as we know, they only do so if some sort of symmetry constraint is imposed during generation. We aren't doing anything like that. This model just does it spontaneously!
benjmurrell.bsky.social
After checking that we weren't just mistakenly looking at the training PDBs, we realized: these are, at least approximately, exhibiting symmetry. We saw dimers, dimers of heterodimers, trimers (though the model isn't as good at these) etc.
benjmurrell.bsky.social
Like MultiFlow this does struct+seq, with struct similar to FoldFlow-SFM, and with seq per Meta's Discrete Flow Matching. We trained this for a week on one RTX6000 Ada, expecting sketchy helices, and maybe the odd beta sheet pairing if we were lucky. Instead, we got these:
benjmurrell.bsky.social
With the ecosystem all in place, this is what it takes to specify 1) the entire model, and 2) the flow matching training objective. Just mixing and matching components:
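For readers unfamiliar with the objective mentioned above, here is a generic flow matching loss sketched in plain Python. It is not the Julia code from this post: it is a toy Euclidean version with a straight-line interpolant, the function names are invented for illustration, and it does not attempt the structure and sequence handling the thread describes.

```python
def interpolate(x0, x1, t):
    """Point at time t on the straight line from noise x0 to data x1."""
    return [(1 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Velocity of the straight-line path, constant in t."""
    return [b - a for a, b in zip(x0, x1)]

def flow_matching_loss(predict_velocity, x0, x1, t):
    """Mean squared error between predicted and target velocity at x_t.

    predict_velocity is any callable (x_t, t) -> velocity; in practice
    it would be a neural network, here it is a stand-in function.
    """
    xt = interpolate(x0, x1, t)
    v_pred = predict_velocity(xt, t)
    v_true = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(v_true)

# Toy usage with made-up values.
x0 = [0.0, 0.0]   # "noise" sample
x1 = [2.0, 4.0]   # "data" sample
t = 0.5

perfect = lambda xt, t: [2.0, 4.0]   # predicts the exact velocity
zero = lambda xt, t: [0.0, 0.0]      # predicts no motion at all

loss_perfect = flow_matching_loss(perfect, x0, x1, t)
loss_zero = flow_matching_loss(zero, x0, x1, t)
```

Training amounts to sampling `x0`, `x1`, and `t` repeatedly and minimizing this loss over the parameters of the velocity predictor.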
benjmurrell.bsky.social
We've been tinkering in the protein design space for a bit now and, since we stubbornly refuse to work in any language other than Julia, we've set up a @julialang.org ecosystem for flow matching, a collection of useful layers, a protein data processing pipeline, etc etc.
benjmurrell.bsky.social
Side note: If you're looking to do a PhD in this space, there is just over a week left to apply for this position with us: bsky.app/profile/benj...
benjmurrell.bsky.social
We tried to set up a simple demo/tutorial model for the protein design ecosystem we've been developing, and it turned out a bit more interesting than we expected. 🧵

This was a team effort from a few people in my lab, including @antonoresten.bsky.social and others (not sure who is on this app)
benjmurrell.bsky.social
My lab, at Karolinska, in Stockholm, is looking for a PhD student with a computational/quantitative background to work on probabilistic/generative models of proteins (structure and sequence). The research will involve methods development, and applications in vaccine design.
benjmurrell.bsky.social
Karolinska has a rich work environment, and you'll get to collaborate closely with experimentalists. If you already have a strong biology background then you'll fit right in, and if you don't then this is an excellent way to broaden your education.
benjmurrell.bsky.social
Coming from South Africa (via five years in the US), I can say that Stockholm is a wonderful place to live. The city itself is amazing, and, if you need some nature, there are forests and water everywhere. And you never need to speak any Swedish!
benjmurrell.bsky.social
Hah! I should try to add some. I'm mostly just trying to figure out how video resolution works on this platform...