Ben Orr
@benorr.bsky.social
PhD Candidate, UCSF Biophysics, Kortemme Lab
Computational Biology & AI Lead, Animate Bio
ML for Protein Design
Reposted by Ben Orr
kevinkaichuang.bsky.social
Deep learning methods for protein structure prediction and design produce idealized structures. Finetuning on a set of physics-based de novo proteins improves their geometric diversity and generalization capabilities.

@benorr.bsky.social @kortemmelab.bsky.social

www.biorxiv.org/content/10.1...
benorr.bsky.social
This work highlights how augmenting existing models with informative experimental data, as presented here, could expand our exploration of designable protein space and ultimately enable more challenging design problems to be addressed than currently possible. (8/9)
benorr.bsky.social
Fine-tuning AF2 on the stable sequences’ Rosetta models improves predictions for geometrically diverse proteins across 5 protein folds. Fine-tuning on ~6k stable designs leads to better performance than fine-tuning on all 10k stable+unstable designs. (7/9)
benorr.bsky.social
Frame2seq [@dakpinaroglu.bsky.social 2023] scores higher sequence-structure compatibility for the Rosetta models than the AF2 predictions for these stable designs, suggesting that the Rosetta models are more accurate structures than the AF2 predictions for these sequences. (6/9)
benorr.bsky.social
We extended this analysis to 10k diverse Rossmann fold proteins generated by LUCS and tested for stability using yeast display [@grocklin.bsky.social 2017]. For ~6k stable designs, AF2, AF3, and ESMFold all demonstrate a strong bias toward predicting more “idealized” helix geometries. (5/9)
benorr.bsky.social
We asked whether protein structure prediction models are biased toward idealized structures for de novo proteins. Indeed, for de novo proteins with diverse geometries, AlphaFold2 predicts structures closer to an idealized de novo protein than the solved NMR structures. (4/9)
benorr.bsky.social
We find that, in a model protein fold, a physics-based method (LUCS) samples greater structural diversity than RFdiffusion, approaching the diversity observed in natural proteins. RFdiffusion is a generative model that uses the deep learning-based structure prediction network RoseTTAFold. (3/9)
benorr.bsky.social
In this work we explored how deep learning methods for structure prediction and design may limit our exploration of designable protein space, by favoring “idealized” structures for de novo proteins, and how to overcome these limitations with new data and improved models. (2/9)