Elana Simon
@elanasimon.bsky.social
elanasimon.bsky.social
✨ Explore the features yourself! (7/9)
- Interactive visualization: interplm.ai
- Explore features from every layer of ESM-2-8M
- See how proteins activate different features
- Examine structural patterns
elanasimon.bsky.social
🧪 We can also steer model predictions by adjusting feature activations, demonstrating how understanding these representations could help guide protein design (6/9)
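A rough sketch of what "adjusting feature activations" can look like, assuming the common SAE steering recipe: set one feature to a chosen value, decode, and add back whatever the SAE failed to reconstruct. The function names, feature directions, and numbers below are hypothetical toy values, not taken from InterPLM.

```python
# Toy feature steering: clamp one SAE feature, decode, keep the SAE's residual error.
# Feature directions and values are hypothetical illustrations.

def decode(features, directions):
    """Weighted sum of decoder directions, one direction per feature."""
    dim = len(directions[0])
    out = [0.0] * dim
    for act, direction in zip(features, directions):
        for i in range(dim):
            out[i] += act * direction[i]
    return out

def steer(embedding, features, directions, feature_idx, new_value):
    """Return the embedding after setting one feature's activation to new_value."""
    original = decode(features, directions)
    # The part of the embedding the SAE did not reconstruct; carried over unchanged.
    error = [e - o for e, o in zip(embedding, original)]
    steered = list(features)
    steered[feature_idx] = new_value
    edited = decode(steered, directions)
    return [x + err for x, err in zip(edited, error)]

DIRECTIONS = [[1.0, 0.0], [0.0, 1.0]]  # one decoder direction per feature
embedding = [2.0, 0.5]                 # model activation at one residue
features = [2.0, 0.0]                  # its (imperfect) SAE encoding

steered_embedding = steer(embedding, features, DIRECTIONS, feature_idx=1, new_value=3.0)
```

In a real setup the steered embedding would be passed through the remaining model layers to see how the predictions change; this toy stops at the edited embedding.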
elanasimon.bsky.social
🎯 Beyond understanding PLMs, these features have practical applications (5/9):
- Finding missing annotations in protein databases
- Identifying potentially new biological motifs
- Suggesting locations of binding sites and functional regions
elanasimon.bsky.social
🤖 We showed LLMs can generate meaningful descriptions of many features - and these descriptions can be validated by successfully predicting which proteins would activate each feature! (4/9)
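One plausible way to score such a validation, as an assumption rather than a detail stated in the thread: have the LLM predict an activation level per protein from the description alone, then correlate those predictions with the measured activations. The values below are invented for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical numbers: per-protein activation guessed from the description
# versus the feature's measured maximum activation on the same proteins.
predicted = [0.9, 0.1, 0.8, 0.0]
measured = [0.7, 0.0, 0.9, 0.1]

score = pearson(predicted, measured)
```

A score near 1 would mean the description is enough to tell which proteins light up the feature; a score near 0 would mean it is not.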
elanasimon.bsky.social
📊 We identified up to 2,548 interpretable features per layer that match known biological concept annotations - compared to just 46 from individual neurons.

This suggests PLMs store biological information in superposition - multiple concepts sharing the same neurons! (3/9)
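The thread does not say how a feature is judged to "match" an annotation. A minimal sketch under one assumption: binarize the feature's per-residue activation at a threshold and compute F1 against the annotated residues. The threshold, activations, and labels here are hypothetical.

```python
def f1_against_annotation(activations, annotated, threshold):
    """F1 between thresholded feature activations and boolean residue labels."""
    active = [a > threshold for a in activations]
    tp = sum(1 for a, y in zip(active, annotated) if a and y)
    fp = sum(1 for a, y in zip(active, annotated) if a and not y)
    fn = sum(1 for a, y in zip(active, annotated) if not a and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical per-residue values for one feature and one annotation.
activations = [0.0, 0.9, 0.8, 0.1, 0.7]
annotated = [False, True, True, False, False]

f1 = f1_against_annotation(activations, annotated, threshold=0.5)
```

Running the same scoring over single neurons instead of SAE features is the kind of comparison that would give the two counts in the post.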
elanasimon.bsky.social
🔍 Using InterPLM, we identified features in ESM-2 that detect various biological properties, from local motifs to complex structural patterns (2/9)
- Catalytic sites
- Zinc fingers
- Targeting sequences
- Post-translational modifications
- Structural elements and many more!
elanasimon.bsky.social
🧬 What are protein language models (PLMs) actually learning about biology? Our paper introduces InterPLM - a framework that reveals interpretable features in PLMs using sparse autoencoders, giving us a window into how these models represent protein structure and function.
🧵 (1/9)
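For readers new to sparse autoencoders, here is a toy forward pass, assuming the standard form: a ReLU encoder into a wider feature space and a linear decoder back. Training (typically reconstruction loss plus a sparsity penalty) is omitted, and all weights and sizes are made up rather than taken from InterPLM.

```python
# Toy sparse autoencoder (SAE) forward pass in pure Python.
# Weights and sizes are invented for illustration only.

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def sae_encode(x, w_enc, b_enc):
    """Map a d-dim embedding to a wider, non-negative feature vector."""
    pre = matvec(w_enc, x)
    return relu([p + b for p, b in zip(pre, b_enc)])

def sae_decode(f, w_dec, b_dec):
    """Reconstruct the embedding as a weighted sum of feature directions."""
    out = list(b_dec)
    for act, direction in zip(f, w_dec):
        for i, d in enumerate(direction):
            out[i] += act * d
    return out

# 2-dim "embedding", 3 hypothetical features (more features than dimensions).
W_ENC = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
B_ENC = [0.0, 0.0, 0.0]
W_DEC = [[1.0, 0.0], [0.0, 1.0], [-0.5, -0.5]]
B_DEC = [0.0, 0.0]

embedding = [2.0, -1.0]
features = sae_encode(embedding, W_ENC, B_ENC)
reconstruction = sae_decode(features, W_DEC, B_DEC)
```

Only one of the three features fires on this input, and the reconstruction is imperfect; both are typical of how a sparse dictionary trades exactness for features that are easier to read.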
Reposted by Elana Simon
kevinkaichuang.bsky.social
Mechanistic interpretability on a protein language model

www.biorxiv.org/content/10.1...
[Image: Overview of SAE methodology and representative SAE features revealed through automated activation pattern analysis]
[Image: Using mechanistic interpretability to steer generations]
[Image: SAE feature analysis and visualizations reveal features with diverse and consistent activation patterns]