Lightnews — Scholar-powered news

Possu Huang Lab @possuhuanglab.bsky.social · 21h

Read our preprint here: www.biorxiv.org/content/10.1... (8/8)

SLAE: Strictly Local All-atom Environment for Protein Representation

Building physically grounded protein representations is central to computational biology, yet most existing approaches rely on sequence-pretrained language models or backbone-only graphs that overlook...

www.biorxiv.org

Possu Huang Lab @possuhuanglab.bsky.social · 21h

Work done by Yilin Chen, @tianyu.bsky.social, Cizhang Zhao and @hkws.bsky.social. Thank you all! (7/8)

Possu Huang Lab @possuhuanglab.bsky.social · 21h

SLAE projects all-atom structures onto a smooth manifold! Unguided linear interpolation between conformations in SLAE latent space decodes to coherent intermediate structures. (6/8)
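The linear interpolation mentioned in this post can be illustrated with a minimal sketch. It assumes each conformation has already been encoded as a flat latent vector. The function name `lerp_latents` and the toy vectors are hypothetical, and the SLAE encoder and decoder are not shown here.

```python
from typing import List


def lerp_latents(z_start: List[float], z_end: List[float], n_steps: int) -> List[List[float]]:
    """Return n_steps latent vectors evenly spaced on the straight line from z_start to z_end."""
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same dimension")
    if n_steps < 2:
        raise ValueError("need at least two steps (the two endpoints)")
    path = []
    for i in range(n_steps):
        t = i / (n_steps - 1)  # 0.0 at the start point, 1.0 at the end point
        path.append([(1.0 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path


# Toy 2-D latents standing in for two encoded conformations (hypothetical values).
frames = lerp_latents([0.0, 0.0], [1.0, 2.0], n_steps=3)
```

In the setting the post describes, each intermediate vector would presumably be passed to the decoder to obtain a structure. Per-residue tokens could be interpolated the same way, one residue at a time.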

Possu Huang Lab @possuhuanglab.bsky.social · 21h

SLAE extends our generative coverage assessment SHAPES to all-atom, per-residue-type granularity. Now we can compare de novo all-atom protein design models and spot residue-level environment biases. (5/8)

Possu Huang Lab @possuhuanglab.bsky.social · 21h

Rich in atomic-environment signal, SLAE features outperform PLMs and task-specific models across diverse, challenging downstream tasks, including binding affinity, thermostability and chemical shift prediction. All-atom structure pretraining is all you need! (4/8)

Possu Huang Lab @possuhuanglab.bsky.social · 21h

The SLAE latent landscape is organized in meaningful ways beyond amino acid identity. It separates residue embeddings along features including solvent accessibility, secondary structure and structural nativeness. (3/8)

Possu Huang Lab @possuhuanglab.bsky.social · 21h

We design a deliberately hard two-part task to learn compact, expressive features: a local graph encoder projects each residue’s atomic interactions into a feature vector, while a global decoder learns to compose these local environment tokens into coherent macromolecules. (2/8)

Possu Huang Lab @possuhuanglab.bsky.social · 21h

Introducing SLAE, our new framework to represent all-atom protein structures with residue local chemical environment tokens!
SLAE reasons over atomic interactions to recover structures and residue pairwise energetics, yielding a generalizable, physics-informed latent space. (1/8)

Possu Huang Lab @possuhuanglab.bsky.social · Aug 25

💻 Sampling and training code for Protpardelle-1c is now available: github.com/ProteinDesig...

Feedback and requests are welcome!

Possu Huang Lab @possuhuanglab.bsky.social · Aug 19

Code will be released soon on our GitHub: github.com/ProteinDesig...
Preprint: www.biorxiv.org/content/10.1...
Have fun sampling and training!

Conditional Protein Structure Generation with Protpardelle-1c

We present Protpardelle-1c, a collection of protein structure generative models with robust motif scaffolding and support for multi-chain complex generation under hotspot-conditioning. Enabling sidech...

www.biorxiv.org

Possu Huang Lab @possuhuanglab.bsky.social · Aug 19

Our new set of all-atom models can sample plausible sidechains without stage-2 sampling. Sequence-dependent partial diffusion behavior occurs when we mask the dummy atoms.

Possu Huang Lab @possuhuanglab.bsky.social · Aug 19

We achieve competitive results on MotifBench and the RFdiffusion/La-Proteina motif scaffolding benchmarks with both backbone-only and all-atom models, proposing scaffolds for previously unsolved problems.

Possu Huang Lab @possuhuanglab.bsky.social · Aug 19

We have a new collection of protein structure generative models which we call Protpardelle-1c. It builds on the original Protpardelle and is tailored for conditional generation: motif scaffolding and binder generation.

Possu Huang Lab @possuhuanglab.bsky.social · Jul 29

Paper: authors.elsevier.com/a/1lWEe8YyDf...

Possu Huang Lab @possuhuanglab.bsky.social · Jul 29

We include some additional analysis in the supplement, including secondary structure distributions.

Possu Huang Lab @possuhuanglab.bsky.social · Jul 29

SHAPES now published in Cell Systems!

Possu Huang Lab @possuhuanglab.bsky.social · Jan 15

New preprint from our group! We propose SHAPES, a set of metrics to quantify the distributional coverage of generative models of protein structures with embeddings at different structural hierarchies and quantify undersampling / extrapolation behaviors.

Reposted by Possu Huang Lab

Kevin K. Yang 楊凱筌 @kevinkaichuang.bsky.social · Feb 21

All-atom fixed backbone protein sequence design with FAMPNN

@richardshuai.bsky.social Talal Widatalla @possuhuanglab.bsky.social @brianhie.bsky.social

www.biorxiv.org/content/10.1...

Reposted by Possu Huang Lab

Mohammed AlQuraishi @moalquraishi.bsky.social · Jan 15

I'm organizing a Keystone symposium, along with Liz Kellogg and @possuhuanglab.bsky.social, on machine learning and macromolecules. Mar 23-26 in Keystone, Colorado. We have a great lineup and deadlines are coming up soon!

Machine Learning Applied to Macromolecular Structure and Function | Keystone Symposia

Join us at the Keystone Symposia on Machine Learning Applied to Macromolecular Structure and Function, March 2025, in Keystone, with field leaders!

www.keystonesymposia.org

Reposted by Possu Huang Lab

Kevin K. Yang 楊凱筌 @kevinkaichuang.bsky.social · Jan 15

A framework for evaluating how well generative models of protein structure match the distribution of natural structures.

@possuhuanglab.bsky.social

www.biorxiv.org/content/10.1...

Generative models capture a biased set of protein structure space

Generative models do not capture the full expressivity of PDB structures

Protein structure embeddings reveal undersampled and de novo structure space

Possu Huang Lab @possuhuanglab.bsky.social · Jan 15

Preprint: www.biorxiv.org/content/10.1...
Code: github.com/ProteinDesig...
Dataset: zenodo.org/records/1458...

Possu Huang Lab @possuhuanglab.bsky.social · Jan 15

Our supplement has many additional figures of the rasterized protein structure space, stratified by designability (designable vs. non-designable) and spatially organized by ESM3 and ProtDomainSegmentor embeddings.

Possu Huang Lab @possuhuanglab.bsky.social · Jan 15

One consequence of unbiased sampling of protein structure space is a higher likelihood of finding TERtiary Motifs (TERMs) that involve complex loops, with implications for functional protein design (see Figure 5 legend for group labels).

Possu Huang Lab @possuhuanglab.bsky.social · Jan 15

Inspired by the FPD metric in EvoDiff for protein sequence distributions, we compute Fréchet distance using protein structure embeddings, also subsetted to designable and non-designable samples (FPD-D and FPD-ND).
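As a rough illustration of the metric named in this post: the Fréchet distance between two Gaussians fitted to embedding sets is ||mu_1 - mu_2||^2 + Tr(S_1 + S_2 - 2 (S_1 S_2)^(1/2)). The sketch below is not the SHAPES implementation. To stay dependency-free it assumes diagonal covariances, which reduces the trace term to a sum of squared differences of per-dimension standard deviations. The function names and toy embeddings are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def mean_and_variance(embeddings: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Per-dimension mean and sample variance (n - 1 denominator) of a set of embeddings."""
    n = len(embeddings)
    if n < 2:
        raise ValueError("need at least two embeddings to estimate a variance")
    dim = len(embeddings[0])
    means = [sum(e[d] for e in embeddings) / n for d in range(dim)]
    variances = [sum((e[d] - means[d]) ** 2 for e in embeddings) / (n - 1) for d in range(dim)]
    return means, variances


def frechet_distance_diag(a: Sequence[Sequence[float]], b: Sequence[Sequence[float]]) -> float:
    """Fréchet distance between Gaussians fitted to a and b, ASSUMING diagonal covariances."""
    mu_a, var_a = mean_and_variance(a)
    mu_b, var_b = mean_and_variance(b)
    # Squared distance between the two mean vectors.
    mean_term = sum((x - y) ** 2 for x, y in zip(mu_a, mu_b))
    # With diagonal covariances, s_a + s_b - 2*sqrt(s_a*s_b) = (sqrt(s_a) - sqrt(s_b))**2.
    cov_term = sum((math.sqrt(p) - math.sqrt(q)) ** 2 for p, q in zip(var_a, var_b))
    return mean_term + cov_term


# Toy 2-D "embeddings" (hypothetical values): same spread, means shifted by (1, 1).
reference = [[0.0, 0.0], [2.0, 2.0]]
generated = [[1.0, 1.0], [3.0, 3.0]]
score = frechet_distance_diag(reference, generated)
```

A full-covariance version would need a matrix square root. The subsetted variants mentioned in the post (FPD-D and FPD-ND) would presumably apply the same computation to only the designable or only the non-designable samples.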

Possu Huang Lab @possuhuanglab.bsky.social · Jan 15

New preprint from our group! We propose SHAPES, a set of metrics to quantify the distributional coverage of generative models of protein structures with embeddings at different structural hierarchies and quantify undersampling / extrapolation behaviors.
