Reposted by Manuel Razo

Dmitri Petrov @petrovadmitri.bsky.social · May 15

Very excited about this work by @mrazo.bsky.social in collaboration with www.madhavmani.com bsky.app/profile/mraz... I am so lucky to be able to collaborate with such brilliant people! Learned a lot. This is our first foray into ML approaches and I am quite taken with their power 1/n

Manuel Razo @mrazo.bsky.social · May 15

26/n Thank you for reading. If you have any questions or comments, please reach out!

Manuel Razo @mrazo.bsky.social · May 15

25/n We're excited to see how this approach can be applied to other problems in evolutionary biology and beyond

Manuel Razo @mrazo.bsky.social · May 15

24/n Furthermore, we have a custom website for the paper, where you can find the HTML version as well as detailed notebooks explaining the computational approaches involved in the project. Shout-out to @quarto.org for facilitating the creation of this website!
mrazomej.github.io/antibiotic_l...

Learning the Shape of Evolutionary Landscapes: Geometric Deep Learning Reveals Hidden Structure in Phenotype-to-Fitness Maps

mrazomej.github.io

Manuel Razo @mrazo.bsky.social · May 15

23/n This work bridges evolutionary biology and machine learning, showing how geometry-aware deep learning can reveal hidden structure in biological complexity. All of the @julialang.org code utilized for this project is available at github.com/mrazomej/ant...! #OpenScience

GitHub - mrazomej/antibiotic_landscape: Repository for the exploration of the fitness landscape of antibiotic resistance.

github.com

Manuel Razo @mrazo.bsky.social · May 15

22/n The broader implication: evolution may be more predictable than we thought, with organisms following constrained paths as they adapt to new environments.

Manuel Razo @mrazo.bsky.social · May 15

21/n This approach could help us better understand and predict evolutionary trajectories, with potential applications for antibiotic resistance, viral evolution, and cancer treatment.

Manuel Razo @mrazo.bsky.social · May 15

20/n As with the simulated data, our 2D nonlinear representation outperformed linear methods (like PCA) that used a significantly larger number of dimensions. Simpler AND more accurate! #DataScience

Manuel Razo @mrazo.bsky.social · May 15

19/n The results? We can represent complex resistance patterns in just two dimensions while preserving key relationships!

Manuel Razo @mrazo.bsky.social · May 15

18/n Then we applied it to real-world data: E. coli evolving under different antibiotics. For this, we used the data from the incredible paper by Iwasawa et al. 2022, who measured the fitness of these evolving strains.

Manuel Razo @mrazo.bsky.social · May 15

17/n Another cool feature of the RHVAE applied to this problem is that it allowed us to qualitatively reconstruct the underlying fitness landscapes from which the adaptive walks were drawn.

Manuel Razo @mrazo.bsky.social · May 15

16/n Moreover, with only two non-linear dimensions, the RHVAE has the same reconstruction accuracy as a 10-dimensional PCA!

Manuel Razo @mrazo.bsky.social · May 15

15/n For this, we compared the performance of a linear model (PCA), a vanilla variational autoencoder (VAE), and our RHVAE. The non-linear models accurately reconstructed the underlying structure and relationships that generated the fitness patterns.

Manuel Razo @mrazo.bsky.social · May 15

14/n From these data, we can then fit a model to reconstruct the phenotypic coordinates of each genotype using only the fitness data. In other words, given that we only get the "z-axis" in multiple environments, we ask: can we recover the relative "x" and "y" coordinates of each genotype?

Manuel Razo @mrazo.bsky.social · May 15

13/n Given this picture, we can then simulate adaptive walks in this phenotype space, and measure the fitness of the genotypes along the way.
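
A minimal sketch of such an adaptive walk, for illustration only. The single Gaussian peak, the greedy "accept only if fitter" rule, and the names `fitness` and `adaptive_walk` are assumptions made here, not the simulation scheme used in the paper.

```python
import math
import random


def fitness(phenotype, peak=(1.0, 1.0), width=1.0):
    """Fitness on a single Gaussian peak (an illustrative landscape)."""
    x, y = phenotype
    d2 = (x - peak[0]) ** 2 + (y - peak[1]) ** 2
    return math.exp(-d2 / (2 * width ** 2))


def adaptive_walk(start, n_steps, step_sd=0.1, seed=0):
    """Greedy walk: propose a random mutation, keep it only if it is fitter."""
    rng = random.Random(seed)
    current = start
    trajectory = [(current, fitness(current))]
    for _ in range(n_steps):
        proposal = (
            current[0] + rng.gauss(0.0, step_sd),
            current[1] + rng.gauss(0.0, step_sd),
        )
        if fitness(proposal) > fitness(current):
            current = proposal
        # Record the phenotype and its fitness at every step of the walk.
        trajectory.append((current, fitness(current)))
    return trajectory


walk = adaptive_walk(start=(-1.0, -1.0), n_steps=200)
# Same phenotypes, read out in a second (hypothetical) environment.
other_env = [fitness(p, peak=(-1.0, 1.0)) for p, _ in walk]

print("start fitness: %.4f" % walk[0][1])
print("final fitness: %.4f" % walk[-1][1])
```

The walk climbs in the environment it evolves in, while `other_env` shows the fitness the same phenotypes would have on a different landscape.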

Manuel Razo @mrazo.bsky.social · May 15

12/n We first tested this on simulated data. For this, we took the conceptual picture of Fisher’s geometric model seriously and imagined organisms with fixed coordinates in phenotype space, while the fitness landscape defined by the environment changes, giving different fitness readouts.
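
To make this picture concrete, here is a small sketch under stated assumptions: each environment is modeled as a sum of Gaussian peaks over a 2D phenotype space, and genotypes keep the same coordinates everywhere. The Gaussian form, the peak parameters, and the function names are hypothetical choices for illustration, not the landscapes used in the paper.

```python
import math
import random


def gaussian_peak_fitness(phenotype, peaks):
    """Fitness of a 2D phenotype on a landscape made of Gaussian peaks.

    `peaks` is a list of (center_x, center_y, height, width) tuples.
    """
    x, y = phenotype
    return sum(
        h * math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * w ** 2))
        for cx, cy, h, w in peaks
    )


def fitness_matrix(phenotypes, environments):
    """Rows are genotypes (fixed phenotypes), columns are environments."""
    return [
        [gaussian_peak_fitness(p, env) for env in environments]
        for p in phenotypes
    ]


rng = random.Random(0)
# Genotypes keep the same (x, y) coordinates in every environment.
phenotypes = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(5)]
# Each environment is a different landscape over the same phenotype space.
environments = [
    [(0.0, 0.0, 1.0, 1.0)],
    [(1.0, 1.0, 1.0, 0.5), (-1.0, -1.0, 0.5, 0.5)],
    [(-1.5, 1.0, 2.0, 0.8)],
]

F = fitness_matrix(phenotypes, environments)
for row in F:
    print(["%.3f" % f for f in row])
```

Only the matrix `F` would be observable in an experiment; the coordinates in `phenotypes` are the hidden quantities.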

Manuel Razo @mrazo.bsky.social · May 15

11/n Think of it like creating a 2D map of Earth's 3D surface, but also knowing exactly how distances on the map relate to real distances anywhere on the planet. That's what our RHVAE does with complex fitness data!
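
The map analogy can be made concrete with a short sketch. It uses the standard metric of a sphere in longitude/latitude coordinates to turn a path drawn on a flat map into a real-world length. This illustrates only the analogy; it is not the metric an RHVAE learns, and the function names are made up for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0


def metric(lat_rad):
    """Diagonal metric of the sphere in (lon, lat) map coordinates.

    ds^2 = R^2 * (cos(lat)^2 * dlon^2 + dlat^2): the entries say how much
    real distance one unit of map longitude or latitude is worth.
    """
    r2 = EARTH_RADIUS_KM ** 2
    return (r2 * math.cos(lat_rad) ** 2, r2)


def path_length_km(path_deg):
    """Real-world length of a path given as (lon, lat) points in degrees."""
    total = 0.0
    for (lon0, lat0), (lon1, lat1) in zip(path_deg, path_deg[1:]):
        dlon = math.radians(lon1 - lon0)
        dlat = math.radians(lat1 - lat0)
        # Evaluate the metric at the midpoint of each small segment.
        g_lon, g_lat = metric(math.radians((lat0 + lat1) / 2))
        total += math.sqrt(g_lon * dlon ** 2 + g_lat * dlat ** 2)
    return total


def parallel(lat_deg, lon_start, lon_end, n=1000):
    """Points along a line of constant latitude."""
    step = (lon_end - lon_start) / n
    return [(lon_start + step * i, lat_deg) for i in range(n + 1)]


# Two paths that span the same 90 degrees of longitude on the map.
equator_km = path_length_km(parallel(0.0, 0.0, 90.0))
lat60_km = path_length_km(parallel(60.0, 0.0, 90.0))
print("along the equator: %.1f km" % equator_km)
print("along 60 degrees north: %.1f km" % lat60_km)
```

Both paths look equally long on the flat map, yet the one at 60 degrees north is about half as long on the globe. The metric is what encodes that difference.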

Manuel Razo @mrazo.bsky.social · May 15

10/n To relax this assumption, we take advantage of the progress in geometric deep learning. More specifically, we use a neural network called a "Riemannian Hamiltonian Variational Autoencoder" (RHVAE) that not only reduces dimensionality but also preserves the geometric relationships between data points.

Manuel Razo @mrazo.bsky.social · May 15

9/n This strong assumption can limit our ability to uncover the true dimensionality of the adaptive phenotypic landscape, since phenotypes could be non-linearly related to fitness. This is where our paper comes in!

Manuel Razo @mrazo.bsky.social · May 15

8/n This approach involved a linear decomposition of the fitness matrix via SVD. In other words, the authors assumed that fitness is a linear function of the phenotypic features.
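
In symbols, the linear assumption can be written as a truncated SVD of the genotype-by-environment fitness matrix. The notation below is assumed here for illustration and is not taken from the original papers: $F$ is the fitness matrix with entry $F_{ge}$ for genotype $g$ in environment $e$, and $k$ is the number of retained components.

```latex
F \approx U_k \Sigma_k V_k^{\top},
\qquad
F_{ge} \approx \sum_{i=1}^{k} u_{gi}\,\sigma_i\,v_{ei}
```

Under this reading, $u_{gi}$ plays the role of genotype $g$'s value on abstract phenotype $i$, $v_{ei}$ is the weight environment $e$ places on that phenotype, and $\sigma_i$ sets the overall scale of component $i$. Fitness is then a weighted sum of the phenotypic features, which is exactly the linearity described above.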

Manuel Razo @mrazo.bsky.social · May 15

7/n The task is then to use a statistical model that takes as input some abstract phenotypic features and predicts the fitness of the genotype. In this way, Kinsler et al. 2020 found that 8 of these features were sufficient to predict their data.

Manuel Razo @mrazo.bsky.social · May 15

6/n The idea is that the GxE (genotype-by-environment) variation in fitness must be due to phenotypic effects, so the structure of that variation can be used to infer the phenotypic layer without measuring any phenotypes explicitly.

Manuel Razo @mrazo.bsky.social · May 15

5/n Our lab and others have taken an approach based on our ability to measure fitness in the lab for multiple genotypes in different environments. For example, Kinsler et al. 2020 and Ghosh et al. 2025 determined the fitness of many yeast genotypes in different environments.

Manuel Razo @mrazo.bsky.social · May 15

4/n In other words, given that we are able to track a microbial population as it evolves in the lab, is there a limited set of phenotypic changes that are frequent enough and adaptive enough (large µs in the pop gen lingo) to keep appearing in the population over and over again?

Manuel Razo @mrazo.bsky.social · May 15

3/n A simpler question one can ask is: when populations adapt to an environment, is there a limited set of phenotypic changes that dominates the adaptive process? I.e., can we uncover the dimensionality of the adaptive phenotypic landscape we observe in experimental evolution setups?
