Samuel King
@samuelhking.bsky.social
Stanford Bioengineering PhD candidate / Biological AI in Brian Hie’s lab at Arc Institute https://samuelking.cargo.site
Reposted by Samuel King
arcinstitute.org
In a new preprint from @brianhie.bsky.social's lab, the team reports the first generative design of viable bacteriophage genomes.

Leveraging Evo 1 & Evo 2, they generated whole genome sequences, resulting in 16 viable phages with distinct genomic architectures.
samuelhking.bsky.social
To explore the utility of our genome design method for creating resilient phage therapies, we evolved a generated phage cocktail against three different ΦX174-resistant E. coli strains. The generated cocktail rapidly overcame resistance against all strains while ΦX174 did not.
samuelhking.bsky.social
By directly competing the phages against each other, we observed several generated phages that outcompeted ΦX174 or showed faster lytic dynamics, highlighting our method's ability to design high-fitness mutations.
samuelhking.bsky.social
The viable generated phages harbored hundreds of novel mutations, many of which do not map to any sequence seen in nature. The cryo-EM structure of one phage revealed a genome packaging mechanism designed by Evo that was previously found lethal in rational engineering attempts.
samuelhking.bsky.social
We synthesized and tested 285 generated phage genomes in E. coli C. Of these, 16 inhibited growth in E. coli C but showed no off-target infection in E. coli strains outside of ΦX174’s natural range, demonstrating the intended host specificity.
samuelhking.bsky.social
By fine-tuning Evo 1 and Evo 2 on Microviridae sequences, we honed the models’ understanding of ΦX174-like genomes, which allowed us to generate sequences fulfilling our design criteria with a high success rate.
samuelhking.bsky.social
ΦX174 is a small Microviridae phage that infects its host E. coli C. It has a very intricate genetic architecture, making it a challenging template. We based our design criteria on ΦX174 and Microviridae sequences, including a “tropism constraint” for host specificity.
samuelhking.bsky.social
We first needed clear design criteria to guide our genome generation process. As a design template, we chose ΦX174, a classic phage in molecular biology, which was the first genome ever sequenced and synthesized.
samuelhking.bsky.social
But can DNA language models generate complete, viable genomes? To investigate this, we developed a modular framework for designing phages that target a chosen bacterium, to maximize benefit for phage-based biotechnologies and therapeutics.
samuelhking.bsky.social
DNA language models such as Evo 1 and Evo 2, trained on millions of genomes, learn complex features of genomes at an unfathomable scale. These models work much like ChatGPT, except for DNA. We’ve previously shown that they can generate novel CRISPR-Cas systems, amongst others.
brianhie.bsky.social
We trained a genomic language model on all observed evolution, which we are calling Evo 2.

The model achieves an unprecedented breadth in capabilities, enabling prediction and design tasks from molecular to genome scale and across all three domains of life.
samuelhking.bsky.social
Designing a genome is an incredibly complex task. The overwhelming number of considerations has limited what we’ve previously been able to achieve in synthetic biology.
samuelhking.bsky.social
We chose to generate bacteriophage genomes, given their utility in biotechnology and therapeutics, and because they are safe and feasible to test in the lab. Phages are viruses that infect and kill bacteria, and are emerging as a promising strategy to combat rising antibiotic resistance.
samuelhking.bsky.social
I’ll start by recognizing that this work wouldn’t have been possible without the incredible support of my PhD advisor @brianhie, and the brilliant labmates and scientists who I had the honor of working with:
samuelhking.bsky.social
Many of the most complex and useful functions in biology emerge at the scale of whole genomes.

Today, we share our preprint “Generative design of novel bacteriophages with genome language models”, where we validate the first functional AI-generated genomes 🧵
Reposted by Samuel King
adititm.bsky.social
Excited to have the first project of my PhD out!! By leveraging genomic language model Evo’s ability to learn relationships across genes (i.e., "know a gene by the company it keeps"), we show that we can use prompt-engineering to generate highly divergent proteins with retained functionality. 🧵1/N