Yunha Hwang
@microyunha.bsky.social
Building genomic intelligence @ Tatta Bio
microyunha.bsky.social
At Tatta Bio, we have been thinking deeply about the sequence-to-function problem. We believe that before AI can power functional prediction, we first need to rethink how we curate, manage, and share sequence data. Here, we share our initial ideas on what we are building next:
Today's sequence data infrastructure is set up for failure in the age of AI.
Building an open and collaborative sequence platform for both Human and AI scientists.
tattabio.substack.com
Reposted by Yunha Hwang
axelvisel.bsky.social
Ready to explore New Lineages of Life with @jgi.doe.gov ? 🧬🦠

Registration for our 2025 NeLLi Symposium is now open. For the first time in collaboration with @unlv.edu

Mark the date: November 6-7 in Las Vegas, NV
jgi.doe.gov
You can now register for the November NeLLi Symposium!

Join us in November for talks focused on the most recent expansions of the Tree of Life, and the latest discoveries toward the evolution of cellular complexity and microbial symbiosis. 

Learn more: jointgeno.me/NeLLi

@nigelmouncey.bsky.social
2025 NeLLi Symposium | Joint Genome Institute
Las Vegas, NV Immediately followed by 1.5-day jamboree on November 6-7
jointgeno.me
microyunha.bsky.social
We are building this infrastructure for the scientific community, and we invite feedback and collaboration from researchers at every stage. We are grateful to the Moore Foundation for their generous support in making this project possible. Stay tuned for more updates!

www.tatta.bio/gaia
Gaia — Tatta Bio
www.tatta.bio
microyunha.bsky.social
I am so grateful for all the support I received from my mentors, colleagues and collaborators over the years: @pgirguis.bsky.social, @sokrypton.org, @simrouxvirus.bsky.social, @alexjprobst.bsky.social, @annedekas.bsky.social
microyunha.bsky.social
It’s been an incredible journey building Tatta Bio with @ancornman1.bsky.social to advance AI infrastructure for biology, and I will continue to further our mission as chief scientist.
microyunha.bsky.social
My lab will couple ML and high throughput experimentation to harness the remarkable functional diversity of microbial genomes. If you are excited about the intersection of AI and microbiology, please get in touch!
microyunha.bsky.social
It’s official! 🎉 I’m thrilled to announce that I will be joining MIT as an assistant professor in a shared appointment between Biology, EECS and Schwarzman College of Computing this fall.
microyunha.bsky.social
Tatta Bio is growing! We are hiring *two positions* in Business Development and Software Engineering to lead the development of AI-enabled scientific software for open science and biological sequence interpretation. Please check out the job postings at www.tatta.bio/careers and share widely!
Job Board | Notion
Overview
www.tatta.bio
microyunha.bsky.social
Our thoughts too! (stay tuned👀) 😉
microyunha.bsky.social
As we improve Gaia Agent, we want to hear your feedback on the agent predictions. If you have suggestions on how we can increase its capabilities, please reach out! This was a major collaborative effort with @cong-ml.bsky.social, @joshuakravitz.com, @nishantjha.org, @ancornman1.bsky.social @Tatta Bio
microyunha.bsky.social
We tested Gaia Agent's capabilities with hypothetical genes in Mycobacterium tuberculosis. In our blog, we detail our in silico validation of Gaia Agent-predicted membrane transporter and lanthipeptide biosynthesis loci that were uncharacterized despite decades of Mtb research. Read more:
Gaia Agent: Context-Aware Functional Insights at Scale — Tatta Bio
An AI biologist discovers previously uncharacterized systems in the Mtb genome.
tatta.bio
microyunha.bsky.social
Like a human biologist, Gaia Agent considers sequence, structure and genomic context to *think* about functions of novel genes, drastically accelerating our ability to predict functions of billions of unannotated proteins across the tree of life.
microyunha.bsky.social
Can LLM agents discover novel protein functions? Introducing Gaia Agent 🌎 🤖: an AI biologist capable of reasoning across genomic contexts to predict functions of proteins! Gaia Agent is now integrated with Gaia Search at gaia.tatta.bio
microyunha.bsky.social
If you are at #NeurIPS2024 don't miss @ancornman1.bsky.social's talk on OMG/gLM2 at 9AM! @workshopmlsb.bsky.social East meeting room 11,12
Reposted by Yunha Hwang
mmzdouc.bsky.social
Are you working on natural products? We’ve just released version 4.0 of the MIBiG data standard and repository! It now includes 3059 biosynthetic gene clusters, thanks to the combined efforts of 288 expert contributors. A thread: (1/8) academic.oup.com/nar/advance-...
MIBiG 4.0: advancing biosynthetic gene cluster curation through global collaboration
Abstract. Specialized or secondary metabolites are small molecules of biological origin, often showing potent biological activities with applications in ag
academic.oup.com
Reposted by Yunha Hwang
amyxlu.bsky.social
1/🧬 Excited to share PLAID, our new approach for co-generating sequence and all-atom protein structures by sampling from the latent space of ESMFold. This requires only sequences during training, which unlocks more data and annotations:

bit.ly/plaid-proteins
🧵
overview of results for PLAID!
microyunha.bsky.social
you can search for eukaryotic sequences too, and you might find interesting homology to microbial proteins! (the current database you search against is microbial)
Reposted by Yunha Hwang
martinsteinegger.bsky.social
Our Big Fantastic Virus Database (BFVD) is now published in NAR! It contains protein structure predictions of major viral clades, enhanced by petabase-scale homology search, and it's explorable on the web.
🌐 bfvd.foldseek.com
💾 bfvd.steineggerlab.workers.dev
📄 academic.oup.com/nar/advance-...
microyunha.bsky.social
Great question, translation tables 11 and 4 should be covered, and we have seen translation table 15 being accounted for in some cases. @apcamargo.bsky.social
microyunha.bsky.social
Thank you! We are building additional features (e.g. bookmarks, tags, comments), stay tuned for updates!
microyunha.bsky.social
Great suggestion -- noted!
microyunha.bsky.social
We cluster all protein embeddings across all 100 retrieved contexts, and then the top 5 most frequently occurring clusters are colored!
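For readers who want to see the coloring idea in code, here is a minimal sketch. It is not Gaia's implementation: the post does not say which clustering method is used, so the greedy cosine-similarity clustering, the 0.9 threshold, the palette, and the function names below are all assumptions made for illustration. It pools embeddings from every retrieved context, clusters them, counts how many proteins land in each cluster, and colors the top 5.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def greedy_cluster(embeddings, threshold=0.9):
    """Hypothetical stand-in for the real clustering step.

    Each embedding joins the first cluster whose representative
    (its first member) is at least `threshold` similar; otherwise
    it starts a new cluster. Returns one cluster id per embedding.
    """
    reps, labels = [], []
    for emb in embeddings:
        for cid, rep in enumerate(reps):
            if cosine(emb, rep) >= threshold:
                labels.append(cid)
                break
        else:
            reps.append(emb)
            labels.append(len(reps) - 1)
    return labels


def color_top_clusters(contexts, top_n=5, threshold=0.9,
                       palette=("red", "orange", "green", "blue", "purple"),
                       default="grey"):
    """Color proteins in the `top_n` most frequent clusters.

    `contexts` is a list of genomic contexts, each a list of protein
    embeddings. Frequency here means number of proteins per cluster
    (an assumption; counting contexts per cluster is another option).
    """
    # Pool every protein embedding across all contexts before clustering.
    flat = [emb for ctx in contexts for emb in ctx]
    labels = greedy_cluster(flat, threshold)
    top = [cid for cid, _ in Counter(labels).most_common(top_n)]
    color_of = {cid: palette[i % len(palette)] for i, cid in enumerate(top)}
    # Map the flat labels back onto the per-context layout.
    it = iter(labels)
    return [[color_of.get(next(it), default) for _ in ctx] for ctx in contexts]


contexts = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.01], [0.0, 1.0]],
    [[1.0, 0.0]],
]
colors = color_top_clusters(contexts)
```

With 100 retrieved contexts the same call applies unchanged; proteins outside the top 5 clusters fall back to the default color.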