Ruoshi
@ruoshiz.bsky.social
computational methods for metagenomics | former Söding Lab MPI-NAT
Reposted by Ruoshi
martinsteinegger.bsky.social
MMseqs2-GPU sets new standards in single query search speed, allows near instant search of big databases, scales to multiple GPUs and is fast beyond VRAM. It enables ColabFold MSA generation in seconds and sub-second Foldseek search against AFDB50. 1/n
📄 www.nature.com/articles/s41...
💿 mmseqs.com
GPU-accelerated homology search with MMseqs2 - Nature Methods
Graphics processing unit-accelerated MMseqs2 offers tremendous speedups for homology retrieval from metagenomic databases, query-centered multiple sequence alignment generation for structure predictio...
www.nature.com
ruoshiz.bsky.social
Spacedust is a collaborative effort with the amazing @milot.bsky.social and Johannes Soeding.
6/6
ruoshiz.bsky.social
Spacedust recovers previously annotated gene clusters, e.g. operons, antiviral defense systems, and BGCs. It also identifies several more instances of CRISPR subtype III-E in the GTDB. 4/6
ruoshiz.bsky.social
We ran an all-vs-all search of 1308 bacterial genomes from different genera. 1) Spacedust assigns 58% of all 4.2M genes & 35% of the unannotated genes to conserved gene clusters. 2) It offers better functional association prediction based on the congruence of KEGG module IDs.
3/6
ruoshiz.bsky.social
Spacedust finds all gene clusters significantly conserved between any two genomes in a set of input genomes. It iteratively merges smaller clusters to maximize the statistical significance of the degree of clustering and order/strand conservation, which makes it reference-free.
2/6
ruoshiz.bsky.social
Happy to see that Spacedust is now published in Nature Methods!
It combines sensitive Foldseek structure search and conserved neighborhood detection to discover functionally-associated gene clusters in prokaryotic & viral genomes.
1/6🧵