Sebastian Schmidt
tsbschm.bsky.social
Lecturer in Microbiome & Health at @apcmicrobiomeirel.bsky.social & @ucc.bsky.social

Alumnus @borklab.bsky.social

Microbiome, microbial ecology & metagenomics.
If you find microntology or our pre-annotated terms useful, please send praise via DM to @fullam.bsky.social and @vishnuprasoodanan.bsky.social

For complaints, use my email.

For comments and suggestions, please look here:

github.com/grp-schmidt/...

/end
grp-schmidt/microntology
microntology: a lightweight, data-driven controlled vocabulary to describe Earth's microbial habitats.
github.com
January 13, 2026 at 4:33 PM
microntology annotations for 305k metagenomes are also available directly via Zenodo:

zenodo.org/records/1816...

5/
microntology annotations of publicly available metagenomes in the European Nucleotide Archive
microntology annotations of >300k metagenomic samples from the European Nucleotide Archive (ENA). The `tsv` file contains one row per (bio)sample and the following columns: ena_project_id : ENA projec...
zenodo.org
January 13, 2026 at 4:33 PM
And here's a different view of the data, based on further categorization of samples. Each cell in the treemap corresponds to samples from one study.

I call this "The World according to Metagenomic Sampling Bias"

4/
January 13, 2026 at 4:33 PM
The idea behind microntology is to slap multiple simple descriptive terms on a sample that together best describe it. Terms are introduced based on data availability.

We semi-manually annotated 305k publicly available metagenomes; the number of samples per microntology term is shown in the plot.

3/
January 13, 2026 at 4:33 PM
In a nutshell, microntology is a very straightforward and shallow system of terms that we use to describe microbial habitats and lifestyle, for example in spire.embl.de

The list of terms we use is available on Zenodo and will receive versioned updates in the future:

zenodo.org/records/1816...

2/
SPIRE
SPIRE holds data derived from ~100k metagenomic samples in 739 studies encompassing ~500Tbp of publicly available raw metagenomes. It holds five main data types: manually curated contextual data, including annotations against a custom microntology
spire.embl.de
January 13, 2026 at 4:33 PM
Congratulations! This seems like a really cool and unique dataset, and the findings are very interesting.
December 19, 2025 at 8:20 AM
Not 100% sure exactly what you’re uploading, but genome assemblies can also be submitted to ENA (and will then be synced to NCBI afaik?). Or is this about creating/assigning new taxonomy IDs?
December 19, 2025 at 7:08 AM
ENA exists though…? Or is that not an option due to grant or institutional constraints?
December 18, 2025 at 10:29 PM