Sam Horsfield
@samuelhorsfield.bsky.social
1.1K followers 1.3K following 32 posts
Postdoc @ EMBL-EBI, Pathogen Informatics and Modelling Group 🦠 Working on methods to study bacterial evolution and epidemiology using pangenomics 🧬
Reposted by Sam Horsfield
corriemoreau.bsky.social
UPDATE: The 2025-2026 list of faculty and postdoc positions in ecology and evolutionary biology is out! Be sure to check out this active and helpful community-run resource! docs.google.com/spreadsheets...
ecoevojobs.net 2025-26
docs.google.com
Reposted by Sam Horsfield
ebi.embl.org
There are millions of openly available microbial genomes, but searching them can be slow.

Until now 🥁

Introducing LexicMap, a new alignment tool that lets scientists search these data in minutes, helping track antibiotic resistance, trace outbreaks, and more.

www.ebi.ac.uk/about/news/r...
🦠
How to rapidly search the world’s microbial DNA
By making the world’s microbial DNA easier to explore, LexicMap helps researchers track outbreaks, study antibiotic resistance, and understand microbial diversity.
www.ebi.ac.uk
Reposted by Sam Horsfield
zaminiqbal.bsky.social
Delighted to see our paper studying the evolution of plasmids over the last 100 years, now out! Years of work by Adrian Cazares, also Nick Thomson @sangerinstitute.bsky.social - this version much improved over the preprint. Final version should be open access, apols.
Thread 1/n
Reposted by Sam Horsfield
zaminiqbal.bsky.social
If you can't face reading War and Peace or my massive thread, I was interviewed on BBC Science in Action, you can hear me 12 mins into this episode (we are not the headline paper, which was on autism):
www.bbc.co.uk/sounds/play/...
samuelhorsfield.bsky.social
v1.4.1 now available on conda!
samuelhorsfield.bsky.social
A new ggCaller version is out! v1.4 includes tweaks to improve efficiency, outputs Panaroo-friendly GFFs, and enables iterative gene calling; if you have already called a gene set, you can now add more genomes either one by one or in batches github.com/bacpop/ggCal...
GitHub - bacpop/ggCaller: Bifrost graph gene caller.
github.com
Reposted by Sam Horsfield
ewanbirney.bsky.social
Are you an AI expert who wants to stay in academia and change the world by understanding the most complex things we know - living organisms? Want to lead your own group, based in Heidelberg DE, working language English? @embl.org is hiring in AI embl.wd103.myworkdayjobs.com/en-US/EMBL/j...
Group Leader – AI in Biology
Are you ready to lead groundbreaking research in AI for Biology? Join us at EMBL! We are seeking a visionary scientist to establish their own independent research group bridging innovations in machine...
embl.wd103.myworkdayjobs.com
samuelhorsfield.bsky.social
Now works with assemblies too!
samuelhorsfield.bsky.social
A little tool I've developed: ExpEvoAnalyzer (github.com/samhorsfield...) - a snakemake pipeline that compares isolate paired-read data from an experimental evolution study to a reference isolate, producing functionally-annotated SNPs in a presence/absence matrix.
GitHub - samhorsfield96/ExpEvoAnalyzer: A workflow to analyse experimental evolution data.
github.com
Reposted by Sam Horsfield
zaminiqbal.bsky.social
Sometimes you meet absolutely incredible bioinfo-magicians.
It was a huge privilege when @shenwei356.bsky.social joined our group for a year on an @embl.org sabbatical.
While here, he developed a new way of aligning to millions of bacteria, called LexicMap 1/n
www.nature.com/articles/s41...
Efficient sequence alignment against millions of prokaryotic genomes with LexicMap - Nature Biotechnology
LexicMap uses a fixed set of probes to efficiently query gene sequences for fast and low-memory alignment.
www.nature.com
Reposted by Sam Horsfield
benpatrickwill.bsky.social
Academic authors, here's a peek into the black box of journal publishing from a journal editor if you can bear it:
Reposted by Sam Horsfield
drjorhodes.com
In just a week's time @chownbioinf.bsky.social is cycling over 200km to the @bsmm-meeting.bsky.social in Norwich, to raise money for @aspertrust.bsky.social

This is a huge feat, and for such a great cause. Please consider sponsoring Harry! www.justgiving.com/page/harry-c...
Reposted by Sam Horsfield
mikeblazanin.bsky.social
Looking forward to seeing everyone, new and old, at the Microbial Population Biology GRS + GRC in just a couple days!

go.bsky.app/GGxRjzC
samuelhorsfield.bsky.social
Really nice work guys, glad to see this is out!
Reposted by Sam Horsfield
zaminiqbal.bsky.social
Delighted to see this paper from danderson123.bsky.social's PhD out. We have been building tools for AMR gene detection for over a decade now, but multicopy genes remain challenging. Dan shows that with a gene-space de Bruijn graph and long reads, you can do well
www.biorxiv.org/content/10.1...
Reposted by Sam Horsfield
biorxiv-bioinfo.bsky.social
Amira: gene-space de Bruijn graphs to improve the detection of AMR genes from bacterial long reads https://www.biorxiv.org/content/10.1101/2025.05.16.654303v1
Reposted by Sam Horsfield
ebi.embl.org
Tracking different serotypes of Streptococcus pneumoniae can be tricky.

GNASTy is a scalable analysis method for use with portable Nanopore Adaptive Sampling for real-time detection of S. pneumoniae, helping track vaccine performance.

Find out more 👇

genome.cshlp.org/content/earl...
🧬🖥️
samuelhorsfield.bsky.social
Great to see our work on GNASTy made it into the long-read special issue at Genome Research alongside some super innovative applications and methods!
genomeresearch.bsky.social
SPECIAL ISSUE Part 2! This month @genomeresearch.bsky.social publishes a diverse collection of articles offering novel biological and clinical insights gained using long-read DNA and RNA sequencing technologies and other long molecule approaches.
tinyurl.com/Genome-Res-3...
Reposted by Sam Horsfield
hamishoz.bsky.social
Australia’s reefs are on fire 🔥
samuelhorsfield.bsky.social
A massive thanks to Basil Fok, Yuhan Fu, @paulturnermicro.bsky.social, @bacpop.org and Nick Croucher for all their hard work getting this out.
samuelhorsfield.bsky.social
We show that when a novel variant, in this case a serotype, is present in a sample but not captured in the target database, GNASTy enables greater target enrichment compared to linear alignment. GNASTy also works well on complex samples potentially containing multiple serotypes.
samuelhorsfield.bsky.social
Finally, we wanted to improve the ability of NAS to enrich novel variants not present in a target database. We developed GNASTy (Graph-based Nanopore Adaptive Sampling Typing), using graph pseudoalignment to enable flexible, and thus more sensitive, alignment of reads compared to linear references.
samuelhorsfield.bsky.social
We then targeted the Capsular Biosynthetic Locus (CBL) of S. pneumoniae, the operon which defines its serotype. We show targeting the CBL is much better at distinguishing S. pneumoniae from closely related species than using NAS for the whole genome, and can detect multiple serotypes at once.
samuelhorsfield.bsky.social
Before developing GNASTy, we first applied NAS to mock communities containing Streptococcus pneumoniae mixed with increasingly closely related bacterial strains, showing that whole genome enrichment performs worse the more closely related a target genome is to non-target genomes in a mixed sample.