Jim Shaw
@jimshaw.bsky.social
1.1K followers 460 following 100 posts
Postdoc at Dana-Farber and Harvard Med with Heng Li (@lh3lh3.bsky.social). Prev: UBC / UofT. I like thinking about computational biological sequence analysis and its applications to metagenomics. https://jim-shaw-bluenote.github.io
Reposted by Jim Shaw
dportik.bsky.social
New pre-print from the Banfield lab, highlighting an interesting case of 1.5 Mb megaplasmids found in the human gut.

Plasmid genomes were resolved using #PacBio HiFi sequencing with hifiasm-meta for #metagenome assembly. Host association was detected using epigenetic signals.

doi.org/10.1101/2025...
Megaplasmids associate with Escherichia coli and other Enterobacteriaceae
Humans and animals are ubiquitously colonized by Enterobacteriaceae, a bacterial family that contains both commensals and clinically significant pathogens. Here, we report Enterobacteriaceae megaplas...
doi.org
Reposted by Jim Shaw
Do you know ~60% of human SVs fall in ~1% of GRCh38? See our new preprint: arxiv.org/abs/2509.23057 and the companion blog post on how we started this project and longdust: lh3.github.io/2025/09/29/o.... Work with Alvin Qin
Reposted by Jim Shaw
biorxiv-bioinfo.bsky.social
High-accuracy SNV calling for bacterial isolates using deep learning with AccuSNV https://www.biorxiv.org/content/10.1101/2025.09.26.678787v1
Reposted by Jim Shaw
zaminiqbal.bsky.social
Delighted to see our paper studying the evolution of plasmids over the last 100 years, now out! Years of work by Adrian Cazares, also Nick Thomson @sangerinstitute.bsky.social - this version much improved over the preprint. Final version should be open access, apols.
Thread 1/n
jimshaw.bsky.social
Super classy, and much respect for updating the benchmarks, Ryan. What a nice surprise. Much appreciated as a developer :).

Congrats on the huge improvements to metaMDBG, @gaetanbenoit.bsky.social
Reposted by Jim Shaw
rrwick.bsky.social
New blog post!

metaMDBG (@gaetanbenoit.bsky.social) and Myloasm (@jimshaw.bsky.social) have had recent releases, so I updated the benchmarks from the Autocycler paper:
rrwick.github.io/2025/09/23/a...

Both tools improved considerably! Time to update your conda environments 😄
Benchmark update: metaMDBG and Myloasm
a blog for miscellaneous bioinformatics stuff
rrwick.github.io
Reposted by Jim Shaw
samuelhking.bsky.social
Many of the most complex and useful functions in biology emerge at the scale of whole genomes.

Today, we share our preprint “Generative design of novel bacteriophages with genome language models”, where we validate the first functional AI-generated genomes 🧵
Reposted by Jim Shaw
annizlab.bsky.social
X-Mapper 🦠🧬🧪 - a sequence aligner developed for microbes, now on Bioconda! 🚀
• 11–24× fewer suboptimal alignments (same for human genome)
• 3–579× lower inconsistency
• improves on ~30% of reads aligned to non-target species
github.com/mathjeff/map...
bioconda.github.io/recipes/x-ma...
#microsky
Alignment algorithms represent a balance between speed and accuracy. We evaluated this balance across aligners, with accuracy measured by suboptimal alignments. We found that the relationship between alignment time (with 30 threads) and suboptimal alignment rates exhibits diminishing returns, where X-Mapper stands out as an outlier, offering the highest accuracy with competitive speed.
Reposted by Jim Shaw
New blog post – A quick look at Roche's SBX
lh3.github.io/2025/09/11/a...
jimshaw.bsky.social
Great!! Let me know what you find :)
Reposted by Jim Shaw
shenwei356.bsky.social
I sincerely appreciate the opportunity to visit @ebi.embl.org (thanks to the @embl.org Sabbatical fellowship). The guidance and support I received from Zam (@zaminiqbal.bsky.social), John (@bacpop.org) and other colleagues have been immensely valuable! You changed my career!❤️
zaminiqbal.bsky.social
Sometimes you meet absolutely incredible bioinfo-magicians. It was a huge privilege when @shenwei356.bsky.social joined our group for a year on an @embl.org sabbatical. While here, he developed a new way of aligning to millions of bacteria, called LexicMap 1/n
www.nature.com/articles/s41...
Efficient sequence alignment against millions of prokaryotic genomes with LexicMap - Nature Biotechnology
LexicMap uses a fixed set of probes to efficiently query gene sequences for fast and low-memory alignment.
www.nature.com
Reposted by Jim Shaw
bioinf.bsky.social
How do you long-read sequence metagenomes? I would argue it starts with the right sample storage & DNA extraction, to enable efficient @nanoporetech.com /@pacbio.bsky.social sequencing, which we investigated in our new paper: www.biorxiv.org/content/10.1...

Massive thanks to Klara for driving this
jimshaw.bsky.social
Thanks to co-authors @lh3lh3.bsky.social and @mgmarin.bsky.social, and to the Heng Li lab here at Dana-Farber / Harvard Med.

Many thanks to everyone who generates and deposits data.

Building an assembler from scratch has always been a goal of mine, a labour of love :).

github.com/bluenote-157...

END
GitHub - bluenote-1577/myloasm: A new high-resolution long-read metagenome assembler for even noisy reads
A new high-resolution long-read metagenome assembler for even noisy reads - bluenote-1577/myloasm
github.com
jimshaw.bsky.social
In conclusion:

1. Check out our new long-read metagenome assembler github.com/bluenote-157.... It's written from scratch, in Rust!

2. Myloasm excels on ONT R10.4 data, but works for HiFi too

3. I'm really excited by its ability to enable high-resolution sleuthing for microbiome genomics

11 / N
jimshaw.bsky.social
On a public oral ONT metagenome (from @ykiguchi.bsky.social), we assembled many more complete, closely related (within-species) genomes than previous methods.

So much to explore... for example, we compared 6 circular TM7 bacteria of > 93% ANI assembled from a single oral metagenome.

10 / N
jimshaw.bsky.social
For this gut sample, @mgmarin.bsky.social found two distinct ermF (erythromycin resistance) genes, with 98% similarity, spreading within Bacteroidota.

1. The distinct ermFs are spreading on two distinct MGEs.
2. There is even strain specificity: only 1 of 6 P. copri had it!

9 / N
jimshaw.bsky.social
With circular contigs, we can confidently analyze presence / absence of "stuff" within contigs _without worrying about binning issues_ (as much).

For example, mobile genetic elements, AMR genes that are hard to bin and assemble with short reads...?

8/N