Mohsen Zakeri
@mohsenzakeri.bsky.social
630 followers 130 following 15 posts
Postdoctoral Fellow at Johns Hopkins University, Computational Biology ❤️ www.mohsenzakeri.com
Reposted by Mohsen Zakeri
sinamajidian.bsky.social
Great talk by Vikram @vikramshivakumar.bsky.social on studying pangenomes and synteny visualization in #WABI25
Github: github.com/vikshiv/mume...
First paper: genomebiology.biomedcentral.com/articles/10....
Second: www.biorxiv.org/content/10.1... #WABI2025
(A) Anchor-based merging requires a common sequence (red) present in each partition. Multi-MUMs are merged by identifying overlaps between partition-specific matches in the anchor coordinate space, and a uniqueness threshold determines if a MUM is still unique in each partition after truncation. (B) String-based merging enables computation of multi-MUMs between partitions without a common sequence. An example tree (left) is shown, highlighting the use case where partial multi-MUMs specific to internal nodes (starred) can be computed by merging subclade-based partitions up a tree. (right) MUM overlaps are computed by running Mumemto on the MUM sequences, and the uniqueness threshold array ensures overlaps remain unique across the merged dataset. (C) An example Burrows-Wheeler Transform (BWT), matrix (BWM), and Longest Common Prefix (LCP) array, with sequence IDs for each suffix shown (ID). A non-maximal unique match (UM) is shown, and the uniqueness threshold for this match is found using the flanking LCP values. (D) A partial multi-MUM (in blue) is found in all-but-one sequence (excluded in red). Using two anchor sequences (red and orange), all-but-one partial MUMs can be computed using an augmented anchor-based merging method.
(A) Phylogeny of geographically diverse A. thaliana accessions (Lian et al. 2024), with broad geographical regions colored. Internal nodes are labeled with the coverage of partial multi-MUMs across the leaves of each node. Internal node partial MUMs are computed by merging subtree-based partitions progressively up the phylogeny. (B) Global multi-MUM synteny across the full dataset shown in blue (with inversions in green). Global MUMs are computed by merging all partitions together (representing the root node). Additionally, three geographically distinct subgroups are highlighted and partition-specific multi-MUMs (in purple, with inversions in pink) reveal local structural variation in centromeric regions.
Reposted by Mohsen Zakeri
robp.bsky.social
The 25th iteration of the excellent Conference for Algorithms in Bioinformatics (WABI) starts tomorrow at UMD @umdscience.bsky.social at the Brendan Iribe Center. You can find details at the website wabiconf.github.io/2025/. We'll use the tag #WABI25 for the meeting!
WABI 2025
WABI Conference on Algorithms in Bioinformatics
wabiconf.github.io
Reposted by Mohsen Zakeri
robp.bsky.social
The second keynote address at WABI '25 will be by Christina Boucher. She will talk about "Recursive Parsing and Grammar Compression in the Era of Pangenomics". PFP (& RPFP) has enabled tremendous advances in representation & indexing; this will be an exciting talk!
wabiconf.github.io/2025/talks/t...
Recursive Parsing and Grammar Compression in the Era of Pangenomics
Talk by Christina Boucher - WABI 2025
wabiconf.github.io
Reposted by Mohsen Zakeri
kuanhaochao.bsky.social
Excited to introduce LiftOn – an open-source tool for accurate, scalable liftover of genome annotations (GFF) across assemblies. 🚀

👉 Code & community: github.com/Kuanhao-Chao...

It’s been incredibly rewarding building this for the genomics community. Can’t wait for your feedback and contributions!
Reposted by Mohsen Zakeri
benlangmead.bsky.social
Excellent work, Steven & Mohsen! See thread below
mohsenzakeri.bsky.social
5/5 Processing the reads with Movi Color is as fast as Kraken 2, and 20x faster than Metabuli’s total query time. Movi Color is able to index sets of complete genomes from many species, but uses significantly more memory. The memory footprint can be reduced by using minimizer-digestion approaches.
mohsenzakeri.bsky.social
4/5 Movi Color is 2x more accurate than Kraken 2 and Metabuli for taxonomic classification of ONT reads at the species level.
mohsenzakeri.bsky.social
3/5 Movi Color classifies a read based on the colors observed during the pseudo matching lengths (PML) computation procedure.
mohsenzakeri.bsky.social
2/5 Movi Color adds colors to BWT runs. Like in colored de Bruijn graphs, colors are sets of documents, defined based on the origin of the suffixes in each BWT run. Each distinct color is stored once in the color table.
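To make the idea of coloring BWT runs concrete, here is a minimal Python sketch. It is not Movi Color's implementation: it builds the suffix array naively, terminates every document with the same `$` separator, and the function name `build_run_colors` is hypothetical. It only illustrates how each run can point to a deduplicated set of source documents.

```python
def build_run_colors(docs):
    """Toy illustration: assign a color (set of document IDs) to each BWT run.

    Assumptions (not taken from Movi Color): naive suffix sorting, one '$'
    terminator per document, and colors stored as frozensets.
    """
    text = "".join(d + "$" for d in docs)

    # doc_of[i] = ID of the document that text position i belongs to
    doc_of = []
    for doc_id, d in enumerate(docs):
        doc_of.extend([doc_id] * (len(d) + 1))

    # Naive suffix array; fine for tiny inputs only
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT character = character preceding each suffix (wraps around at 0)
    bwt = [text[i - 1] for i in sa]

    color_ids = {}    # frozenset of doc IDs -> index into color_table
    color_table = []  # each distinct color stored exactly once
    runs = []         # (character, run length, color ID)

    start = 0
    for end in range(1, len(bwt) + 1):
        if end == len(bwt) or bwt[end] != bwt[start]:
            # The run covers rows start..end-1; collect the origins of its suffixes
            color = frozenset(doc_of[sa[i]] for i in range(start, end))
            if color not in color_ids:
                color_ids[color] = len(color_table)
                color_table.append(color)
            runs.append((bwt[start], end - start, color_ids[color]))
            start = end
    return runs, color_table


runs, color_table = build_run_colors(["ACGT", "ACGA"])
for ch, length, color_id in runs:
    print(ch, length, sorted(color_table[color_id]))
```

Because runs only store a color ID, many runs that share the same document set cost one table entry in this sketch.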
mohsenzakeri.bsky.social
1/5 We introduce Movi Color, led by Steven Tan (a brilliant undergrad member of Langmead lab) for taxonomic and multi-class classification. It uses a full-text index based on the move structure and does not rely on predefined values (like k-mer length) for index building.
github.com/mohsenzakeri...
Reposted by Mohsen Zakeri
vikramshivakumar.bsky.social
Excited to share a new update to Mumemto, scaling MUM and conserved element finding to any size pangenome! Preprint out now w/ @benlangmead.bsky.social.
Mumemto scales to the new HPRC v2 release and beyond, and can merge in future assemblies without any recomputation! 1/n
Partitioned Multi-MUM finding for scalable pangenomics
Pangenome collections are growing to hundreds of high-quality genomes. This necessitates scalable methods for constructing pangenome alignments that can incorporate newly-sequenced assemblies. We prev...
www.biorxiv.org
Reposted by Mohsen Zakeri
robp.bsky.social
The deadline for WABI 2025 has been extended (but is still rapidly approaching) wabiconf.github.io/2025/

* abstract deadline: May 12 (AoE)
* paper deadline: May 15 (AoE)

Consider submitting your exciting algorithmic bioinformatics work to the WABI conference!
Reposted by Mohsen Zakeri
arun-das.bsky.social
I'll also be on the job market this summer, so please reach out if you're interested!

You can find out more about me at these links:
LinkedIn: www.linkedin.com/in/arun96/
Personal Website: arundas.org
Arun Das
arundas.org
Reposted by Mohsen Zakeri
imartayan.bsky.social
Next up is Nathaniel Brown from @benlangmead.bsky.social's group presenting col-bwt, a new algorithm for computing chain statistics using multi-maximal unique matches.

www.biorxiv.org/content/10.1...
Reposted by Mohsen Zakeri
robp.bsky.social
Hey #genomics, #bioinformatics & #algorithms peeps 💻🧬. If you haven't seen the CfP for WABI '25 yet, check out the website wabiconf.github.io/2025/. It will be held at UMD @umdscience.bsky.social with Broňa Brejová & myself as co-chairs! Submit your exciting & late-breaking algorithmic work to WABI
Reposted by Mohsen Zakeri
robp.bsky.social
On Thurs, March 13 at 9AM (ET), @noorpratap.bsky.social will be defending his dissertation!

If you want to learn more about tree-based quantification & differential testing, or scATAC-seq preprocessing, tune in!

Talk link: umd.zoom.us/j/9873133564...

Abstract: talks.cs.umd.edu/talks/4137
Talks
talks.cs.umd.edu
Reposted by Mohsen Zakeri
vikramshivakumar.bsky.social
We ran Mumemto on 474 human assemblies from @humanpangenome.bsky.social to find syntenic regions using MUMs. Mumemto scales remarkably well to large pangenomes thanks to compressed-space algos! It took under 2 days across 7 nodes (each using ~500 GB memory).
Reposted by Mohsen Zakeri
recombseq.bsky.social
🚨 Keynotes at RECOMB-seq 2025! 🚨

🌟 Alicia Oshlack – computational transcriptomics
@aliciao.bsky.social

🌟 Rayan Chikhi – sequencing data structures
@rayanchikhi.bsky.social

🗓️ Dates: April 24–25, 2025
📍 Seoul, South Korea

recomb-seq.github.io/speakers/
Reposted by Mohsen Zakeri
benlangmead.bsky.social
Very excited to see Movi (by @mohsenzakeri.bsky.social) now out in iScience: www.cell.com/iscience/ful.... Movi builds on the "move structure" pangenome index, a compressed full-text index and close cousin to r-index. Compared to r-index, the move structure is simpler and more cache-efficient.
Movi: A fast and cache-efficient full-text pangenome index
Biocomputational method; Classification of bioinformatical subject; Genomic analysis
www.cell.com
mohsenzakeri.bsky.social
4/4 Movi is now capable of performing count queries with the backward search procedure, which is now implemented for the move structure. Movi is 16 times faster than r-index while using about 3 times more memory to perform the count query.
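For readers unfamiliar with backward search, the following is a small Python sketch of a count query. It uses a textbook FM-index with a full occurrence table, not the move structure and not Movi's code; the names `build_fm_index` and `count` are hypothetical, and the naive construction only suits toy inputs.

```python
def build_fm_index(text):
    """Build a toy FM-index (assumption: plain C array + full Occ table)."""
    text += "$"
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)

    # counts[c] = number of occurrences of c in the text
    counts = {}
    for ch in bwt:
        counts[ch] = counts.get(ch, 0) + 1

    # C[c] = number of characters in the text strictly smaller than c
    C = {}
    total = 0
    for ch in sorted(counts):
        C[ch] = total
        total += counts[ch]

    # occ[c][i] = number of occurrences of c in bwt[:i]
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return C, occ, len(bwt)


def count(pattern, index):
    """Return how many times pattern occurs, via backward search."""
    C, occ, n = index
    lo, hi = 0, n  # half-open interval of suffix array rows
    for ch in reversed(pattern):
        if ch not in C:
            return 0
        # LF-mapping step: shrink the interval to rows preceded by ch
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


index = build_fm_index("ACGTACGA")
print(count("ACG", index))
```

The pattern is consumed right to left, and each step narrows the range of matching suffixes; the move structure answers the same kind of query with a different, more cache-friendly layout.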
mohsenzakeri.bsky.social
3/4 Prefetching uses a single thread while processing many reads concurrently. Using prefetching, the median latency observed for Movi’s inner loop is 91 ns.