Sina Majidian
@sinamajidian.bsky.social
1.3K followers 1.6K following 89 posts
On the academic job market | How are species compared to one another across different genomic regions? Postdoc at Langmead Lab, Johns Hopkins | Comparative #genomics at scale | Formerly at UNIL/SIB/WUR | sinamajidian.github.io
Pinned
sinamajidian.bsky.social
FastOMA is out now in Nature Methods 🎉: nature.com/articles/s41592-024-02552-8 A new orthology inference algorithm that scales linearly and is highly accurate. FastOMA can process all >2000 eukaryotic UniProt ref proteomes in <24 hours 🚀. Try it out: github.com/DessimozLab/fastoma @dessimoz.bsky.social
FastOMA retains OMA’s high precision and even improves upon it in terms of recall, positioning it on the Pareto frontier of orthology inference methods.
FastOMA is not only fast but also accurate. a, QfO benchmark: agreement with the SwissTree reference phylogeny covering manually curated gene trees. The error bars indicate 95% confidence intervals comparing FastOMA with EnsemblCompara, Domainoid, OrthoMCL, OrthoInspector, SonicParanoid, PANTHER, OrthoFinder, Hieranoid26 and the OMA family including OMA pairs, OMA groups and OMA GETHOGs (graph-based efficient technique for HOGs).

c, A computation time comparison of FastOMA and state-of-the-art alternatives.
https://www.nature.com/articles/s41592-024-02552-8
Reposted by Sina Majidian
jhucompsci.bsky.social
hopkinsdsai.bsky.social
#HopkinsDSAI welcomes 22 new faculty members, who join more than 150 DSAI faculty members across @jhu.edu in advancing the study of data science, machine learning, and #AI and their translation to a range of critical and emerging fields.

ai.jhu.edu/news/data-sc...
Reposted by Sina Majidian
robp.bsky.social
Have you recently completed (or are you finishing soon) a PhD in CS or a related discipline? Do you want to do research advancing the theory & practice of algorithmic genomics & build tools that people love to use? I'll be looking to hire a postdoc! Official ad coming soon:
docs.google.com/document/d/1...
Postdoc Description.docx
Title: Postdoctoral Associate Summary statement: The postdoctoral research associate is responsible for developing novel computational methodology for high-throughput sequence genomics tasks, as well ...
docs.google.com
sinamajidian.bsky.social
Genomics in Context Awards: collaborative research at the intersection of genomics, humanities, social sciences and bioethics
wellcome.org/research-fun...
Teams must include >1 researcher from life sciences
& >1 researcher from humanities, social sciences and bioethics
Genomics in Context Awards - Research Funding | Wellcome
These awards will support transdisciplinary teams to catalyse research discoveries at the intersection of genomics, humanities, social sciences and bioethics.
wellcome.org
Reposted by Sina Majidian
benlangmead.bsky.social
I've added 7 videos to my Burrows-Wheeler indexing playlist (www.youtube.com/playlist?lis...), rounding out the r-index series and adding a 5-part series on the move structure. Now 27 videos in that playlist. I aim to add videos on prefix-free parsing, PBWT, Wheeler languages/automata in the future.
Burrows-Wheeler Indexing - YouTube
Videos on: (a) the Burrows-Wheeler Transform (BWT), (b) the FM Index, which uses the BWT to construct a full-text index, (c) Wheeler graphs, (d) r-index, an...
www.youtube.com
Reposted by Sina Majidian
uncultured.carinilab.com
What are folks using for calling genes these days in isolate genomes: PGAP, Bakta, or Prokka? This is for a 70% GC genome of a very novel lineage.
sinamajidian.bsky.social
Advances in haplotype phasing and genotype imputation
Quan Sun & Yun Li
Nature Reviews Genetics 2025
www.nature.com/articles/s41...
a, A conceptual illustration of phasing. After read alignment with reference genome, we can infer or call genotypes of target individuals, but phase information (that is, information about which alleles are inherited together on the same parental chromosome) is unknown. Phasing is the process to make such inference starting from unphased genotype data. b, A conceptual illustration of imputation from array genotype data. Imputation is the process to infer genotypes at untyped markers with the aid of reference panels. Heuristically, it identifies haplotype segments in reference panels that match genotypes at typed markers for imputation of target individuals and then imputes by simply copying over the shared segments. In the right panel (after imputation), imputed genotypes at untyped markers for the target sample are denoted with lower-case letters, with the colour representing the corresponding reference haplotype from which the alleles are copied. c, A timeline of recent major developments in phasing and imputation, which begins from the introduction of positional Burrows–Wheeler transform (PBWT), a highly efficient method for haplotype representation that paved the road for more recent phasing and imputation methods focusing on computational improvements. A timeline of earlier evolvement (before 2018) is detailed in ref. 77. lcWGS, low-coverage whole genome sequencing; LRS, long-read sequencing.
Reposted by Sina Majidian
anaconesa.bsky.social
Looking for scientists working with long-read transcriptomics technologies to join a COST action proposal. Contact us!!! @nanoporetech.com @pacbio.bsky.social
sinamajidian.bsky.social
CADD: predicting the deleteriousness of variants throughout the human genome, 2019, NAR
doi.org/10.1093/nar/...

CADD v1.7, 2024, NAR
doi.org/10.1093/nar/...
Figure 1. The CADD framework. (A) Training a CADD model requires the identification of variants that are fixed or nearly fixed in human populations, but are absent in the inferred genome sequence of the human-ape ancestor (proxy-neutral variants). The sequence composition of this variant set is used to draw a matching set of proxy-deleterious variants. Using more than 60 diverse annotations, a machine learning model is trained to classify variants as proxy-neutral versus proxy-deleterious. All potential SNVs of the human reference genome are annotated using the same features, and raw CADD scores are calculated. A PHRED conversion table is derived from the relative ranking of these model scores. (B) Users provide variant sets in VCF, and CADD uses the chromosome, position, reference allele and alternative allele columns from these files. Scores are either retrieved from pre-scored files, or else variants are fully annotated and the CADD score is calculated. The PHRED-scaled score is then looked up in the conversion table, and both scores returned to the user. Users may request output files containing variant annotations.
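The caption's step "a PHRED conversion table is derived from the relative ranking of these model scores" can be illustrated with a minimal sketch. It assumes the commonly described convention that the scaled score is -10 * log10(rank / total), with rank 1 given to the highest raw score; the function name and the toy scores are hypothetical, and this is not CADD's actual code.

```python
import math

def phred_scale(raw_scores):
    """Rank-based PHRED-like scaling: -10 * log10(rank / N).

    Assumption: rank 1 is the highest raw score, so the top 10% of
    variants get a scaled score >= 10, the top 1% >= 20, and so on.
    Ties are broken by input order here, purely for simplicity.
    """
    n = len(raw_scores)
    # Indices of the variants, sorted from highest to lowest raw score.
    order = sorted(range(n), key=lambda i: raw_scores[i], reverse=True)
    scaled = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scaled[idx] = -10.0 * math.log10(rank / n)
    return scaled

# Toy raw scores for four hypothetical variants.
toy_scaled = phred_scale([0.1, 2.5, -1.0, 0.7])
```

In this toy set, the variant with the highest raw score gets the largest scaled value and the lowest-ranked one gets 0. A real conversion table would be built once over all possible SNVs of the reference genome and then used for lookup, as the caption describes.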
Reposted by Sina Majidian
xian-chang.bsky.social
🦒Long read giraffe is out!🦒
Mapping long reads to pangenome graphs is ~10x faster than with GraphAligner, with veeery slightly better mapping accuracy, short variant calling, and SV genotyping than GraphAligner or Minimap2
biorxiv-bioinfo.bsky.social
Rapid, accurate long- and short-read mapping to large pangenome graphs with vg Giraffe https://www.biorxiv.org/content/10.1101/2025.09.29.678807v1
Reposted by Sina Majidian
marnixmedema.bsky.social
Very important initiative! This could really help facilitate increasing data sharing as well as appropriate attribution of data creation.
alexjprobst.bsky.social
New article on equitable reuse of public sequencing data, published in @natmicrobiol.nature.com!
Led by the Data reuse core team @lhug.bsky.social @environmicrobio.bsky.social Cristina Moraru, @geomicrosoares.bsky.social, @folker.bsky.social and with Anke Heyer and The Data Reuse Consortium!
Reposted by Sina Majidian
stairwaytokevin.bsky.social
Whole-genome alignments revealed pennycress has nearly dichotomous genome compartmentalization: huge gene-poor pericentromeric regions (~300Mb; <1% genic) with frequent rearrangements and highly syntenic gene-rich chromosome arms (~150Mb; ~20% genic). What we call a "two-speed" genome structure. 3/
Figure 3 | Macrosynteny and genome structure across the Brassicaceae. Horizontal blue/black/orange bands represent the chromosomes of Arabidopsis thaliana, A. lyrata, MN106, and Brassica rapa (top to bottom). Chromosomes are ordered by their number from left to right. Colors represent genomic content binned hierarchically in sliding windows (400kb-overlapping 500kb) as follows: (1) within a gene annotation (including intron and UTR, orange), (2) within EDTA-annotated repeats categorized as Ty3, (3) Ty1 (copia), (4) within another repeat category, or (5) un-annotated. Grey bands are sequence-based syntenic blocks between each pair of genomes. Pennycress and B. rapa are phylogenetically proximate (both in Brassicodae supertribe), but have reduced synteny in part because of genome reshuffling in B. rapa following a whole-genome triplication event. The seven pennycress genome assemblies (horizontal bars) are binned into TRASH-defined centromeres (orange), pericentromeres (dark blue), chromosome arms (light blue) and telomeres (dark red). The colors along the chromosome segments scale physically with the size of the bin, except that centromeres and telomeres have a 1pt buffer to make it easier to see these typically small regions. Each genome is connected to its neighbor by grey polygons that represent sequence-based syntenic blocks. Plots, genomic bins, and syntenic blocks were built with DEEPSPACE (github.com/jtlovell/DEEPSPACE).
Reposted by Sina Majidian
dotnagy.bsky.social
Pleased to see this pre-printed, highlighting the completeness/accuracy of @nanoporetech.com long-read genome assembly for clinical Enterobacterales: www.biorxiv.org/content/10.1...

Thanks to colleagues @modmedmicro.bsky.social, @ukhsa.bsky.social, @genewiz.bsky.social and @oxfordbrc.bsky.social!
Reposted by Sina Majidian
recombconf.bsky.social
#RECOMB2026 will be in Thessaloniki, Greece on May 26-29, 2026. Satellites on May 24-25. Save the date!

The #RECOMB2026 conference will take place in Thessaloniki on May 26-29, 2026. The satellite events will be held on May 24-25, 2026. Save the date!
sinamajidian.bsky.social
NCBI Orthologs
link.springer.com/article/10.1...
Journal of Molecular Evolution
Special Issue: Quest for Orthologs
NCBI Orthologs: Public Resource and Scalable Method for Computing High-Precision Orthologs Across Eukaryotic Genomes - Journal of Molecular Evolution
Orthologs are fundamental for enabling comparative genomics analyses that further our understanding of eukaryotic biology. The unprecedented increase in the availability of high-quality eukaryotic genomes necessitates scalable and accurate methods for orthology inference. The National Center for Biotechnology Information (NCBI) developed “NCBI Orthologs”, a resource and a computational pipeline designed to meet this challenge within the NCBI RefSeq framework. This system integrates protein similarity, nucleotide alignment, and microsynteny to achieve high-precision ortholog assignments across diverse eukaryotes. The pipeline leverages high-quality RefSeq annotations and processes genomes individually, ensuring scalability. Resulting ortholog data, organized into gene-level anchored sets, enables propagation of functional annotation information and facilitates comparative genomics. Critically, these data are integrated into the NCBI Gene resource, providing users with access from various entry points. The NCBI Datasets resource provides an intuitive interface to explore orthologous relationships on the web and allows bulk data download via the web, command-line tools, and an API. We detail the methodology, including anchor species selection and the decision tree used to arrive at high-confidence one-to-one orthology relationships. NCBI Orthologs is a valuable resource for facilitating functional annotation efforts and enhancing our understanding of eukaryotic gene evolution.
link.springer.com
Reposted by Sina Majidian
arnausebe.bsky.social
Happy to share the Biodiversity Cell Atlas white paper, out today in @nature.com. We look at the possibilities, challenges, and potential impacts of molecularly mapping cells across the tree of life.
www.nature.com/articles/s41...
sinamajidian.bsky.social
oh sorry, that's right, thanks for your interest!