Soumya Kundu
@soumyakundu.bsky.social
CS PhD Candidate at Stanford. Working at the intersection of Machine Learning, Regulatory Genomics, and Complex Disorders
Pinned
This was a really fun collaboration with @amarderstein.bsky.social where we explored some of the interesting relationships between context-specific non-coding variant effects, disease, and evolution using deep learning models of chromatin accessibility in the brain and heart.
Reposted by Soumya Kundu
Excited for a major milestone in our efforts to map enhancers and interpret variants in the human genome:

The E2G Portal! e2g.stanford.edu

This collates our predictions of enhancer-gene regulatory interactions across >1,600 cell types and tissues.

Use cases 👇

1/
September 18, 2025 at 4:14 PM
Reposted by Soumya Kundu
But how does this relate to human disease?? Through an awesome collaboration with the @anshulkundaje.bsky.social lab, we trained ChromBPNet models with scATACseq datasets for each cell type and vascular site, and predict human variant effect on a cell type/site basis @soumyakundu.bsky.social
September 10, 2025 at 3:54 PM
Reposted by Soumya Kundu
I am tremendously excited to share our work revealing the epigenomic landscape of single vascular cells. We discover that enhancers are not only cell type but vascular site specific and regulate the genetic drivers of disease risk. Let's dive in! 🧬👇 #epigenetics
www.embopress.org/doi/full/10....
Epigenomic landscape of single vascular cells reflects developmental origin and disease risk loci | Molecular Systems Biology
Vascular sites have distinct susceptibility to disease. Here, through single cell epigenomic profiling and predictive machine learning modeling, this study revealed that regulatory enhancers are vascular site specific, providing insight ...
www.embopress.org
September 10, 2025 at 3:54 PM
Reposted by Soumya Kundu
Thanks to @riyavsinha.bsky.social in my lab, the IGV browser will natively support dynseq (dynamic sequence tracks) in an upcoming release. These tracks are very useful to directly visualize base-resolution scores (e.g. contribution scores from ML models, conservation etc). 1/
July 29, 2025 at 3:06 PM
Reposted by Soumya Kundu
New preprint alert!

Corgi imitates cellular gene regulation and integrates DNA sequence and trans-regulator information.

This allows Corgi to make accurate predictions in unseen cell types. Also, it can simulate trans-regulator perturbations in silico.
June 26, 2025 at 7:46 AM
Reposted by Soumya Kundu
Had a lot of fun writing this “tools of the trade” highlight for our Variant-EFFECTS technology. Check it out! 🛠️
June 11, 2025 at 2:01 PM
Reposted by Soumya Kundu
Excited to share my first PhD paper in the @sbmontgom.bsky.social lab with @tamigj.bsky.social (www.biorxiv.org/content/10.1...)! Standard QTL methods treat each gene independently. But what if a single variant regulates multiple nearby genes at once - what we call “allelic proxitropy”? 🧵 ⬇️
June 8, 2025 at 5:39 PM
Reposted by Soumya Kundu
🧠 Excited to share my main PhD project! We mapped the regulatory rules governing Glioblastoma plasticity using single-cell multi-omics and deep learning. This work is part of a two-paper series with @bayraktarlab.bsky.social @oliverstegle.bsky.social and @moritzmall.bsky.social, Preprint at end🧵👇
May 16, 2025 at 10:05 AM
Reposted by Soumya Kundu
Today was a big day for the lab. We had two back-to-back thesis defenses, and the defenders defended with great science and character.

Congrats to DR. Kelly Cochran & DR. @soumyakundu.bsky.social on this momentous achievement.

Brilliant scientists with brilliant futures ahead. 🎉🎉🎉
May 15, 2025 at 5:19 AM
Reposted by Soumya Kundu
Delighted to share our latest work deciphering the landscape of chromatin accessibility and modeling the DNA sequence syntax rules underlying gene regulation during human fetal development! www.biorxiv.org/content/10.1... Read on for more: 🧵 1/16 #GeneReg 🧬🖥️
Dissecting regulatory syntax in human development with scalable multiomics and deep learning
Transcription factors (TFs) establish cell identity during development by binding regulatory DNA in a sequence-specific manner, often promoting local chromatin accessibility, and regulating gene expre...
www.biorxiv.org
May 3, 2025 at 6:27 PM
Reposted by Soumya Kundu
Our preprint on designing and editing cis-regulatory elements using Ledidi is out! Ledidi turns *any* ML model (or set of models) into a designer of edits to DNA sequences that induce desired characteristics.

Preprint: www.biorxiv.org/content/10.1...
GitHub: github.com/jmschrei/led...
Programmatic design and editing of cis-regulatory elements
The development of modern genome editing tools has enabled researchers to make such edits with high precision but has left unsolved the problem of designing these edits. As a solution, we propose Ledi...
www.biorxiv.org
April 24, 2025 at 12:59 PM
Reposted by Soumya Kundu
I am elated to share that our manuscript describing Variant-EFFECTS, a high-throughput technology we developed to precisely and quantitatively measure the effects of CRISPR-mediated edits on gene expression, is now published at @cellpress.bsky.social: authors.elsevier.com/c/1kxgiL7PXu...
authors.elsevier.com
April 17, 2025 at 6:20 PM
Reposted by Soumya Kundu
Thrilled that our work on coronary dominance made the cover of @cellpress.bsky.social! This beautiful image is thanks to the incredible work of @pamrc.bsky.social! 😍

#CardioSky

www.cell.com/cell/fulltex...
April 3, 2025 at 4:53 PM
Reposted by Soumya Kundu
Disease diagnostics using machine learning of B cell and T cell receptor sequences

www.science.org/doi/10.1126/...

TL;DR: BCRs ARE ALL YOU NEED!

(Well actually .... keep reading) 1/
Disease diagnostics using machine learning of B cell and T cell receptor sequences
Clinical diagnosis typically incorporates physical examination, patient history, various laboratory tests, and imaging studies but makes limited use of the human immune system’s own record of antigen ...
www.science.org
February 21, 2025 at 1:12 AM
This was a really fun collaboration with @amarderstein.bsky.social where we explored some of the interesting relationships between context-specific non-coding variant effects, disease, and evolution using deep learning models of chromatin accessibility in the brain and heart.
February 19, 2025 at 7:11 PM
Reposted by Soumya Kundu
Modern GWAS can identify 1000s of significant hits but it can be hard to turn this into biological insight. What key cellular functions link genetic variation to disease?

I'm very excited to present our new work combining associations and Perturb-seq to build interpretable causal graphs! A 🧵
January 26, 2025 at 12:13 AM
Reposted by Soumya Kundu
Our ChromBPNet preprint is out!

www.biorxiv.org/content/10.1...

Huge congrats to Anusri! This was quite a slog (for both of us) but we r very proud of this one! It is a long read but worth it IMHO. Methods r in the supp. materials. Bluetorial coming soon below 1/
December 25, 2024 at 11:48 PM
Reposted by Soumya Kundu
The final chapter of my PhD thesis is now out! 🎉 We compared the latest gene regulatory network (#GRN) inference methods for #single-cell multimodal datasets and evaluated their performance across various tasks. Hard to believe this journey started in March 2021 and has finally reached this point 😅🥳
We present Gene Regulatory nETwork Analysis (GRETA), a framework to infer, compare and evaluate gene regulatory networks #GRNs. With it, we have benchmarked multimodal and unimodal GRN inference methods. Check the results here 👇
Paper: doi.org/10.1101/2024.12.20.629764
Code: github.com/saezlab/greta
December 23, 2024 at 8:49 AM
Reposted by Soumya Kundu
1/ 🎄 What’s the best gift under the tree for a computational biologist? 🎁 A new experimental assay that refines our view of gene regulation: ACCESS-ATAC! This creative idea from Richard Sherwood was developed collaboratively between his lab and mine. #Genomics
www.biorxiv.org/content/10.1...
December 23, 2024 at 3:18 PM
Reposted by Soumya Kundu
I've been working to make designing regulatory DNA that exhibits desired characteristics easier for everyone. With the following series of tools, you can go from a blank slate to designed edits in ~30 minutes using only a V100. That includes file downloading and model training.
December 10, 2024 at 5:58 PM
Reposted by Soumya Kundu
(1/10) Excited to announce our latest work! @arpita-s.bsky.social, @amanpatel100.bsky.social, and I will be presenting DART-Eval, a rigorous suite of evals for DNA Language Models on transcriptional regulatory DNA at #NeurIPS2024. Check it out! arxiv.org/abs/2412.05430
DART-Eval: A Comprehensive DNA Language Model Evaluation Benchmark on Regulatory DNA
Recent advances in self-supervised models for natural language, vision, and protein sequences have inspired the development of large genomic DNA language models (DNALMs). These models aim to learn gen...
arxiv.org
December 11, 2024 at 2:30 AM
Reposted by Soumya Kundu
What cell types drive congenital heart defects (CHD)?

Some new answers in our latest preprint, where we explored:
1) Key cell types contributing to CHD genetics
2) Impact of noncoding variants on CHD risk

https://www.medrxiv.org/content/10.1101/2024.11.20.24317557v1
November 25, 2024 at 8:08 PM
Reposted by Soumya Kundu
Check out our latest work, scE2G, for mapping the target genes of enhancers from single cell data! www.biorxiv.org/content/10.1...

Amazing work led by co-first authors @mayayayas.bsky.social and @613weilin.bsky.social in a great collaboration with @jengreitz.bsky.social's lab

See 🧵 by Wei-Lin ⬇️
November 25, 2024 at 9:22 AM
Reposted by Soumya Kundu
Super excited to share our review on genomic deep learning models for non-coding variant effect prediction, with Ayesha Bajwa and Nilah Ioannidis. We’d like this review to be a useful resource, and welcome any feedback, comments, or questions! 1/4

arxiv.org/abs/2411.11158
Leveraging genomic deep learning models for non-coding variant effect prediction
The majority of genetic variants identified in genome-wide association studies of complex traits are non-coding, and characterizing their function remains an important challenge in human genetics. Gen...
arxiv.org
November 20, 2024 at 1:31 AM