Niklas Kempynck
@niklaskemp.bsky.social
PhD Student at the Stein Aerts Lab of Computational Biology. Studying brain genomics
Reposted by Niklas Kempynck
steinaerts.bsky.social
We have two open positions for an ML and an LLM engineer to launch a machine learning expertise unit in our center @vibai.bsky.social, see vib.ai/en/opportuni...
vib.ai
Reposted by Niklas Kempynck
scverse.bsky.social
We will have our next community meeting on Tuesday, 2025-09-16 at 18:00 CEST! Niklas Kempynck will be presenting on CREsted, a package for training enhancer models on scATAC-seq data.
(Zoom registration link and more information in thread!)
🧵
Reposted by Niklas Kempynck
steinaerts.bsky.social
One thousand candidate enhancers tested in vivo in the mouse brain! A massive resource and oh so useful as a validation set for genome-wide enhancer prediction methods. Super fun to be involved in one of the papers: ‘the prediction challenge paper’ by Nelson & Niklas et al. www.cell.com/cell-genomic...
niklaskemp.bsky.social
Make sure to also check out the other studies that are part of the larger effort on identifying and validating enhancer tools.
alleninstitute.org
In the battle against brain disease, researchers can now rely on a new arsenal of genetic tools – The Armamentarium.

Together with scientists from across the NIH BRAIN Initiative, we’ve created and published over 1000 new enhancer AAV vectors.

🧠📈
niklaskemp.bsky.social
This study was done together with Nelson Johansen and supervised by Trygve Bakken at the @alleninstitute.org. Thanks to all co-authors for the great inter-lab collaboration! Also a personal shoutout to the members of the @steinaerts.bsky.social lab for a nice team effort and to Stein for guidance.
niklaskemp.bsky.social
Check out our work on evaluating methods for predicting in vivo cell enhancer activity in the mouse cortex! Combining scATAC peak specificity with sequence-based CREsted predictions gave the best predictive performance. We hope this advances genetic tool design for cell targeting in the brain.
Evaluating methods for the prediction of cell-type-specific enhancers in the mammalian cortex
Johansen et al. report the results of a community challenge to predict functional enhancers targeting specific brain cell types. By comparing multi-omics machine learning approaches using in vivo data...
www.cell.com
Reposted by Niklas Kempynck
steinaerts.bsky.social
Very proud of two new preprints from the lab:
1) CREsted: to train sequence-to-function deep learning models on scATAC-seq atlases, and use them to decipher enhancer logic and design synthetic enhancers. This has been a wonderful lab-wide collaborative effort. www.biorxiv.org/content/10.1...
CREsted: modeling genomic and synthetic cell type-specific enhancers across tissues and species
Sequence-based deep learning models have become the state of the art for the analysis of the genomic regulatory code. Particularly for transcriptional enhancers, deep learning models excel at decipher...
www.biorxiv.org
niklaskemp.bsky.social
Also check out Hannah’s thread on our latest preprint on HyDrop v2, an open-source platform for scATAC-sequencing, and a great, cost-efficient way of generating data for S2F models. 🙌
hannahdckmnkn.bsky.social
Our new preprint is out! We optimized our open-source platform, HyDrop (v2), for scATAC sequencing and generated new atlases for the mouse cortex and Drosophila embryo with 607k cells. Now, we can train sequence-to-function models on data generated with HyDrop v2!
www.biorxiv.org/content/10.1...
Data collected with the new sequencing platform HyDrop v2 is shown. First, a schematic overview of the bead batches of the microfluidic beads is followed by a tSNE and a barplot showing the costs in comparison to 10x Genomics. 
Then, a track of mouse data (cortex) is shown together with nucleotide contribution scores in the FIRE enhancer in microglia. Here, the HyDrop and 10x based models show the same contributions. 
On the right, the Drosophila embryo collection is explained; in the paper, HyDrop v2 and 10x data are compared to sciATAC data. Then, a nucleotide contribution score is also shown, where HyDrop v2 and 10x models show the same contribution, just as in mouse.
niklaskemp.bsky.social
CREsted is available at github.com/aertslab/CRE.... Analysis notebooks can be found at github.com/aertslab/CRE.... All models developed for this preprint and in previous work are available in CREsted through crested.get_model(). We look forward to your feedback!
niklaskemp.bsky.social
This was a big collaborative effort, together with @seppedewinter.bsky.social, and with great contributions from @casblaauw.bsky.social, Vasilis and many others. A special shoutout to @lukasmahieu.bsky.social who professionalized the package, and to @steinaerts.bsky.social for supervising.
niklaskemp.bsky.social
Finally, we train a model on a full-development zebrafish scATAC-seq atlas, and use it to design and in vivo validate cell type- and timepoint-specific enhancers with a high success rate. We also attempt to modulate reporter strength over two cell types.
niklaskemp.bsky.social
Using a new functionality in CREsted, we explore Borzoi fine-tuning on mouse motor cortex scATAC-seq data. We show that fine-tuned models and smaller models trained from scratch have near-identical performance.
niklaskemp.bsky.social
We also study enhancer codes in human cancer cell lines and glioma biopsies and find that the enhancer codes of Mesenchymal-like glioblastoma and melanoma states are more similar to each other than to glioblastoma biopsy data.
niklaskemp.bsky.social
Next, we validate CREsted-identified motif instances from a human PBMC model with ChIP-seq data. We further show that gene locus predictions can be used to simulate the effect of TF degradation on chromatin accessibility.
niklaskemp.bsky.social
We use the mouse cortex model to highlight CREsted’s gene locus prediction capabilities, both in unseen chromosomes and across species. This presents a powerful tool for potentially annotating genomes across species at high resolution.
niklaskemp.bsky.social
We first demonstrate CREsted’s functionality by providing a complete data-driven analysis of mouse motor cortex enhancer codes across cell types. Through matched scRNA-seq data, we link motifs to likely TF candidates.
niklaskemp.bsky.social
CREsted starts from the outputs of established scATAC preprocessing pipelines, and trains sequence-to-function models on chromatin accessibility per cell type. It provides complete motif analysis tools to infer cell type-specific enhancer codes and holds a comprehensive enhancer design toolbox.
niklaskemp.bsky.social
We released our preprint on the CREsted package. CREsted allows for complete modeling of cell type-specific enhancer codes from scATAC-seq data. We demonstrate CREsted’s robust functionality in various species and tissues, and in vivo validate our findings: www.biorxiv.org/content/10.1...
Reposted by Niklas Kempynck
blancalorente.bsky.social
Very excited to share our new preprint together with @daniedaaboul.bsky.social, where we studied the gene regulatory code that hippocampal granule cells (GCs) use during synapse formation (1/n)
biorxiv-neursci.bsky.social
A dynamic gene regulatory code drives synaptic development of hippocampal granule cells https://www.biorxiv.org/content/10.1101/2025.03.27.645686v1
Reposted by Niklas Kempynck
kaessmannlab.bsky.social
How does gene regulation shape brain evolution? Our new preprint dives into this question in the context of mammalian cerebellum development! rb.gy/dbcxjz
Led by @ioansarr.bsky.social, @marisepp.bsky.social and @tyamadat.bsky.social, in collaboration with @steinaerts.bsky.social
Reposted by Niklas Kempynck
asapresearch.parkinsonsroadmap.org
The latest Discover ASAP episode dives into "Cell Type Directed Design of Synthetic Enhancers," a study published in Nature by CRN Team Voet. They discuss how machine learning enables precise enhancer design for targeted gene expression 🧬

Watch: www.youtube.com/watch?v=Qcms...
Reposted by Niklas Kempynck
steinaerts.bsky.social
This has been a fantastic adventure - to capture the genomic regulatory code underlying brain cell types (using deep learning models trained on chromatin accessibility), and then use these models to compare cell types between the bird and mammalian brain
niklaskemp.bsky.social
Just very happy to have our paper out today! A big thanks to all our co-authors, and to Nikolai and @steinaerts.bsky.social for the teamwork over the past years. If you are interested in using our models for cross-species enhancer studies, check out crested.readthedocs.io/en/stable/mo... 🙂
vibai.bsky.social
In a new study, Nikolai Hecker, Niklas Kempynck et al. in the team of @steinaerts.bsky.social explore 300 million years of brain evolution through the lens of enhancer codes.
www.science.org/doi/10.1126/...