Vitalii Kleshchevnikov, PhD
@vitaliikl.bsky.social
Researcher @bayraktar_lab @teichlab @steglelab.bsky.social @sangerinstitute.bsky.social | Using models & AI to study cells, cell circuits & brains 🧠 | #SingleCell+spatial | 🌍+🇺🇦
vitaliikl.bsky.social
How many papers are truly irreproducible, and how many show a possible causal path that often isn’t the limiting factor and is conditional on important context (e.g. mutations, genetics, environment of the experiment)?

Thinking about omnigenic model, “mechanistic bias” in medicine, personalised PRS.
Reposted by Vitalii Kleshchevnikov, PhD
labwaggoner.bsky.social
Hepatic acetyl-CoA metabolism modulates neuroinflammation and depression susceptibility via acetate @cp-cellmetabolism.bsky.social
www.cell.com/cell-metabol...
vitaliikl.bsky.social
Congratulations on getting this out🎉!

It is really cool that it seems to be possible to learn important non-coding variation from cell type agnostic data - i.e. can you find sequences that are CREs/enhancers in at least some cell type in the body without collecting data from all cell types?
vitaliikl.bsky.social
Very impressive work and interesting modelling ideas 💡:
yun-s-song.bsky.social
We are excited to share GPN-Star, a cost-effective, biologically grounded genomic language modeling framework that achieves state-of-the-art performance across a wide range of variant effect prediction tasks relevant to human genetics.
www.biorxiv.org/content/10.1...
(1/n)
Reposted by Vitalii Kleshchevnikov, PhD
yun-s-song.bsky.social
By training GPN-Star on vertebrate, mammal, and primate alignments, we reveal task-dependent advantages of modeling deeper versus more recent evolution. These findings offer new biological insights and practical guidance for developing future gLMs and evolutionary models.
(6/n)
Reposted by Vitalii Kleshchevnikov, PhD
anshulkundaje.bsky.social
This is truly an incredible breakthrough IMO. Really exemplifies what you get when deep domain expertise (popgen/evolution/disease genetics in this case) fuses with cleverly crafted ML. What u get r sleek, well thought out architectures that absolutely destroy the behemoths. Wow!! 1/
vitaliikl.bsky.social
Ideally, of course, noise cancelling headphones 🎧 would be part of the starter package 📦, like the work laptop, email and Slack.
vitaliikl.bsky.social
£40 for earplugs sounds ridiculous but they are infinitely reusable and very comfortable.

£200-300 for headphones sounds pretty expensive but lost energy, effort and productivity are ultimately much more expensive.

If you start your PhD next month get both ASAP in the first 1-3 months.
vitaliikl.bsky.social
I also recommend Loop earplugs, especially for sleep, but they are less functional for work because you can still hear people through them - whereas noise cancelling headphones with music can pretty much mute everything else.
vitaliikl.bsky.social
If you feel like you need noise cancelling headphones 🎧 - you probably actually need to buy them right now. Don’t delay this decision and lose time trying to operate without them.

Noise cancelling headphones are like your work email+slack - you may be able to work without them but it’s pretty hard.
vitaliikl.bsky.social
Has anyone ever seen a house/flat in the UK/Cambridge that had an HRV or ERV system?

If you can’t have windows open at night due to security concerns how are you supposed to ventilate?

I tried closing the windows, and even with the fan on max, CO2 levels never got below 850 ppm (bad for sleep quality).
Reposted by Vitalii Kleshchevnikov, PhD
sebastiancachero.bsky.social
With unprecedented 38× coverage, our atlas resolves single neuron clusters and shows that separate waves of neurogenesis use different modes of molecular identity encoding: discrete in early born and continuous in late born. 3/8
Reposted by Vitalii Kleshchevnikov, PhD
jefferis.bsky.social
Neuronal diversity is written in transcriptional codes 🧬. But what is the logic of these codes that define cell types and wiring patterns?
To find out we built a #scRNAseq developmental atlas of the Drosophila nerve cord and linked it to the #connectome 🪰🧠
#preprint thread ⬇️1/8
ALT text: A UMAP representation of a single cell RNAseq dataset from the Drosophila ventral nerve cord as well as images of the Drosophila nerve cord connectome and different stages of fly development.
vitaliikl.bsky.social
Softly equating the two tasks leads to a number of problems:

* overestimated complexity of perturbation prediction problem
* overlooking other tasks needed for reversing disease state
* overoptimistic assessment of success for reversing disease state based on success of perturbation prediction
vitaliikl.bsky.social
That task requires substantial additional input into the models and deeper considerations of assumptions (eg whether mechanisms used to drive change are reversible) than a more narrowly specified task of perturbation prediction.
vitaliikl.bsky.social
An example of using biological terms at the wrong level of abstraction is perturbation prediction, where a paper by @const-ae.bsky.social recently showed that DL doesn’t outperform linear baselines. The real task is predicting how to reverse disease-associated decisions to switch specific genes on/off.
vitaliikl.bsky.social
Ended up coming back to these points to articulate the state of the GRN field in the paper/fellowship introduction. Introductions need to be specific about the task specification and the model-task mismatch in existing work. The task specification needs to match biology terms at the right level of abstraction.
vitaliikl.bsky.social
It’s quite hard to systematise where models can go wrong but here is an attempt:

1. Wrong math
2. Wrong mapping of math to code (or vice versa)
3. Correct math and code but wrong interpretation given in text: code solves one task but paper presents it as another task
4. Poorly understood tasks
5. …
vitaliikl.bsky.social
This phrase was coined in 1976 when the models were a lot simpler than the models published now.

Linear mixed models and PCA can be described this way because as general methods they are wrong in less complex ways.

Specialised models can be wrong in quite misleading ways as well as useless.
Reposted by Vitalii Kleshchevnikov, PhD
saezlab.bsky.social
👀 we are involved in projects 1 & 6 on the list:

1️⃣ Multicellular molecular characterisation of inflammatory bowel disease
6️⃣ Designing Efficient Single-Cell Perturbation Experiments with Lab-in-the-Loop AI Agents

📅 Application deadline is 30th September ❗
ebi.embl.org
Recruitment is now open for the EMBL-EBI–Sanger Postdoctoral Programme.

ESPOD builds on the collaborative relationship between EMBL-EBI and the @sangerinstitute.bsky.social, offering projects that combine experimental and computational approaches.

www.ebi.ac.uk/research/pos...

🧬🖥️🔬#postdocjob
ESPOD, the EMBL-EBI/Sanger postdoctoral fellowship. Applications deadline 30 September. Image credit: Karen Arnott / EMBL-EBI
Reposted by Vitalii Kleshchevnikov, PhD
ebi.embl.org
Welcome @timcoorens.bsky.social 🇳🇱, our new Research Group Leader.

Find out how Tim’s group is exploring using large-scale single-cell and spatial data to trace cell lineages, understand cancer origins, and uncover how mutations drive disease.

www.ebi.ac.uk/about/news/p...
🖥️🧬
Welcome: Tim Coorens
EMBL-EBI’s newest Research Group Leader is investigating how somatic mutations reveal the hidden histories of human cells.
www.ebi.ac.uk
vitaliikl.bsky.social
Just saw this talk yesterday youtu.be/fHWFF_pnqDk?...

Looks like it’s exactly what they recommend - only use it when you can verify easily and also in parts of the codebase that are not likely to be used elsewhere. My interpretation - fine for results visualisation plots but not model code.
Vibe coding in prod
YouTube video by Anthropic
youtu.be
vitaliikl.bsky.social
Posting what you think about political issues on these public platforms is not a good idea. Too much risk. Maybe it’s fine for some people but not for everyone.

So you see 10-20% of science and can only say extremely general things about politics. Not very productive use of time 🕰️.
vitaliikl.bsky.social
I also lived in Ukraine, including during 2013-2015. It’s not productive for 90% of what you see to be about current political events - it increases stress levels and only reduces the ability to solve problems that are in your power to solve.

It should be an option to train algorithms to show a more balanced %.
vitaliikl.bsky.social
OFC it’s important for those discussing it. But say you were invited to a daily meeting about distal regulation, and then 45/60 min of every meeting was spent on US politics. It’s not helpful to see 45 minutes per day of US politics unless you can do something about it (i.e. it feeds into your decisions).