Lightnews — Scholar-powered news

Markus List @itisalist.bsky.social · 20d

We're getting ready for Maustag 2025 www.mdsi.tum.de/en/mdsi/late... at the @tum.de Munich Data Science Institute, where we @daisybio.de plan to show children why AI and bioinformatics are important for studying the code of life. I dare say that our first practice session went quite well :-)

6

Reposted by Markus List

Stephen Turner @stephenturner.us · 23d

Comprehensive benchmark of differential transcript usage analysis for bulk and single-cell RNA sequencing academic.oup.com/nargab/artic... 🧬🖥️🧪

3 7

Markus List @itisalist.bsky.social · 28d

And for those who enjoyed our tutorial on network medicine and drug repurposing: if you are spontaneous, consider joining us for the RExPO conference organized by @repo4eu.bsky.social repo4.eu/rexpo25/ later this month.

2 2

Markus List @itisalist.bsky.social · 28d

I enjoyed visiting the BC2 conference: cool talks, great networking, beautiful location. Thanks to the organizers at @sib.swiss! I especially appreciated today's session on startups in bioinformatics, which was insightful.

2 2 4

Markus List @itisalist.bsky.social · Sep 7

En route to Basel for the @sib.swiss #BC2 conference. We're contributing to a workshop on 🕸️ network medicine and 💊 drug repurposing, with tools developed in @repo4eu.bsky.social incl. drugst.one for which we've just released the DREAM extension doi.org/10.58647/DRU... simplifying expert annotation.

2 5

Reposted by Markus List

REPO4EU @repo4eu.bsky.social · Sep 3

#RExPO25 Speakers | S7: AI/ML in #SystemsMedicine & #DrugRepurposing

🟣 @itisalist.bsky.social & Lisa Spindler (@daisybio.de)

🟣 Jan Baumbach & Fernando Delgado Chavez (@cosybio-uhh.bsky.social)

Check the full conference agenda ⤵️
repo4.eu/rexpo25/agen...

🇪🇺 #EUfunded #DrugRepurposing

RExPO25 Session in focus: AI/ML in systems medicine and drug repurposing.

3 3

Reposted by Markus List

Gregor Sturm @grst.bsky.social · Aug 13

Our benchmark + guidelines for atlas-level differential gene expression of single cells is online:

academic.oup.com/bib/article/...

Bottom line: Use pseudobulk + DESeq2 in simple and pseudobulk + DREAM in more complex settings.

Collab w/ @leonhafner.bsky.social @itisalist.bsky.social

1 3 11
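
As an illustration of the "pseudobulk" step mentioned in the post above, here is a minimal sketch, assuming per-cell counts are stored as plain lists and that cells are grouped by sample and cell type. The function name `pseudobulk` and the toy data are hypothetical and are not taken from the benchmark; the summed profiles would then be handed to a bulk tool such as DESeq2.

```python
def pseudobulk(counts, sample_ids, cell_types):
    """Sum per-cell gene counts within each (sample, cell type) group.

    counts      -- list of per-cell count vectors (one value per gene)
    sample_ids  -- sample label for each cell
    cell_types  -- cell type label for each cell
    Returns a dict mapping (sample, cell_type) to a summed count vector.
    """
    if not (len(counts) == len(sample_ids) == len(cell_types)):
        raise ValueError("counts, sample_ids and cell_types must align")

    totals = {}
    for row, sample, ctype in zip(counts, sample_ids, cell_types):
        key = (sample, ctype)
        if key not in totals:
            totals[key] = [0] * len(row)
        acc = totals[key]
        for gene_idx, value in enumerate(row):
            acc[gene_idx] += value
    return totals


# Toy example: four cells, three genes (made-up numbers).
toy_counts = [[1, 0, 3], [2, 1, 0], [0, 4, 1], [5, 0, 0]]
toy_samples = ["s1", "s1", "s2", "s2"]
toy_types = ["T", "T", "T", "B"]
toy_result = pseudobulk(toy_counts, toy_samples, toy_types)
```

Aggregating per sample rather than per cell means each biological replicate contributes one observation per cell type, which is the unit that bulk differential expression methods expect.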

Markus List @itisalist.bsky.social · Jun 28

A flowery surprise at our @tum.de campus Freising yesterday. Congratulations to all students who celebrated their graduation.

1 1

Markus List @itisalist.bsky.social · Jun 5

Yes indeed, are you here as well? :-)

1

Markus List @itisalist.bsky.social · Jun 5

En route to visit the @cosybio-uhh.bsky.social lab in Hamburg, who are kindly organizing the latest @repo4eu.bsky.social WP2 workshop. Looking forward to discussing the refinement of our computational pipelines for drug repurposing. Hope the train will not be delayed too much...

1 1 1

Markus List @itisalist.bsky.social · Jun 3

🧬🖥️Drug response prediction is a machine learning challenge with immense potential for precision medicine. Our latest preprint introduces DrEval, a comprehensive benchmarking framework to evaluate state-of-the-art methods, uncover widespread issues, and guide the development of more robust models.

Judith Bernett @judith-bernett.bsky.social · Jun 3

🧬🖥️So excited to show you the outcome of @pascivers.bsky.social and my latest project: "From Hype to Health Check: Critical Evaluation of Drug Response Prediction Models with DrEval" doi.org/10.1101/2025.05.26.655288, published with M. Picciani, M. Wilhelm, K. Baum & @itisalist.bsky.social.
🧵1/10

Overview of the DrEval framework. Via input options, implemented state-of-the-art models can be compared against baselines of varying complexity. We address obstacles to progress in the field at each point in our pipeline: Our framework is available on PyPI and nf-core and we follow FAIReR standards for optimal reproducibility. DrEval is easily extendable as demonstrated here with a pseudocode implementation of a proteomics-based random forest. Custom viability data can be preprocessed with CurveCurator, leading to more consistent data and metrics. DrEval supports five widely used datasets with application-aware train/test splits that enable detecting weak generalization. Models are free to use provided or custom cell line– and drug features. The pipeline supports randomization-based ablation studies and performs robust hyperparameter tuning for all models. Evaluation is conducted using meaningful, bias-resistant metrics to avoid inflated results from artifacts such as Simpson’s paradox. All results are compiled into an interactive HTML report. Created in https://BioRender.com.

5 9
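
The figure description above mentions train/test splits that help detect weak generalization. One possible way to build such a split is to hold out entire groups (for example, all measurements of a cell line) so that no group appears on both sides. The sketch below is an assumption for illustration only; `group_holdout_split`, the record layout, and the toy data are hypothetical and do not reproduce the DrEval API.

```python
import random


def group_holdout_split(records, group_key, test_fraction=0.2, seed=0):
    """Split records so that every group lands entirely in train or in test.

    records       -- list of dicts, each describing one measurement
    group_key     -- dict key that identifies the group (e.g. "cell_line")
    test_fraction -- approximate share of groups placed in the test set
    seed          -- seed for the reproducible shuffle of groups
    """
    groups = sorted({record[group_key] for record in records})
    rng = random.Random(seed)
    rng.shuffle(groups)

    # Hold out at least one group so the test set is never empty.
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])

    train = [r for r in records if r[group_key] not in test_groups]
    test = [r for r in records if r[group_key] in test_groups]
    return train, test


# Toy example: 5 hypothetical cell lines x 3 hypothetical drugs.
toy_records = [
    {"cell_line": f"CL{i}", "drug": f"D{j}", "response": 0.1 * (i + j)}
    for i in range(5)
    for j in range(3)
]
train_set, test_set = group_holdout_split(toy_records, "cell_line", 0.2, seed=42)
```

With a purely random split, the same cell line can occur in both sets, and a model may score well by memorizing it; a group-wise holdout makes that shortcut visible as a drop in test performance.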

Markus List @itisalist.bsky.social · May 28

Had a great time in Innsbruck. The scenery here with the mountains in the background is always impressive, even when the weather is not so nice. Thanks @francescafinotello.bsky.social for inviting me!

1 3

Markus List @itisalist.bsky.social · May 27

For those of you who are not in Innsbruck to see me today, you might instead listen to @judith-bernett.bsky.social at the @iscb.bsky.social NetBio webinar!

🔗 Attend at ISCB Nucleus: iscb.junolive.co

📍 If you’re not an ISCB member, register for access to ISCB Nucleus: lnkd.in/gMhrKGJz

1 2

Markus List @itisalist.bsky.social · May 27

I believe the seminar is offline only. My focus is not on data privacy (also very important!) but on inflated performance estimates due to methods learning illegitimate shortcuts. If you'd like to know more, we have written a perspective article with guiding questions: www.nature.com/articles/s41...

Guiding questions to avoid data leakage in biological machine learning applications - Nature Methods

This Perspective discusses the issue of data leakage in machine learning based models and presents seven questions designed to identify and avoid the problems resulting from data leakage.

www.nature.com

1 1

Markus List @itisalist.bsky.social · May 27

Traveling to Innsbruck by invitation of @francescafinotello.bsky.social to talk about data leakage, a widespread issue in biomedical machine learning applications. I'll talk about challenges in protein-protein interaction (doi.org/10.1093/bib/...) and drug response prediction (upcoming preprint!).

Colorful liquid flowing from one bottle into another, as an illustration for (data) leakage.

1 1 5

Reposted by Markus List

Victor Javier @vjsanchez.bsky.social · Apr 30

Benchmarking algorithms for spatially variable gene identification in spatial transcriptomics 🧬🖥️ academic.oup.com/bioinformati...

Benchmarking algorithms for spatially variable gene identification in spatial transcriptomics

Abstract. Motivation: The rapid development of spatial transcriptomics has underscored the importance of identifying spatially variable genes. As a fundament

academic.oup.com

1 7

Reposted by Markus List

DaiSyBio @daisybio.de · Apr 3

Weihenstephan Bioinformatics Symposium 2025: More than 75 scientists from Bavaria and the world came together to share talks and create new synergies. It was great to host this event, for those who missed it: The next edition is planned for 2027 😉

1 2 8

Reposted by Markus List

Ana Conesa @anaconesa.bsky.social · Mar 28

I am so happy to see this manuscript finally out!!! We review and discuss all analysis steps in long reads transcriptomics. Hope the community finds this useful! Huge thanks to @carolinamonzo.bsky.social and @tianyuanliu.bsky.social for the huge work!!! @longtrec.bsky.social @hitseq.bsky.social

Nature Reviews Genetics @natrevgenet.nature.com · Mar 28

Transcriptomics in the era of long-read sequencing go.nature.com/421ZTJm #Review by @carolinamonzo.bsky.social, Tianyuan Liu & @anaconesa.bsky.social @conesalab.bsky.social @i2sysbio.bsky.social

Transcriptomics in the era of long-read sequencing - Nature Reviews Genetics

Advances in long-read sequencing are driving the implementation of these technologies for transcriptome profiling. The authors provide a comprehensive guide to long-read RNA sequencing, including expe...

go.nature.com

2 11 27

Reposted by Markus List

DaiSyBio @daisybio.de · Mar 20

Greetings from Palermo! @en-coding.bsky.social, @a-dietrich.bsky.social, @itisalist.bsky.social, Serafina Reif, Nico Trummer & Kamila Kwiecien are united here on the occasion of the MyeInfoBank COST Action: Converting Molecular Profiles of Myeloid Cells into Biomarkers for Inflammation and Cancer

2 3

Markus List @itisalist.bsky.social · Mar 3

It was a pleasure having you, Ryu! Thanks so much for your talk and visit.

1

Reposted by Markus List

DaiSyBio @daisybio.de · Feb 13

📍Welcome to our presentation round of the DaiSyBio members! Every week, you will get to know someone from our lab.
First up is @itisalist.bsky.social, who heads the group. Markus joined TUM in 2018 and became a W2 tenure track associate professor in 2023. More members are about to follow! 📍

2 8

Markus List @itisalist.bsky.social · Jan 31

By the way, we also have a lab account on bluesky now. Follow @daisybio.de to get more news about our activities. We are also on LinkedIn: www.linkedin.com/company/dais...

DaiSyBio | LinkedIn

DaiSyBio | 323 followers on LinkedIn. DaiSyBio is a research group created in December 2023 as part of the TUM School of Life Sciences. Cutting-edge expertise in biology, medicine and computer scienc...

www.linkedin.com

1 2

Reposted by Markus List

DaiSyBio @daisybio.de · Jan 30

Congrats to our PhD student Johannes Kersting for winning the ASAPbio poster prize for his contribution to the @repo4eu.bsky.social #REXPO24: blog.scienceopen.com/2025/01/joha...

Johannes Kersting Awarded an ASAPBio Poster Prize for RExPO Contribution

The ASAPBio poster competition is an initiative that champions open science by highlighting innovative research and the early sharing of results. This year’s competition emphasized transparency, colla...

blog.scienceopen.com

2 7

Markus List @itisalist.bsky.social · Jan 27

We followed up on our previous work, where we showed that predicting protein-protein interactions from sequence alone yields random performance when data leakage is accounted for. In this new preprint, we show that ESM2 embeddings raise the bar to 0.65 accuracy independent of the model architecture.

Judith Bernett @judith-bernett.bsky.social · Jan 27

🧬🖥️ Proud to share our latest update on PPI predictions – "Deep learning models for unbiased sequence-based PPI prediction plateau at an accuracy of 0.65" doi.org/10.1101/2025... by T. Reim, published with @itisalist.bsky.social @dbblumenthal.bsky.social, A. Hartebrodt, and me. What did we do? 1/15 🧵

Graphical summary of the analyses done in the publication displayed on six panels a-f. (a) We computed ESM-2 embeddings of different sizes for the proteins of our data-leakage-free PPI dataset. The per-token embeddings have variable sizes depending on the protein length, while the per-protein embeddings have a fixed size by applying dimension-wise averaging. (b) We tested two models operating on the per-protein embeddings—a baseline random forest classifier and adaptations of the previously published Richoux model. Five models operated on the per-token embeddings: a 2d-baseline, the 2d-Selfattention and 2d-Crossattention models (which expanded the 2d-baseline through a Transformer encoder), and adaptations of the published models D-SCRIPT and TUnA. (c) Hyperparameter tuning gave us insight into the influence of each tunable parameter on the classification performance. (d) No model surpassed an accuracy of 0.65. The more advanced models had similar accuracies, leading us to believe that the information content of the ESM-2 embedding has more influence than the model architecture. Per-token models did not consistently outperform per-protein models. (e) We applied various modifications to test their influence: different embedding sizes, inserting a Transformer encoder into different positions, adding spectral normalization after the linear layers, self- vs. cross-attention, and removing the padding. (f) Finally, we compared the implicitly predicted distance maps of the 2d-baseline, 2d-Selfattention, 2d-Crossattention, and D-SCRIPT-ESM-2 to real distance maps computed from PDB structures.

2
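
Panel (a) in the description above contrasts variable-size per-token embeddings with fixed-size per-protein embeddings obtained by dimension-wise averaging. A minimal sketch of that averaging step follows, assuming embeddings are plain lists of floats; the function name `mean_pool` and the toy vectors are hypothetical and not taken from the preprint's code.

```python
def mean_pool(token_embeddings):
    """Average per-token vectors dimension-wise into one per-protein vector.

    token_embeddings -- list of equal-length vectors, one per residue/token
    Returns a single vector whose length equals the embedding dimension,
    independent of how many tokens the protein has.
    """
    if not token_embeddings:
        raise ValueError("need at least one token embedding")

    dim = len(token_embeddings[0])
    if any(len(vec) != dim for vec in token_embeddings):
        raise ValueError("all token embeddings must share one dimension")

    n_tokens = len(token_embeddings)
    return [
        sum(vec[d] for vec in token_embeddings) / n_tokens
        for d in range(dim)
    ]


# Two toy "proteins" of different length, embedding dimension 2.
short_protein = [[1.0, 2.0], [3.0, 4.0]]
long_protein = [[0.0, 0.0], [2.0, 4.0], [4.0, 8.0], [2.0, 4.0]]
pooled_short = mean_pool(short_protein)
pooled_long = mean_pool(long_protein)
```

Both pooled vectors have the same length even though the proteins differ in length, which is what allows models with a fixed input size, such as a random forest, to consume them.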

Markus List @itisalist.bsky.social · Jan 23

Hey young investigators, use this opportunity to learn about the COST action MyeInfoBank and listen to a nice talk by our fabulous @a-dietrich.bsky.social

DaiSyBio @daisybio.de · Jan 23

📣 Shoutout to all young molecular biologists, bioinformaticians and immunobiologists: Don't miss our lab member Alexander Dietrich giving a talk on cell-type deconvolution next THU at 2pm 🕑
It's free, it's virtual, it's new! Register here ⤵️
bit.ly/3VI6Dtx

1 1