Maarten van Smeden
@maartenvsmeden.bsky.social
10K followers 480 following 290 posts
statistician • associate prof • team lead health data science and head methods research program at julius center • director ai methods lab, umc utrecht, netherlands • views and opinions my own
maartenvsmeden.bsky.social
Kind reminder: data-driven variable selection (e.g. forward/stepwise/univariable screening) makes things *worse* for most analytical goals
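A minimal simulation sketch of why (my own illustration in Python, not from the post): with pure-noise predictors, univariable screening at p < 0.05 still "finds" predictors, and their fitted effects then look real and sizable even though every true effect is zero.

```python
# Toy simulation: univariable screening on pure noise still selects predictors,
# and the coefficients of whatever gets selected are exaggerated (all true effects are 0).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, p, n_sim = 100, 10, 500
n_any_selected, selected_abs_coef = 0, []

for _ in range(n_sim):
    X = rng.standard_normal((n, p))
    y = rng.standard_normal(n)  # outcome is pure noise: every true coefficient is 0
    # univariable screening: keep predictors whose marginal correlation test has p < 0.05
    keep = [j for j in range(p) if stats.pearsonr(X[:, j], y)[1] < 0.05]
    if keep:
        n_any_selected += 1
        Xk = np.column_stack([np.ones(n), X[:, keep]])
        beta = np.linalg.lstsq(Xk, y, rcond=None)[0]  # OLS on the "selected" model
        selected_abs_coef.extend(np.abs(beta[1:]))

print(f"simulations where screening selected >=1 noise predictor: {n_any_selected / n_sim:.0%}")
print(f"mean |coefficient| among selected noise predictors: {np.mean(selected_abs_coef):.2f}")
```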
Reposted by Maarten van Smeden
statsepi.bsky.social
Interpretable "AI" is just a distraction from safe and useful "AI"
maartenvsmeden.bsky.social
This is right tho. Let’s therefore call them sensitivity-positive predictive value curves bsky.app/profile/laur...
lauretig.bsky.social
9. It's annoying how often the same model is "discovered" in a different field, with a completely different set of jargon
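A small sketch (my own toy numbers, assuming scikit-learn) of the jargon overlap being joked about: "precision" is just positive predictive value and "recall" is just sensitivity, so a precision-recall curve is a sensitivity-positive predictive value curve.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9])

precision, recall, thresholds = precision_recall_curve(y_true, y_score)

# recompute the point at threshold 0.6 using the epidemiology vocabulary
t = 0.6
pred = y_score >= t
tp = np.sum(pred & (y_true == 1))
fp = np.sum(pred & (y_true == 0))
fn = np.sum(~pred & (y_true == 1))
ppv = tp / (tp + fp)          # a.k.a. precision
sensitivity = tp / (tp + fn)  # a.k.a. recall

i = np.where(thresholds == t)[0][0]
print(f"scikit-learn:     precision={precision[i]:.2f}, recall={recall[i]:.2f}")
print(f"by hand at t=0.6: PPV={ppv:.2f}, sensitivity={sensitivity:.2f}")
```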
maartenvsmeden.bsky.social
No.
lauretig.bsky.social
5. You should use a precision-recall curve for a binary classifier, not an ROC curve
maartenvsmeden.bsky.social
I wonder who those people are who come here dying to know what GenAI has done with some prompt you put in
maartenvsmeden.bsky.social
If you think AI is cool, wait until you learn about regression analysis
maartenvsmeden.bsky.social
TL;DR: Explainable AI models often don't do a good job explaining. They can be very useful for description. We should be really careful when using Explainable AI in clinical decision making, and even when judging the face validity of AI models

Excellently led by @alcarriero.bsky.social
maartenvsmeden.bsky.social
NEW PREPRINT

Explainable AI refers to an extremely popular group of approaches that aim to open "black box" AI models. But what can we see when we open the black AI box? We use Galit Shmueli's framework (to describe, predict or explain) to evaluate what Explainable AI can and cannot deliver

arxiv.org/abs/2508.05753
maartenvsmeden.bsky.social
The healthcare literature is filled with "risk factors". The term makes research findings sound important by implying causality, while avoiding a direct causal claim that could easily be critiqued.
maartenvsmeden.bsky.social
And taking this analogy one step further: it gives genuine phone repair shops a bad name
maartenvsmeden.bsky.social
When forced to make a choice, my choice will be a logistic regression model over a linear probability model 103% of the time
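A minimal sketch (my own toy data, assuming scikit-learn) of the point behind the 103%: a linear probability model, i.e. OLS on a 0/1 outcome, can happily predict "probabilities" above 1 or below 0, while logistic regression stays inside (0, 1) by construction.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(500, 1))
p_true = 1 / (1 + np.exp(-2 * x[:, 0]))   # true risks from a logistic model
y = rng.binomial(1, p_true)

lpm = LinearRegression().fit(x, y)         # linear probability model
logit = LogisticRegression().fit(x, y)

x_new = np.array([[2.5], [-2.5]])
print("LPM predictions:     ", lpm.predict(x_new))                # can exceed 1 or drop below 0
print("Logistic predictions:", logit.predict_proba(x_new)[:, 1])  # always within (0, 1)
```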
Reposted by Maarten van Smeden
timpmorris.bsky.social
Post just up: Is multiple imputation making up information?

tldr: no.

Includes a cheeky simulation study to demonstrate the point.
open.substack.com/pub/tpmorris...
Cover picture with blog title & subtitle, and results graph in the background
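A rough sketch of the idea (my own toy example in Python, not the simulation from the blog post): multiple imputation does not invent information; the uncertainty about the missing values is carried into the final standard errors via Rubin's rules.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)
n, m = 1000, 20
x1 = rng.standard_normal(n)
x2 = 0.5 * x1 + rng.standard_normal(n)
y = 1.0 + 1.0 * x1 + 1.0 * x2 + rng.standard_normal(n)

# make x2 missing at random, more often when x1 is large
x2_obs = x2.copy()
x2_obs[rng.uniform(size=n) < 1 / (1 + np.exp(-x1))] = np.nan

def ols(X, y):
    """Coefficient estimates and their variances for y ~ 1 + X."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    sigma2 = resid @ resid / (len(y) - X1.shape[1])
    return beta, sigma2 * np.diag(np.linalg.inv(X1.T @ X1))

# impute m times; sample_posterior=True gives a different draw per imputation
estimates, variances = [], []
data = np.column_stack([y, x1, x2_obs])
for i in range(m):
    completed = IterativeImputer(sample_posterior=True, random_state=i).fit_transform(data)
    beta, var = ols(completed[:, 1:], completed[:, 0])
    estimates.append(beta)
    variances.append(var)

# Rubin's rules: pooled estimate, within- and between-imputation variance
estimates, variances = np.array(estimates), np.array(variances)
pooled = estimates.mean(axis=0)
W = variances.mean(axis=0)           # average within-imputation variance
B = estimates.var(axis=0, ddof=1)    # between-imputation variance
total_se = np.sqrt(W + (1 + 1 / m) * B)

print("pooled coefficients (true values 1, 1, 1):", np.round(pooled, 2))
print("pooled SEs (between-imputation variance inflates them):", np.round(total_se, 3))
```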
Reposted by Maarten van Smeden
statsepi.bsky.social
You can have all the omni-omics data in the world and the bestest algorithms, but eventually a predicted probability is produced & it should be evaluated using well-established methods, and correctly implemented in the context of medical decision making.

statsepi.substack.com/i/140315566/...
The leaky pipe of clinical prediction models, by @maartenvsmeden.bsky.social et al
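A minimal sketch (my own example, assuming scikit-learn and statsmodels) of the kind of well-established evaluation meant here: discrimination (AUC), calibration intercept and slope, and net benefit at a clinically motivated decision threshold, all computed from a vector of predicted risks.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 2000
lp_true = rng.normal(-1.0, 1.2, n)                  # true linear predictor
y = rng.binomial(1, 1 / (1 + np.exp(-lp_true)))
p_hat = 1 / (1 + np.exp(-(0.3 + 1.5 * lp_true)))    # a deliberately miscalibrated model

# discrimination
auc = roc_auc_score(y, p_hat)

# calibration: regress the outcome on the logit of the predicted risk (ideal: intercept 0, slope 1)
logit_p = np.log(p_hat / (1 - p_hat))
fit = sm.Logit(y, sm.add_constant(logit_p)).fit(disp=0)
cal_intercept, cal_slope = fit.params

# net benefit at a decision threshold pt (decision-curve analysis)
pt = 0.20
treat = p_hat >= pt
tp = np.sum(treat & (y == 1))
fp = np.sum(treat & (y == 0))
net_benefit = tp / n - fp / n * pt / (1 - pt)

print(f"AUC {auc:.2f}, calibration intercept {cal_intercept:.2f}, slope {cal_slope:.2f}, "
      f"net benefit at {pt:.0%} threshold {net_benefit:.3f}")
```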
maartenvsmeden.bsky.social
Clients: “I want to find real, meaningful clusters”
Me: “I want world peace, which is more likely to happen than what you want”
maartenvsmeden.bsky.social
Depending on which methods guru you ask, every analytical task is “essentially” a missing data problem, a causal inference problem, a Bayesian problem, a regression problem or a machine learning problem
Reposted by Maarten van Smeden
maartenvsmeden.bsky.social
In medicine they are called "risk factors" and, of course, you want all "important" risk factors in your model all the time

Unless a risk factor is not statistically significant, in which case you can drop that factor without issues
Reposted by Maarten van Smeden
richarddriley.bsky.social
New preprint led by Joao Matos & @gscollins.bsky.social

"Critical Appraisal of Fairness Metrics in Clinical Predictive AI"

- Important, rapidly growing area
- But confusion exists
- 62 fairness metrics identified so far
- Better standards & metrics needed for healthcare
arxiv.org/abs/2506.17035
maartenvsmeden.bsky.social
Also, the fact that the model with the best AUC doesn't necessarily make the best predictions is lost in such cases too
maartenvsmeden.bsky.social
Surprisingly common thing: comparisons of prediction models developed using, say, Logistic Regression, Random Forest and XGBoost, with the conclusion that XGBoost is "good" because it yields a slightly higher AUC than LR or RF on the same data

The fact that "better" doesn't always mean "good" seems lost
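A small sketch (my own illustration, assuming scikit-learn) of why the best AUC is not the whole story: AUC only depends on the ranking of the predictions, so a monotone distortion of the predicted risks leaves the AUC untouched while wrecking calibration (visible here in the Brier score).

```python
import numpy as np
from sklearn.metrics import roc_auc_score, brier_score_loss

rng = np.random.default_rng(0)
n = 5000
p_good = rng.beta(2, 5, n)          # well-calibrated risks
y = rng.binomial(1, p_good)         # outcomes generated from those risks
p_distorted = p_good ** 3           # same ranking, badly miscalibrated risks

print(f"AUC   good vs distorted: {roc_auc_score(y, p_good):.3f} vs {roc_auc_score(y, p_distorted):.3f}")
print(f"Brier good vs distorted: {brier_score_loss(y, p_good):.3f} vs {brier_score_loss(y, p_distorted):.3f}")
```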
Reposted by Maarten van Smeden
georgheinze.bsky.social
Published: the paper 'On the uses and abuses of Regression Models: a Call for Reform of Statistical Practice and Teaching' by John Carlin and Margarita Moreno-Betancur in the latest issue of Statistics in Medicine onlinelibrary.wiley.com/doi/10.1002/... (1/8)
maartenvsmeden.bsky.social
What is common knowledge in your field, but shocks outsiders?

Validated does not mean it works as intended. It means someone has evaluated it (and may have concluded it doesn’t work at all)
editoratlarge.bsky.social
What is common knowledge in your field, but shocks outsiders?

We're not clear on what peer review is, at all.
jensfoell.de
What is common knowledge in your field, but shocks outsiders?

We’re not clear on what intelligence is, at all