Clintin Davis-Stober
@clintin.bsky.social
2.4K followers 2.1K following 49 posts
Professor, quantitative psychology, decision theory, data science, mathematics, statistics, open science, modeling, weight lifting, photography, enjoyer of poetry www.davis-stober.com
Reposted by Clintin Davis-Stober
richarddmorey.bsky.social
Simonsohn has now posted a blog response to our recent paper about the poor statistical properties of the P curve. @clintin.bsky.social and I are finishing up a less-technical paper that will serve as a response. But I wanted to address a meta-issue *around* this that may clarify some things. 1/x
datacolada.bsky.social
Would p-curve work if you dropped a piano on it?
datacolada.org/129
Piano being dropped on a car in a car testing facility
Reposted by Clintin Davis-Stober
devezer.bsky.social
I love this post about science and metascience. A lot of quotables but I’ll lead with this:

“Those seeking a scientific method – one that can be written down and followed mechanically […] – betray a kind of childish impatience with a process they clearly don’t understand.”
Reposted by Clintin Davis-Stober
carlbergstrom.com
1. "'Trusting the experts is not a feature of either a science or democracy," Kennedy said."

It's literally a vital feature of both science and of representative democracy.

I've written a fair bit about trust in expertise as a vital mechanism in the collective epistemology of science.
RFK Jr. in interview with Scripps News: ‘Trusting the experts is not science’
HHS Secretary RFK Jr. sat down with Scripps News for a wide-ranging interview, discussing mRNA vaccine funding policy changes and a recent shooting at the Centers for Disease Control and Prevention.
www.scrippsnews.com
Reposted by Clintin Davis-Stober
clintin.bsky.social
Definitely something worth digging into. I’ll give it some thought
Reposted by Clintin Davis-Stober
devezer.bsky.social
here was our call for methodological standards in metaresearch four years ago. instead of getting fixated on a particular inference we'd like to make, we need to maintain scientific standards, do the hard work, respect the evidence. we can't keep jumping at self-serving solutions without question.
The case for formal methodology in scientific reform | Royal Society Open Science
Current attempts at methodological reform in sciences come in response to an overall lack of rigor in methodological and scientific practices in experimental sciences. However, most methodological ref...
share.google
Reposted by Clintin Davis-Stober
devezer.bsky.social
some of the discussion around the p-curve paper is depressing. i still see many missing the point clearly stated in the conclusion, and instead of demanding strong standards or questioning whether they're even asking good questions, they've immediately started asking for replacement methods.
clintin.bsky.social
I would add that while p-curve comprises tests that do correspond to error rates, the actual hypotheses being tested have little to do with “evidential value”
clintin.bsky.social
The test statistics being used are just simple sums; no third-moment (skew) information enters the test.
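A small numeric sketch of that point (not the published p-curve code, and assuming a Fisher-style sum of log-transformed pp-values): two sets of significant p-values with opposite sample skew can yield exactly the same test statistic.

```python
import numpy as np
from scipy import stats

def fisher_sum(ps, alpha=0.05):
    """Sum of -2*log(p/alpha) over already-significant p-values: a simple sum."""
    return -2 * np.log(np.asarray(ps) / alpha).sum()

# Two sets of significant p-values whose pp-values (p / .05) have the same
# product, hence identical sums of logs, but opposite-signed sample skew:
set_a = 0.05 * np.array([0.01, 0.9, 0.9])                       # one tiny p, two near .05
set_b = 0.05 * np.array([np.sqrt(0.009), np.sqrt(0.009), 0.9])  # same product of pp-values

print(fisher_sum(set_a), fisher_sum(set_b))   # identical statistics
print(stats.skew(set_a), stats.skew(set_b))   # negative vs. positive skew
```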
clintin.bsky.social
Would this be grounds for dismissing the remaining 54 studies as lacking value? It makes no sense. Part of the problem is that the original p-curve papers aren’t clear on what exactly is being tested. The authors claim they are tests of skew, but this is incorrect as the
clintin.bsky.social
Happy to clarify. P-curve is used to test whether a set of studies has (or lacks) “evidential value” (which is not really defined). But the actual hypotheses being tested by p-curve don’t permit this, as Richard and I show. Suppose one study WAS underpowered in a set of 55 studies -
clintin.bsky.social
Each p-curve test is just a simple sum of transformed p-values. There is a fundamental disconnect between the null hypotheses being tested by p-curve and the claims being made.
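As a minimal sketch of that sum structure, assuming the Fisher-style chi-square variant (the p-curve papers also describe a Stouffer-style sum):

```python
import math
from scipy import stats

def pcurve_right_skew_stat(p_values, alpha=0.05):
    """Fisher-style 'right skew' statistic: a plain sum over studies.

    Each significant p is rescaled to a pp-value (p / alpha, which is
    uniform under the null conditional on significance), and the statistic
    is a sum of -2*log(pp) terms. Nothing about the configuration of the
    p-values enters beyond this sum.
    """
    pp = [p / alpha for p in p_values if p < alpha]
    chi2 = sum(-2 * math.log(x) for x in pp)
    return chi2, stats.chi2.sf(chi2, df=2 * len(pp))

print(pcurve_right_skew_stat([0.001, 0.01, 0.04]))  # (statistic, combined p-value)
```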
clintin.bsky.social
What this means is that a significant result for either test only allows one to claim that “at least one” study (out of the set) doesn’t have the property being considered. Why does this happen? Because p-curve completely ignores the configuration of the p-values being considered.
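A hypothetical simulation of the “at least one” point, using the same Fisher-style sum: 54 exactly-null studies plus one very high-powered study routinely produce a significant combined result.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2025)

def study_p(delta, n=30):
    """Illustrative one-sample z-test: two-sided p-value for effect size delta."""
    z = rng.normal(delta * np.sqrt(n), 1.0)
    return 2 * stats.norm.sf(abs(z))

# 54 truly null studies plus one strongly non-null study
ps = np.array([study_p(0.0) for _ in range(54)] + [study_p(1.5)])
sig = ps[ps < 0.05]

chi2 = -2 * np.log(sig / 0.05).sum()      # the simple-sum statistic
print(stats.chi2.sf(chi2, 2 * len(sig)))  # typically tiny: one study drives it
```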
clintin.bsky.social
The test for evidential value simply examines whether the effect size is zero for all studies. The test for lack of evidential value tests whether all studies are “underpowered”, i.e., have small non-centrality parameters.
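In rough notation (my paraphrase, with θᵢ the non-centrality parameter of study i and θ₃₃ the value giving the 33% power benchmark used in the original p-curve papers), the hypotheses actually being tested look approximately like:

```latex
% "Evidential value" test: a global point null over all k studies
H_0^{\mathrm{EV}}:\; \theta_i = 0 \ \text{for all } i
\qquad \text{vs.} \qquad
H_1^{\mathrm{EV}}:\; \theta_i \neq 0 \ \text{for at least one } i

% "Lack of evidential value" test: all studies at least minimally powered
H_0^{\mathrm{lack}}:\; \theta_i \geq \theta_{33} \ \text{for all } i
\qquad \text{vs.} \qquad
H_1^{\mathrm{lack}}:\; \theta_i < \theta_{33} \ \text{for at least one } i
```

Rejection in either case licenses only the “at least one” conclusion, not a claim about the whole set.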
clintin.bsky.social
The developers of p-curve claim that p-curve can be used to make claims about the evidential value (or lack thereof) of whole sets of studies. We show that the actual hypotheses being tested do not allow for such strong conclusions.
clintin.bsky.social
The basic premise of p-curve is that the skew of a set of p-values is informative about whether QRPs are occurring. As we show, the p-curve tests have nothing to do with skew: it is trivial to create left-skewed p-values that p-curve would confidently label as right-skewed.
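A hypothetical configuration of that kind, again using the Fisher-style sum sketched above (the paper works the argument through p-curve’s actual tests): bunch most p-values just under .05, which reads as left skew, and let one extreme p-value dominate the sum.

```python
import numpy as np
from scipy import stats

# Left-skewed configuration: ten p-values bunched just under .05
# (the pattern usually read as a p-hacking signature) plus one tiny p.
ps = np.array([0.049] * 10 + [1e-12])

chi2 = -2 * np.log(ps / 0.05).sum()      # dominated entirely by the tiny p
print(stats.chi2.sf(chi2, 2 * len(ps)))  # about .0007: "right skew" declared anyway
```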
clintin.bsky.social
New paper with @richarddmorey.bsky.social now out in JASA, where we critically examine p-curve. Below is Richard’s excellent summary of the many poor statistical properties of p-curve (with link to paper). I wanted to add some conceptual issues that we also tackle in the paper.
richarddmorey.bsky.social
Paper drop, for anyone interested in #metascience, #statistics, or #metaanalysis! @clintin.bsky.social and I show in a new paper in JASA that the P-curve, a popular forensic meta-analysis method, has deeply undesirable statistical properties. www.tandfonline.com/doi/full/10.... 1/?
Cover page for the manuscript: Morey, R. D., & Davis-Stober, C. P. (2025). On the poor statistical properties of the P-curve meta-analytic procedure. Journal of the American Statistical Association, 1–19. https://doi.org/10.1080/01621459.2025.2544397 Abstract for the paper: The P-curve (Simonsohn, Nelson, & Simmons, 2014; Simonsohn, Simmons, & Nelson, 2015) is a widely-used suite of meta-analytic tests advertised for detecting problems in sets of studies. They are based on nonparametric combinations of p values (e.g., Marden, 1985) across significant (p < .05) studies and are variously claimed to detect “evidential value”, “lack of evidential value”, and “left skew” in p values. We show that these tests do not have the properties ascribed to them. Moreover, they fail basic desiderata for tests, including admissibility and monotonicity. In light of these serious problems, we recommend against the use of the P-curve tests.
Reposted by Clintin Davis-Stober
danielheck.bsky.social
New paper by my PhD student @semihaktepe.bsky.social now published 🚀

"Revisiting the effect of discrepant perceptual fluency on truth judgments" tinyurl.com/2eepue5y

--> Two experiments & a meta-analysis indicate that high visual contrast does not lead to higher truth judgments.
clintin.bsky.social
I'm so sorry this happened to you. There is no excuse for such bs.
Reposted by Clintin Davis-Stober
danielheck.bsky.social
🚀Postdoc position @unimarburg.bsky.social in the project:

"Bridging the Gap Between Verbal Psychological Theories & Formal Statistical Modeling with Large Language Models"
(funded by @volkswagenstiftung.de)

📅Start: 01.10.2025 | ⏳4 years
🔗 Apply now: uni-marburg.de/jhbCen
🔄 Thanks for sharing!
Postdoc
uni-marburg.de
Reposted by Clintin Davis-Stober
irisvanrooij.bsky.social
NEW paper! 💭🖥️

“Combining Psychology with Artificial Intelligence: What could possibly go wrong?”

— Brief review paper by @olivia.science & myself, highlighting traps to avoid when combining Psych with AI, and why this is so important. Check out our proposed way forward! 🌟💡

osf.io/preprints/ps...
Table 1
Typology of traps, how they can be avoided, and what goes wrong if not avoided. Note that all traps in a sense constitute category errors (Ryle & Tanney, 2009) and the success-to-truth inference (Guest & Martin, 2023) is an important driver in most, if not all, of the traps.
Reposted by Clintin Davis-Stober
richarddmorey.bsky.social
If you'll be at APS2025 in DC next week, I'll be talking about the terrible statistical properties of the p curve procedure in the "Current Issues in Meta-Science" session Saturday, May 24th at 3pm. This will likely be my last US conference in a very long time. A brief summary follows. 1/
Screenshot of the APS program for "Current Issues in Meta-Science", Saturday, May 24, 3:00pm. Titles and authors in the session:

* Statistical Power in the Light of Methodological Reform (Jolynn Pek)

* The Poor Statistical Properties of the P-Curve Procedures (Richard Morey)

* Consistent Methods Protect Against False Findings Produced By p Hacking (Duane Wegener)

* Accumulating Evidence across Studies (Blakeley McShane)