Jamie Cummins
@jamiecummins.bsky.social
Currently a visiting researcher at Uni of Oxford. Normally at Uni of Bern. Meta-scientist building tools to help other scientists. NLP, simulation, & LLMs. Creator and developer of RegCheck (https://regcheck.app). 1/4 of @error.reviews. 🇮🇪
Pinned
jamiecummins.bsky.social
Introducing RegCheck: a tool which uses Large Language Models to automatically compare preregistered protocols with their corresponding published papers and highlights deviations.

@malte.the100.ci @ianhussey.bsky.social @ruben.the100.ci @bjoernhommel.bsky.social

regcheck.app
RegCheck.app
RegCheck is an AI tool to compare preregistrations with papers instantly.
jamiecummins.bsky.social
thanks for the mention! 😊
Reposted by Jamie Cummins
cghlewis.bsky.social
Issue 16 of RDM Weekly is out! 📬

It includes:
- Data is Not Available Upon Request @ianhussey.mmmdata.io
- AI Generated Participants in Social Science @jamiecummins.bsky.social @science.org
- Why’s it Hard to Teach Data Cleaning? @randyau.com
and more!

rdmweekly.substack.com/p/rdm-weekly...
RDM Weekly - Issue 016
A weekly roundup of Research Data Management resources.
rdmweekly.substack.com
Reposted by Jamie Cummins
epopppp.bsky.social
Interesting article/paper.

I'm much less anti-AI than a lot of people on my feed. But pretty skeptical it can simulate human behavior effectively for social scientific purposes -- at least in cases where variation among humans, rather than acting like an average human, is what's important.
AI-generated ‘participants’ can lead social science experiments astray, study finds
Data produced by “silicon samples” depends on researchers’ exact choice of models, prompts, and settings
www.science.org
jamiecummins.bsky.social
WCL winners, they’ll never sing that 😉
Reposted by Jamie Cummins
ianhussey.mmmdata.io
My article "Data is not available upon request" was published in Meta-Psychology. Very happy to see this out!
open.lnu.se/index.php/me...
LnuOpen | Meta-Psychology
open.lnu.se
jamiecummins.bsky.social
I’ve seen @malte.the100.ci recently using one that looked very cool
jamiecummins.bsky.social
OMG I can’t wait to listen!
Reposted by Jamie Cummins
scientificdiscovery.dev
New episode of HARD DRUGS!

AlphaFold, ProteinMPNN & other AI tools are transforming biology and drug design.

But how do they work? What can’t they do? And can we use them to make a vaccine against Strep A for the very first time?

In this episode, Jacob and I talk about hacking proteins with AI.
Hacking proteins with AI
open.spotify.com
Reposted by Jamie Cummins
bpaassen.bsky.social
@cathleenogrady.bsky.social has just published the story "AI-generated ‘participants’ can lead social science experiments astray, study finds" for Science. It is, once more, a reason to be careful when relying on LLM-generated data in empirical research. www.science.org/content/arti...
jamiecummins.bsky.social
Looking forward to reading this, and I’m glad you’ve written it!
Reposted by Jamie Cummins
cchapman.bsky.social
Excellent 🧵 about LLM synthetic data (silicon samples etc) and why they don't solve any particular problem in human research.

FWIW, in addition to results and considerations like these, I've argued elsewhere that the entire question is ill-formed: quantuxblog.com/synthetic-su...
jamiecummins.bsky.social
There isn't really a fixed term tbh, people use a few different ones depending on field/domain/preference. Silicon samples seems to be the most common but there are a bunch of others, like synthetic samples/synthetic participants/etc.
jamiecummins.bsky.social
Clearly I missed my true career-calling as a diplomat lol
jamiecummins.bsky.social
OMG. Did not catch this one during my lit review. Wow.
jamiecummins.bsky.social
Starting to feel eerily like Severance....
jamiecummins.bsky.social
that should have been my full abstract!
Reposted by Jamie Cummins
lorak.bsky.social
👀 studying real humans better for understanding humans than not
jamiecummins.bsky.social
@science.org just dropped a story covering this preprint! Check it out below, and thanks to @cathleenogrady.bsky.social for the great write-up! www.science.org/content/arti...
jamiecummins.bsky.social
Yeah this paper was hugely inspirational for me!
Reposted by Jamie Cummins
statsepi.bsky.social
Science is grounded in observation. Measurement is a tool for observation. Measurements should be evaluated for validity and reliability/uncertainty. Scientists who use measurements without understanding their properties are not really scientists at all.
jamiecummins.bsky.social
Can large language models stand in for human participants?
Many social scientists seem to think so, and are already using "silicon samples" in research.

One problem: depending on the analytic decisions made, you can basically get these samples to show any effect you want.

THREAD 🧵
The threat of analytic flexibility in using large language models to simulate human data: A call to attention
Social scientists are now using large language models to create "silicon samples" - synthetic datasets intended to stand in for human respondents, aimed at revolutionising human subjects research. How...
arxiv.org
Reposted by Jamie Cummins
dingdingpeng.the100.ci
A lot of psych is already conducted with online convenience samples & ppl are probably excited about silicon samples bc it would allow them to crank out more studies for even less 💸

How about we reconsider the idea that sciencey science involves collecting own data.
www.science.org/content/arti...
AI-generated ‘participants’ can lead social science experiments astray, study finds
Data produced by “silicon samples” depends on researchers’ exact choice of models, prompts, and settings
www.science.org
jamiecummins.bsky.social
Forget running DOOM on your calculator; someone created a 5 million parameter language model in Minecraft. www.youtube.com/watch?v=VaeI...
I built ChatGPT with Minecraft redstone!
YouTube video by sammyuri
www.youtube.com