Lightnews — Scholar-powered news

Joe Ornstein @joeornstein.bsky.social · Aug 29

🚨 New R package and paper! 🚨

fuzzylink is a method for merging datasets with non-exact matches on key variables. The paper walks through several useful political science applications: linking voter files, campaign contribution data, and even multilingual records. The package is available on CRAN.

Cambridge University Press Political Science & IR @cambup-polsci.cambridge.org · Aug 28

#OpenAccess from @polanalysis.bsky.social -

Probabilistic Record Linkage Using Pretrained Text Embeddings - cup.org/41WR58i

- @joeornstein.bsky.social

#FirstView
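
To make "merging datasets with non-exact matches" concrete, here is a minimal sketch of similarity-based key matching. It is not fuzzylink's method or API: the paper title refers to pretrained text embeddings, whereas this sketch substitutes character-trigram counts as a toy stand-in so it runs with no model or network access. The function names, the example records, and the 0.5 threshold are all assumptions made for illustration.

```python
import math
from collections import Counter


def trigram_vector(text):
    """Toy stand-in for a pretrained text embedding: character trigram counts."""
    padded = f"  {text.lower().strip()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def fuzzy_merge(left, right, threshold=0.5):
    """Map each key in `left` to its most similar key in `right`.

    Keys whose best similarity falls below `threshold` (an assumed,
    illustrative cutoff) are mapped to None.
    """
    right_vectors = {key: trigram_vector(key) for key in right}
    matches = {}
    for key in left:
        vec = trigram_vector(key)
        best_key, best_score = None, 0.0
        for candidate, cand_vec in right_vectors.items():
            score = cosine_similarity(vec, cand_vec)
            if score > best_score:
                best_key, best_score = candidate, score
        matches[key] = best_key if best_score >= threshold else None
    return matches


# Hypothetical records, invented for this example.
matches = fuzzy_merge(["Jon Smith", "Xyz Qwv"], ["John Smith", "Maria Garcia"])
```

Swapping `trigram_vector` for a real embedding model would let the same loop catch matches that share meaning but not spelling, such as nicknames or multilingual records, which string overlap alone tends to miss.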

Reposted by Joe Ornstein

Andrew Heiss @andrew.heiss.phd · Mar 20

I’ve long used FiveThirtyEight’s interactive “Hack Your Way To Scientific Glory” to illustrate the idea of p-hacking when I teach statistics. But ABC/Disney killed the site earlier this month :(

So I made my own with #rstats and Observable and #QuartoPub ! stats.andrewheiss.com/hack-your-way/

Screenshot of the linked Quarto website, with input checkboxes to change different conditions for a regression model that predicts economic performance based on US political party, with a reported p-value

Reposted by Joe Ornstein

Political Science Research and Methods @psrm.bsky.social · Jan 20

🦜Do you want to train your stochastic parrot?

➡️ @joeornstein.bsky.social @enblasingame.bsky.social @jaketruscott.bsky.social share best practices for using large language models (LLMs) in social science measurement tasks and processing large text-as-data projects www.cambridge.org/core/journal...

Joe Ornstein @joeornstein.bsky.social · Jan 14

3. 🦜R package alert! 🦜

With promptr, R users can easily format and complete few-shot LLM prompts for document labeling and scaling tasks.

cran.r-project.org/web/packages...

promptr: Format and Complete Few-Shot LLM Prompts

Format and submit few-shot prompts to OpenAI's Large Language Models (LLMs). Designed to be particularly useful for text classification problems in the social sciences. Methods are described in Ornste...

cran.r-project.org
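
For readers unfamiliar with the term, a few-shot prompt for document labeling is just a block of text: an instruction, a handful of labeled examples, and the new document with its label left blank for the model to complete. The sketch below shows that formatting step only. It is not promptr's API (promptr is an R package and its functions are not reproduced here); the function name, the `Text:`/`Label:` layout, and the example documents are assumptions made for illustration, and no request is sent to any model.

```python
def format_few_shot_prompt(instructions, examples, new_text):
    """Build a few-shot prompt string.

    `examples` is a list of (text, label) pairs. The prompt ends with an
    empty "Label:" line for the model to complete.
    """
    lines = [instructions.strip(), ""]
    for text, label in examples:
        lines.append(f"Text: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    lines.append(f"Text: {new_text}")
    lines.append("Label:")
    return "\n".join(lines)


# Hypothetical labeling task, invented for this example.
prompt = format_few_shot_prompt(
    "Classify the sentiment of each text as Positive or Negative.",
    [
        ("The committee passed the bill with broad support.", "Positive"),
        ("The proposal collapsed amid bitter infighting.", "Negative"),
    ],
    "Voters welcomed the new transit plan.",
)
print(prompt)
```

Ending the prompt on a bare `Label:` line means the model's next few tokens are the label itself, which keeps the completion short and easy to parse.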

Joe Ornstein @joeornstein.bsky.social · Jan 14

2a. If you're a social scientist using crowd-coding platforms to label documents in 2025, you're spending 1,000 times more money to ask someone else to put your text into an LLM for you.

Joe Ornstein @joeornstein.bsky.social · Jan 14

2. When we started the project in fall 2021, our analyses cost a few hundred dollars in API fees. Today, the same tasks would cost around $3. That's about 1,200 times cheaper than performing the same tasks on crowd-coding platforms.

Joe Ornstein @joeornstein.bsky.social · Jan 14

1. It's remarkable that, in 2025, GPT-3 still performs as well as, if not better than, GPT-4 and its offshoots at our document labeling and scaling tasks. RLHF is great for making chatbots, but for text-as-data tasks you're often better off with the base models.

Joe Ornstein @joeornstein.bsky.social · Jan 14

Some thoughts about this paper, on the long-awaited day of its publication:

Elise Blasingame @enblasingame.bsky.social · Jan 14

🦜New Pub Alert! 🦜
In our new article @psrm.bsky.social, @joeornstein.bsky.social @jaketruscott.bsky.social and I demonstrate how LLMs (like ChatGPT) can be used to process large text-as-data projects, like sentiment analysis, document scaling, and topic modeling. #polisky doi.org/10.1017/psrm...

Reposted by Joe Ornstein

Jake Truscott, PhD @jaketruscott.bsky.social · Jan 14

After what felt like ages, very(!) excited to finally see this work in print with @joeornstein.bsky.social & @enblasingame.bsky.social

Link: bit.ly/409THxY

How to train your stochastic parrot: large language models for political texts | Political Science Research and Methods | Cambridge Core

bit.ly

Reposted by Joe Ornstein

alex hayes @alexpghayes.com · Dec 12

there are two profs on my committee who don't respond to emails so i've started turning my subject lines into research clickbait and it's criminally effective

drake meme. top panel: subject: scheduling committee meeting. bottom panel: YOUR FAVORITE ESTIMAND MIGHT BE UNIDENTIFIED IN OUR MODEL, ACT NOW TO LEARN MORE

Joe Ornstein @joeornstein.bsky.social · Oct 5

New website just dropped! joeornstein.github.io

Remarkable how much easier it is to build a site with Quarto than my old blogdown/Hugo monstrosity.

Joseph T. Ornstein

joeornstein.github.io

Joe Ornstein @joeornstein.bsky.social · Sep 21

Cool. Everyone is here now.
