Luke Sanford
@lcsanford.bsky.social
7.1K followers 2.5K following 110 posts
Assistant Prof. at Yale School of the Environment. Political economy of climate and environment, land use change, remote sensing, causal ML. https://sanford-lab.github.io/
Reposted by Luke Sanford
didacqueralt.bsky.social
You can find more details at www.journals.uchicago.edu/doi/10.1086/... Thanks to all who helped along the way!
lcsanford.bsky.social
Omg I did this in graduate school and it was the worst.
lcsanford.bsky.social
Met @annaleen.bsky.social and got to spend some good time talking science and spec-fic and they were even more amazing and brilliant than I expected. They had thought so deeply about such interesting and important things and knew how to say them in exactly the right way.
ericmgarcia.bsky.social
Screw "never meeting your heroes." What were some times you met your heroes and they were super cool?
Reposted by Luke Sanford
drewstommes.bsky.social
If your research involves RD designs, check out this important new working paper from Ghosh, Imbens, and Wager: "PLRD: Partially Linear Regression Discontinuity Inference" arxiv.org/pdf/2503.09907
Reposted by Luke Sanford
vickyharp.com
Watching Jurassic Park, where a computer nerd with a debt problem and delusions of grandeur tears down all the safety systems, with no understanding of the consequences, so he can better facilitate his planned espionage and theft.
Reposted by Luke Sanford
raskin.house.gov
If Vladimir Putin had a plan to foul our air and water, wreck public health and drive America over the cliff of irreversible lethal climate change, it would look exactly like Lee Zeldin’s plan. This is a plan for self-inflicted environmental disaster.
www.theguardian.com/us-news/2025...
Trump’s environmental rule-shredding will put lives at risk, ex-EPA heads say
Former agency leaders, including two Republicans, say rollbacks by Lee Zeldin could cause ‘severe harms’
www.theguardian.com
lcsanford.bsky.social
Here's what that image was supposed to look like:
lcsanford.bsky.social
We went for roads since it's easy to see how that measurement error could arise. Often we have no idea why RS + ML errors occur.
ALT: a satellite scanning trees on a hillside vs trees on flat ground, observing more trees on the hill than on the flat ground
lcsanford.bsky.social
@bstewart.bsky.social and co-authors explore the same issue in text measurement models like LLMs and find something similar--even small measurement errors can lead to large biases in downstream causal tasks when they aren't orthogonal to treatment
lcsanford.bsky.social
Imagine you run a land tenure reform RCT where the DV is tree cover. It turns out your treatment also causes more irrigated ag, which is misclassified as tree cover more often than rainfed ag (year-round greenness). The estimated treatment effect will be > the true treatment effect.
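A rough sketch of that story in code (the probabilities and misclassification rate are made up purely for illustration, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical RCT: random treatment with no true effect on tree cover.
treat = rng.binomial(1, 0.5, n)
true_tree = rng.binomial(1, 0.30, n)

# Treatment also increases irrigated ag, and irrigated ag (year-round greenness)
# is misclassified as tree cover more often than rainfed ag.
irrigated = rng.binomial(1, 0.10 + 0.20 * treat, n)
misclassified = (irrigated == 1) & (rng.random(n) < 0.30)
measured_tree = np.where(misclassified, 1, true_tree)

# Difference-in-means "treatment effects" with true vs. measured outcomes.
true_effect = true_tree[treat == 1].mean() - true_tree[treat == 0].mean()
est_effect = measured_tree[treat == 1].mean() - measured_tree[treat == 0].mean()
print(f"true effect ~ {true_effect:.3f}, estimated effect ~ {est_effect:.3f}")
# The estimated effect exceeds the true effect (~0) because the measurement
# error is correlated with treatment.
```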
lcsanford.bsky.social
While that's our running example for the paper, there's definitely a broader issue here. We think assuming no correlation between measurement error and treatment is akin to the selection-on-observables assumption we usually require extraordinary evidence to believe. A couple of examples below:
lcsanford.bsky.social
8/9
Reach out if you want to debias some measurements in a particular application!
lcsanford.bsky.social
7/9
It’s easy to plug in any causal variable that might bias your ML-driven proxy. The adversary directly leverages your labeled data—so if you’re building custom measurement models with large-scale images (or text), you just tack on the adversary, retrain, and your bias vanishes.
ALT: a cartoon of spongebob giving the thumbs up with the words too easy below him
media.tenor.com
lcsanford.bsky.social
6/9
We then use labeled forest cover data from high-resolution imagery. When comparing the ML predictions to ground-truth labels, a naive model underestimates forest cover near roads. Our adversarial model, by contrast, recovers unbiased estimates, giving more reliable coefficients.
lcsanford.bsky.social
5/9
We induce measurement error bias in a simulation of the effect of roads on forest cover. We show that a naive model yields biased estimates of this relationship, while an adversarial model gets it right.
lcsanford.bsky.social
4/9
We also introduce a simple bias test: regress the ML prediction errors on your independent variable. If the coefficient is nonzero, you have measurement error bias. If you run that test while gathering ground-truth data, you can estimate how many labeled observations you’ll need to reject a target amount of bias.
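A sketch of that diagnostic regression on a labeled validation sample (the data below are synthetic stand-ins and the variable names are illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500

# Stand-ins for a labeled validation sample: x is the independent variable
# (e.g., distance to road), y_true the hand labels, y_pred the ML predictions.
x = rng.uniform(0, 10, n)
y_true = rng.binomial(1, 0.4, n).astype(float)
y_pred = np.clip(y_true + 0.02 * x + rng.normal(0, 0.1, n), 0, 1)  # error grows with x

# Bias test: regress prediction errors on the independent variable.
errors = y_pred - y_true
bias_test = sm.OLS(errors, sm.add_constant(x)).fit()
print(bias_test.params, bias_test.pvalues)
# A slope distinguishable from zero means the measurement error is correlated
# with x, so downstream regressions of y_pred on x will be biased.
```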
lcsanford.bsky.social
3/9
Here’s how: a primary model predicts the outcome, while an adversarial model tries to predict the treatment using the prediction errors. As the adversary learns how to predict treatment, the primary model learns to make predictions where the errors contain no information about the treatment.
ALT: algorithm for an adversarial debiasing model, including the primary model, the adversarial model, and the estimation model
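A stripped-down sketch of that training loop (a generic adversarial-debiasing setup with placeholder data, architectures, and loss weight; not the paper's actual implementation):

```python
import torch
import torch.nn as nn

# Synthetic placeholders: X are image-derived features, y are ground-truth labels
# for the outcome, and treatment is the downstream independent variable of interest.
n, d = 2048, 16
X = torch.randn(n, d)
treatment = X[:, 1:2] + 0.1 * torch.randn(n, 1)
y = X[:, :1] + 0.3 * X[:, 1:2] ** 2 + 0.1 * torch.randn(n, 1)

primary = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, 1))
adversary = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))

opt_p = torch.optim.Adam(primary.parameters(), lr=1e-3)
opt_a = torch.optim.Adam(adversary.parameters(), lr=1e-3)
mse = nn.MSELoss()
lam = 1.0  # weight on the adversarial penalty

for step in range(2000):
    # 1) Adversary: learn to predict treatment from the primary model's errors.
    with torch.no_grad():
        errors = primary(X) - y
    a_loss = mse(adversary(errors), treatment)
    opt_a.zero_grad(); a_loss.backward(); opt_a.step()

    # 2) Primary: fit the outcome while making its errors uninformative
    #    about treatment (maximize the adversary's loss).
    pred = primary(X)
    errors = pred - y
    p_loss = mse(pred, y) - lam * mse(adversary(errors), treatment)
    opt_p.zero_grad(); p_loss.backward(); opt_p.step()
```

After training, the debiased predictions from the primary model would feed the downstream (estimation) regression as usual.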
lcsanford.bsky.social
2/9
We are inspired by the algorithmic fairness literature. There, adversarial models force the main model to have balanced prediction errors across the distribution of a protected attribute (e.g. race). We adapt that approach, but instead of race, our “protected attribute” is the independent variable of interest.