Bruno Ferman
@brunoferman.bsky.social
Professor at Sao Paulo School of Economics - FGV Econometrics/Applied Micro/affiliate @JPAL MIT Econ PhD https://sites.google.com/site/brunoferman/home
Pinned
brunoferman.bsky.social
🧵New survey paper: "Inference with Few Treated Units"
Luis Alvarez, Bruno Ferman and Kaspar Wüthrich

Tired of referees saying your standard errors are wrong?

This survey will help you understand if you really have a problem — and, if so, how to fix it!
Reposted by Bruno Ferman
julianreif.bsky.social
Good overview of what to do when there are only a few treated units
#EconSky
Reposted by Bruno Ferman
arindube.bsky.social
Very useful resource.
brunoferman.bsky.social
15/
Applied folks: we hope this serves as a warning that standard inference may fail with few treated units + guidance on choosing alternatives.
Econometricians: we wanted to provide a state-of-the-art overview — and a call for new methods based on alternative assumptions!
brunoferman.bsky.social
14/
And show some equivalences:

e.g., wild-bootstrap (with null imposed) is asymptotically equivalent to sign-changes when N₁ is fixed and N₀ → ∞

⇒ theoretical justification for wild-bootstrap in these settings
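For intuition, here is a minimal sketch of a wild bootstrap with the null imposed for a comparison of means (my own toy illustration, not the paper's procedure; the function name and the Rademacher-weight scheme are assumptions):

```python
import numpy as np

def wild_bootstrap_pvalue(y, d, n_boot=999, seed=0):
    """Wild (Rademacher) bootstrap p-value for H0: no treatment effect,
    with the null imposed: residuals are taken around the restricted
    (pooled) mean, then sign-flipped observation by observation."""
    rng = np.random.default_rng(seed)
    y, d = np.asarray(y, dtype=float), np.asarray(d)
    tau_hat = y[d == 1].mean() - y[d == 0].mean()
    resid = y - y.mean()                           # residuals under H0
    stats = np.empty(n_boot)
    for b in range(n_boot):
        w = rng.choice([-1.0, 1.0], size=y.size)   # Rademacher weights
        y_star = y.mean() + w * resid              # bootstrap sample under H0
        stats[b] = y_star[d == 1].mean() - y_star[d == 0].mean()
    return (1 + np.sum(np.abs(stats) >= abs(tau_hat))) / (n_boot + 1)

# toy data: 2 treated units, 50 controls, no true effect
rng = np.random.default_rng(1)
d = np.r_[np.ones(2), np.zeros(50)]
p = wild_bootstrap_pvalue(rng.normal(size=d.size), d)
```

With N₁ fixed and many controls, the flips on the treated observations dominate the bootstrap statistic, which is the intuition behind the equivalence with sign-changes.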
brunoferman.bsky.social
13/
We also provide finite-N₀ improvements for some methods, such as Conley-Taber and sign-changes.

Free lunch: gains with finite N₀ & asymptotically equivalent when N₀ → ∞ (with N₁ fixed)
brunoferman.bsky.social
12/
What if we have >1 treated (but still few)?

More info on treated ⇒ alternatives: sign-changes, Behrens-Fisher solutions, etc

Relax some assumptions relative to previous methods (but need new ones!)
⚡Power may be an issue when N₁ is very small

Many relevant trade-offs!
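To make the power issue concrete, here is a stripped-down exact sign-changes test (my own toy construction: each treated unit's effect is estimated against the control mean, and the reference distribution enumerates all 2^N₁ sign flips, so the smallest attainable two-sided p-value is 2/2^N₁):

```python
import numpy as np
from itertools import product

def sign_change_pvalue(tau_hats):
    """Exact sign-changes test of H0: no effect.
    tau_hats: per-treated-unit effect estimates (unit vs. control mean).
    Reference distribution: all 2**N1 sign flips of those estimates."""
    tau_hats = np.asarray(tau_hats, dtype=float)
    t_obs = abs(tau_hats.mean())
    flips = np.array(list(product([-1.0, 1.0], repeat=tau_hats.size)))
    t_ref = np.abs((flips * tau_hats).mean(axis=1))
    return float(np.mean(t_ref >= t_obs))

# power with very small N1: the minimum two-sided p-value is 2 / 2**N1
p2 = sign_change_pvalue([3.0, 2.5])   # N1 = 2: p = 0.5, can never reject at 5%
p5 = sign_change_pvalue([3.0] * 5)    # N1 = 5: p = 2/32 = 0.0625
```

With N₁ = 2 the test can never reject at conventional levels, no matter how large the estimated effects are.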
brunoferman.bsky.social
11/
In these extreme cases: need to impose strong restrictions on treatment effect heterogeneity!

If interested, see discussion in Section 4.1.3 on inference on sharp nulls, inference on realized treatment effects, prediction intervals, and sensitivity analysis.
brunoferman.bsky.social
10/
📌Extrapolate from time series

Learn about treated error using pre-treatment residuals

⚡Flip assumptions
Need time series restrictions (stationarity) but relax assumptions on cross-section

Challenges arise when counterfactuals are estimated via high-dimensional approaches
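A toy version of the time-series idea (my own illustration, assuming a single treated period and roughly stationary pre-treatment gaps; `y_counterfactual` stands in for any estimated counterfactual series):

```python
import numpy as np

def pre_period_pvalue(y_treated, y_counterfactual, T0):
    """Compare the treated-period gap with the empirical distribution of
    pre-treatment gaps (residuals), assuming those gaps are stationary.
    Single treated period at index T0."""
    gaps = np.asarray(y_treated, dtype=float) - np.asarray(y_counterfactual, dtype=float)
    post_gap = gaps[T0]          # gap in the one treated period
    pre_gaps = gaps[:T0]         # placebo distribution from pre-treatment periods
    return float(np.mean(np.abs(pre_gaps) >= abs(post_gap)))

# deterministic toy series: pre-treatment gaps in [-1, 1], treated-period gap = 5
y_cf = np.zeros(21)
y_tr = np.r_[np.linspace(-1.0, 1.0, 20), 5.0]
p = pre_period_pvalue(y_tr, y_cf, T0=20)
```

Note the flipped assumptions relative to cross-sectional extrapolation: nothing is required about the control units' error distribution, but the pre-treatment gaps must be informative about the treated-period gap.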
brunoferman.bsky.social
9/
Ferman and Pinto (2019): allow for heteroskedasticity that can be estimated based on observables.

Example: when units have different variances due to variation in population sizes.

See this old Twitter thread: x.com/bruno_ferman...
brunoferman.bsky.social
8/
📌Extrapolate (learn) from control units

Learn the distribution of the treated error using controls' residuals (à la Conley and Taber)
⚡Key assumption: Errors of treated and control units must have the same distribution (homoskedasticity)
No restriction on time series!
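A Conley-Taber-flavored sketch for DiD with one treated unit (my own toy implementation, not the paper's exact procedure): the error distribution of τ̂ is approximated by placebo DiD estimates that treat each control unit as if it were treated, which is valid under the homoskedasticity assumption above.

```python
import numpy as np

def placebo_pvalue(y1_pre, y1_post, y0_pre, y0_post):
    """DiD with one treated unit: p-value from placebo estimates,
    assuming treated and control errors are identically distributed."""
    y0_pre, y0_post = np.asarray(y0_pre, dtype=float), np.asarray(y0_post, dtype=float)
    tau_hat = (y1_post - y1_pre) - (y0_post.mean() - y0_pre.mean())
    # placebo: pretend control j is treated, estimate with remaining controls
    placebos = np.array([
        (y0_post[j] - y0_pre[j])
        - (np.delete(y0_post, j).mean() - np.delete(y0_pre, j).mean())
        for j in range(y0_post.size)
    ])
    return float(np.mean(np.abs(placebos) >= abs(tau_hat)))

# deterministic toy data: 20 controls with small gaps, large treated effect
y0_pre = np.zeros(20)
y0_post = np.linspace(-1.0, 1.0, 20)   # mean 0, so placebo gaps stay small
p = placebo_pvalue(0.0, 10.0, y0_pre, y0_post)
```

The resolution of the p-value is limited by the number of controls, which is why finite-N₀ refinements (see 13/) matter.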
brunoferman.bsky.social
7/
Survey is organized based on data availability.

📌Limit case:
One treated unit & one treated period.

Enough info from the treated to construct an estimator — but no info from the treated to learn its distribution!

⚡Solution:
We need to *extrapolate* ⇒ stronger assumptions!
brunoferman.bsky.social
6/
We focus on model-based approaches, more common in metrics

📚 Nice citation from Haavelmo to justify this framework + Marvel movies to help make the point 🕷️ :)

We also discuss design-based approaches at the end
brunoferman.bsky.social
5/
Important:
📌Problems arise when the *number* of treated units is small

✅Standard methods are usually fine with 40 or 50 treated units, even when the *share* of treated is small.

Feel free to cite our survey to justify sticking to standard methods when that's your case!😉
brunoferman.bsky.social
4/
Extreme case: you have only 1 treated and N₀ controls.

The true variance is σ₁² + σ₀²/N₀.

But with only one treated, you just don’t have enough info to estimate σ₁² using only the treated!

Robust SEs simply set σ̂₁² = 0! 😵‍💫

σ₁²: var of treated
σ₀²: var of control
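To see this mechanically: with a single treated unit, OLS of y on a constant and the treatment dummy fits the treated observation exactly, so its residual (the only information about σ₁²) is identically zero. A quick numerical check (my own illustration, using the linear-combination form of the HC0 variance for this design):

```python
import numpy as np

rng = np.random.default_rng(0)
n0 = 100
y0 = rng.normal(0.0, 1.0, size=n0)   # controls: sd 1
y1 = rng.normal(0.0, 5.0)            # single treated unit: sd 5

tau_hat = y1 - y0.mean()             # OLS coefficient on the treatment dummy
resid_treated = y1 - (y0.mean() + tau_hat)   # fitted value equals y1: residual is 0
resid_controls = y0 - y0.mean()

# tau_hat = y1 - mean(y0) is linear in y, so the EHW/HC0 variance is
# sum_i c_i^2 * resid_i^2 with weights c = (1, -1/n0, ..., -1/n0):
var_hc0 = resid_treated**2 + np.sum(resid_controls**2) / n0**2

true_var = 5.0**2 + 1.0**2 / n0      # sigma_1^2 + sigma_0^2 / N0 = 25.01
```

Here `var_hc0` is roughly σ₀²/N₀ ≈ 0.01, while the true variance is dominated by the missing σ₁² = 25.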
brunoferman.bsky.social
3/
Example to illustrate problem: comparison of means

Robust SEs estimate the variance of treated (controls) using only treated (controls) data

✅ Great with many treated/many controls!
↪️ Allow for ≠ distributions of treated/control errors

❗ Go bad with few treated units...
brunoferman.bsky.social
2/
🗣️Main message

Few treated ⇒ need to rely on stronger assumptions

Many alternatives: varying in data requirements, assumptions, etc

Choice is highly context-specific. We’ll help you navigate that!

Cover cross-section and panel data (Regression, Matching, DiD, SC, etc)
brunoferman.bsky.social
1/
Link to paper: arxiv.org/abs/2504.19841

🚨Problem
Few treated ⇒ standard methods (e.g., robust/clustered SEs) can go wrong. Even if total N is large!

📌Example
DiD with 1 treated cluster: clustered SEs underestimate the true variance by a factor of N. Expect over-rejection rates >60%!
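A quick Monte Carlo sketch of the over-rejection (my own toy DGP at the cluster level: one treated cluster, i.i.d. cluster errors, no true effect; the cluster-robust variance is written in its linear-combination form, where the treated cluster's residual is exactly zero):

```python
import numpy as np

rng = np.random.default_rng(1)
n_clusters, n_sims = 20, 2000
reject = 0
for _ in range(n_sims):
    c = rng.normal(size=n_clusters)   # cluster-level errors, no true effect
    tau_hat = c[0] - c[1:].mean()     # cluster 0 is the lone treated cluster
    u0 = c[1:] - c[1:].mean()         # control-cluster residuals
    # cluster-robust variance: the treated cluster's residual is 0,
    # so only the control clusters contribute
    se = np.sqrt(np.sum(u0**2) / (n_clusters - 1) ** 2)
    reject += abs(tau_hat / se) > 1.96
rate = reject / n_sims                # rejection rate of a nominal 5% test
```

In this toy setup the nominal 5% test rejects well over half the time, in line with the >60% figure above.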