Bruno Ferman
@brunoferman.bsky.social
Professor at Sao Paulo School of Economics - FGV Econometrics/Applied Micro/affiliate @JPAL MIT Econ PhD https://sites.google.com/site/brunoferman/home
Pinned
brunoferman.bsky.social
🧵New survey paper: "Inference with Few Treated Units"
Luis Alvarez, Bruno Ferman and Kaspar Wüthrich

Tired of referees saying your standard errors are wrong?

This survey will help you understand if you really have a problem — and, if so, how to fix it!
Reposted by Bruno Ferman
julianreif.bsky.social
Good overview of what to do when there are only a few treated units
#EconSky
Reposted by Bruno Ferman
arindube.bsky.social
Very useful resource.
brunoferman.bsky.social
15/
Applied folks: we hope this serves as a warning that standard inference may fail with few treated units + guidance on choosing alternatives.
Econometricians: we wanted to provide a state-of-the-art overview — and a call for new methods based on alternative assumptions!
brunoferman.bsky.social
14/
And show some equivalences:

e.g., wild-bootstrap (with null imposed) is asymptotically equivalent to sign-changes when N₁ is fixed and N₀ → ∞

⇒ theoretical justification for wild-bootstrap in these settings
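For intuition, here is a minimal sketch of a wild bootstrap with the null imposed for a comparison of means (my own toy illustration, not the paper's procedure; the function name and the Rademacher-weight scheme are assumptions):

```python
import numpy as np

def wild_bootstrap_pvalue(y, d, n_boot=999, seed=0):
    """Wild (Rademacher) bootstrap p-value for H0: no treatment effect,
    with the null imposed: residuals are taken around the restricted
    (pooled) mean, then sign-flipped observation by observation."""
    rng = np.random.default_rng(seed)
    y, d = np.asarray(y, dtype=float), np.asarray(d)
    tau_hat = y[d == 1].mean() - y[d == 0].mean()
    resid = y - y.mean()                           # residuals under H0
    stats = np.empty(n_boot)
    for b in range(n_boot):
        w = rng.choice([-1.0, 1.0], size=y.size)   # Rademacher weights
        y_star = y.mean() + w * resid              # bootstrap sample under H0
        stats[b] = y_star[d == 1].mean() - y_star[d == 0].mean()
    return (1 + np.sum(np.abs(stats) >= abs(tau_hat))) / (n_boot + 1)

# toy data: 2 treated units, 50 controls, no true effect
rng = np.random.default_rng(1)
d = np.r_[np.ones(2), np.zeros(50)]
p = wild_bootstrap_pvalue(rng.normal(size=d.size), d)
```

With N₁ fixed and many controls, the flips on the treated observations dominate the bootstrap statistic, which is the intuition behind the equivalence with sign-changes.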
brunoferman.bsky.social
13/
We also provide finite-N₀ improvements for some methods, such as Conley-Taber and sign-changes.

Free lunch: gains with finite N₀ & asymptotically equivalent when N₀ → ∞ (with N₁ fixed)
brunoferman.bsky.social
12/
What if we have >1 treated (but still few)?

More info on treated ⇒ alternatives: sign-changes, Behrens-Fisher solutions, etc

Relax some assumptions relative to previous methods (but need new ones!)
⚡Power may be an issue when N₁ is very small

Many relevant trade-offs!
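To make the power issue concrete, here is a stripped-down exact sign-changes test (my own toy construction: each treated unit's effect is estimated against the control mean, and the reference distribution enumerates all 2^N₁ sign flips, so the smallest attainable two-sided p-value is 2/2^N₁):

```python
import numpy as np
from itertools import product

def sign_change_pvalue(tau_hats):
    """Exact sign-changes test of H0: no effect.
    tau_hats: per-treated-unit effect estimates (unit vs. control mean).
    Reference distribution: all 2**N1 sign flips of those estimates."""
    tau_hats = np.asarray(tau_hats, dtype=float)
    t_obs = abs(tau_hats.mean())
    flips = np.array(list(product([-1.0, 1.0], repeat=tau_hats.size)))
    t_ref = np.abs((flips * tau_hats).mean(axis=1))
    return float(np.mean(t_ref >= t_obs))

# power with very small N1: the minimum two-sided p-value is 2 / 2**N1
p2 = sign_change_pvalue([3.0, 2.5])   # N1 = 2: p = 0.5, can never reject at 5%
p5 = sign_change_pvalue([3.0] * 5)    # N1 = 5: p = 2/32 = 0.0625
```

With N₁ = 2 the test can never reject at conventional levels, no matter how large the estimated effects are.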
brunoferman.bsky.social
11/
In these extreme cases: need to impose strong restrictions on treatment effect heterogeneity!

If interested, see discussion in Section 4.1.3 on inference on sharp nulls, inference on realized treatment effects, prediction intervals, and sensitivity analysis.
brunoferman.bsky.social
10/
📌Extrapolate from time series

Learn about treated error using pre-treatment residuals

⚡Flip assumptions
Need time series restrictions (stationarity) but relax assumptions on cross-section

Challenges arise when counterfactuals are estimated via high-dimensional approaches
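A toy version of the time-series idea (my own illustration, assuming a single treated period and roughly stationary pre-treatment gaps; `y_counterfactual` stands in for any estimated counterfactual series):

```python
import numpy as np

def pre_period_pvalue(y_treated, y_counterfactual, T0):
    """Compare the treated-period gap with the empirical distribution of
    pre-treatment gaps (residuals), assuming those gaps are stationary.
    Single treated period at index T0."""
    gaps = np.asarray(y_treated, dtype=float) - np.asarray(y_counterfactual, dtype=float)
    post_gap = gaps[T0]          # gap in the one treated period
    pre_gaps = gaps[:T0]         # placebo distribution from pre-treatment periods
    return float(np.mean(np.abs(pre_gaps) >= abs(post_gap)))

# deterministic toy series: pre-treatment gaps in [-1, 1], treated-period gap = 5
y_cf = np.zeros(21)
y_tr = np.r_[np.linspace(-1.0, 1.0, 20), 5.0]
p = pre_period_pvalue(y_tr, y_cf, T0=20)
```

Note the flipped assumptions relative to cross-sectional extrapolation: nothing is required about the control units' error distribution, but the pre-treatment gaps must be informative about the treated-period gap.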
brunoferman.bsky.social
9/
Ferman and Pinto (2019): allow for heteroskedasticity that can be estimated based on observables.

Example: when units have different variances due to variation in population sizes.

See this old Twitter thread: x.com/bruno_ferman...
brunoferman.bsky.social
8/
📌Extrapolate (learn) from control units

Learn the distribution of the treated error using controls' residuals (à la Conley and Taber)
⚡Key assumption: Errors of treated and control units must have the same distribution (homoskedasticity)
No restriction on time series!
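A Conley-Taber-flavored sketch for DiD with one treated unit (my own toy implementation, not the paper's exact procedure): the error distribution of τ̂ is approximated by placebo DiD estimates that treat each control unit as if it were treated, which is valid under the homoskedasticity assumption above.

```python
import numpy as np

def placebo_pvalue(y1_pre, y1_post, y0_pre, y0_post):
    """DiD with one treated unit: p-value from placebo estimates,
    assuming treated and control errors are identically distributed."""
    y0_pre, y0_post = np.asarray(y0_pre, dtype=float), np.asarray(y0_post, dtype=float)
    tau_hat = (y1_post - y1_pre) - (y0_post.mean() - y0_pre.mean())
    # placebo: pretend control j is treated, estimate with remaining controls
    placebos = np.array([
        (y0_post[j] - y0_pre[j])
        - (np.delete(y0_post, j).mean() - np.delete(y0_pre, j).mean())
        for j in range(y0_post.size)
    ])
    return float(np.mean(np.abs(placebos) >= abs(tau_hat)))

# deterministic toy data: 20 controls with small gaps, large treated effect
y0_pre = np.zeros(20)
y0_post = np.linspace(-1.0, 1.0, 20)   # mean 0, so placebo gaps stay small
p = placebo_pvalue(0.0, 10.0, y0_pre, y0_post)
```

The resolution of the p-value is limited by the number of controls, which is why finite-N₀ refinements (see 13/) matter.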
brunoferman.bsky.social
7/
Survey is organized based on data availability.

📌Limit case:
One treated unit & one treated period.

Enough info from the treated to construct an estimator — but no info from the treated to learn its distribution!

⚡Solution:
We need to *extrapolate* ⇒ stronger assumptions!
brunoferman.bsky.social
6/
We focus on model-based approaches, more common in metrics

📚 Nice citation from Haavelmo to justify this framework + Marvel movies to help make the point 🕷️ :)

We also discuss design-based approaches at the end
brunoferman.bsky.social
5/
Important:
📌Problems arise when the *number* of treated units is small

✅Standard methods are usually fine with 40 or 50 treated units, even when the *share* of treated is small.

Feel free to cite our survey to justify sticking to standard methods when that's your case!😉
brunoferman.bsky.social
4/
Extreme case: you have only 1 treated and N₀ controls.

The true variance is σ₁² + σ₀²/N₀.

But with only one treated, you just don’t have enough info to estimate σ₁² using only the treated!

Robust SEs simply set σ̂₁² = 0! 😵‍💫

σ₁²: var of treated
σ₀²: var of control
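To see this mechanically: with a single treated unit, OLS of y on a constant and the treatment dummy fits the treated observation exactly, so its residual (the only information about σ₁²) is identically zero. A quick numerical check (my own illustration, using the linear-combination form of the HC0 variance for this design):

```python
import numpy as np

rng = np.random.default_rng(0)
n0 = 100
y0 = rng.normal(0.0, 1.0, size=n0)   # controls: sd 1
y1 = rng.normal(0.0, 5.0)            # single treated unit: sd 5

tau_hat = y1 - y0.mean()             # OLS coefficient on the treatment dummy
resid_treated = y1 - (y0.mean() + tau_hat)   # fitted value equals y1: residual is 0
resid_controls = y0 - y0.mean()

# tau_hat = y1 - mean(y0) is linear in y, so the EHW/HC0 variance is
# sum_i c_i^2 * resid_i^2 with weights c = (1, -1/n0, ..., -1/n0):
var_hc0 = resid_treated**2 + np.sum(resid_controls**2) / n0**2

true_var = 5.0**2 + 1.0**2 / n0      # sigma_1^2 + sigma_0^2 / N0 = 25.01
```

Here `var_hc0` is roughly σ₀²/N₀ ≈ 0.01, while the true variance is dominated by the missing σ₁² = 25.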
brunoferman.bsky.social
3/
Example to illustrate problem: comparison of means

Robust SEs estimate the variance of treated (controls) using only treated (controls) data

✅ Great with many treated/many controls!
↪️ Allow for ≠ distributions of treated/control errors

❗ Go bad with few treated units...
brunoferman.bsky.social
2/
🗣️Main message

Few treated ⇒ need to rely on stronger assumptions

Many alternatives: varying in data requirements, assumptions, etc

Choice is highly context-specific. We’ll help you navigate that!

Cover cross-section and panel data (Regression, Matching, DiD, SC, etc)
brunoferman.bsky.social
1/
Link to paper: arxiv.org/abs/2504.19841

🚨Problem
Few treated ⇒ standard methods (e.g., robust/clustered SEs) can go wrong. Even if total N is large!

📌Example
DiD with 1 treated cluster: clustered SEs underestimate the true variance by a factor of N. Expect over-rejection rates >60%!
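A quick Monte Carlo sketch of the over-rejection (my own toy DGP at the cluster level: one treated cluster, i.i.d. cluster errors, no true effect; the cluster-robust variance is written in its linear-combination form, where the treated cluster's residual is exactly zero):

```python
import numpy as np

rng = np.random.default_rng(1)
n_clusters, n_sims = 20, 2000
reject = 0
for _ in range(n_sims):
    c = rng.normal(size=n_clusters)   # cluster-level errors, no true effect
    tau_hat = c[0] - c[1:].mean()     # cluster 0 is the lone treated cluster
    u0 = c[1:] - c[1:].mean()         # control-cluster residuals
    # cluster-robust variance: the treated cluster's residual is 0,
    # so only the control clusters contribute
    se = np.sqrt(np.sum(u0**2) / (n_clusters - 1) ** 2)
    reject += abs(tau_hat / se) > 1.96
rate = reject / n_sims                # rejection rate of a nominal 5% test
```

In this toy setup the nominal 5% test rejects well over half the time, in line with the >60% figure above.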