Edward H. Kennedy
@edwardhkennedy.bsky.social
2.5K followers 260 following 64 posts
assoc prof of statistics & data science at Carnegie Mellon https://www.ehkennedy.com/ interested in causality, machine learning, nonparametrics, public policy, etc
Pinned
edwardhkennedy.bsky.social
New paper! arxiv.org/pdf/2411.14285

Led by amazing postdoc Alex Levis: www.awlevis.com/about/

We show causal effects of new "soft" interventions are less sensitive to unmeasured confounding

& study which effects are *least* sensitive to confounding -> makes new connections to optimal transport
Reposted by Edward H. Kennedy
donskerclass.bsky.social
Went to look up textbook results after getting the nagging feeling that an ML paper was reinventing classical ideas, and found this gem:

"Not reading to the end of Le Cam's papers became not uncommon in later years. His ideas have been regularly rediscovered."

At least they're in good company.
Text from van der Vaart, "Asymptotic Statistics" Ch 27, http://www.stat.yale.edu/~pollard/Books/LeCamFest/VanderVaart.pdf

The theorem may have looked somewhat too complicated to gain popularity. Nevertheless Hájek's result, for general locally asymptotically normal models and general loss functions, is now considered the final result in this direction. Hájek wrote:

"The proof that local asymptotic minimax implies local asymptotic admissibility was first given by LeCam (1953, Theorem 14). ... Apparently not many people have studied Le Cam's paper so far as to read this very last theorem, and the present author is indebted to Professor LeCam for giving him the reference"

Not reading to the end of Le Cam's papers became not uncommon in later years. His ideas have been regularly rediscovered
edwardhkennedy.bsky.social
Ok I think I'll stop now :) I'm always amazed at how ahead of its time this work was.

It's too bad it's not as widely known among us causal+ML people
edwardhkennedy.bsky.social
Once you have a pathwise differentiable parameter, a natural estimator is a debiased plug-in, which subtracts off the avg of estimated influence fn

Pfanzagl gives this 1-step estimator here - in causal inference this is exactly the doubly robust / DML estimator you know & love!
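[A minimal sketch of the one-step / doubly robust (AIPW) idea for the mean counterfactual outcome E[Y^1], not code from the post: the plug-in mean of the outcome regression plus the sample average of the estimated (centered) influence function. The oracle nuisances here are for illustration only; in practice pi_hat and mu_hat would come from flexible ML fits, ideally with cross-fitting.]

```python
import numpy as np

def one_step_mean(y, a, pi_hat, mu_hat):
    """Debiased plug-in for E[Y^1]: plug-in mean of mu_hat plus the
    sample average of the estimated (centered) influence function."""
    plug_in = np.mean(mu_hat)
    correction = np.mean(a / pi_hat * (y - mu_hat))  # avg of estimated IF
    return plug_in + correction

# Toy check with simulated data and oracle nuisances (hypothetical example):
rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
pi = 1 / (1 + np.exp(-x))          # true propensity P(A=1 | X)
a = rng.binomial(1, pi)
y = x + a + rng.normal(size=n)
print(one_step_mean(y, a, pi_hat=pi, mu_hat=x + 1.0))  # close to E[Y^1] = 1
```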
edwardhkennedy.bsky.social
Pfanzagl uses pathwise differentiability above, but w/regularity conditions this is just a distributional Taylor expansion, which is easier to think about

I note this in my tutorial here:

www.ehkennedy.com/uploads/5/8/...

Also v related to so-called "Neyman orthogonality" - worth separate thread
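[For concreteness, a sketch of the expansion in generic notation, mine rather than Pfanzagl's or the tutorial's:]

```latex
% Distributional Taylor (von Mises) expansion of a functional psi around an
% estimate \hat{P}, with gradient/influence function phi and remainder R_2:
\psi(P) = \psi(\hat{P}) + \int \varphi(z; \hat{P}) \, dP(z) + R_2(\hat{P}, P)
% The one-step estimator replaces the integral with a sample average,
%   \hat\psi = \psi(\hat{P}) + \mathbb{P}_n[\varphi(Z; \hat{P})],
% so its error is (\mathbb{P}_n - P)\varphi(\cdot;\hat{P}) + R_2: a centered
% empirical-process term plus a second-order remainder.
```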
edwardhkennedy.bsky.social
Here’s Pfanzagl on the gradient of a functional/parameter, aka derivative term in a von Mises expansion, aka influence function, aka Neyman-orthogonal score

Richard von Mises first characterized smoothness this way for stats in the 30s/40s! eg:

projecteuclid.org/journals/ann...
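[In generic notation (a sketch, not Pfanzagl's or von Mises' wording), the gradient is the object representing the pathwise derivative of the functional:]

```latex
% For a smooth one-dimensional submodel {P_t} through P with score s(z) at
% t = 0, the gradient / influence function phi(.; P) of psi satisfies
\frac{d}{dt}\, \psi(P_t)\Big|_{t=0} = \int \varphi(z; P)\, s(z)\, dP(z)
% i.e., phi represents the derivative of psi in the L_2(P) inner product;
% this is the same object that appears in the one-step correction above.
```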
Reposted by Edward H. Kennedy
edwardhkennedy.bsky.social
From twitter:

A short thread:

It amazes me how many crucial ideas underlying now-popular semiparametrics (aka doubly robust parameter/functional estimation / TMLE / double/debiased/orthogonal ML etc etc) were first proposed many decades ago.

I think this is widely under-appreciated!
edwardhkennedy.bsky.social
The m-estimator logic certainly relies on “exactly correct”

Once you start moving to “close enough”, to me that means you’re no longer getting precise root-n rates with the nuisances. Then you’ll have to deal with the bias/variance consequences just as if you were using flexible ML
edwardhkennedy.bsky.social
And here for more specific discussion:

arxiv.org/pdf/2405.08525

I think DR estimation vs inference are two quite different things and we need different assumptions to make them work
edwardhkennedy.bsky.social
If we really rely on 2 parametric models, we should of course use a variance estimator recognizing this. But this is more about how we model the nuisances vs the DR estimator itself

Also our paper here suggests strictly more assumptions are needed for DR inference vs estimation:

arxiv.org/pdf/2305.04116
edwardhkennedy.bsky.social
I find it much more believable that I could estimate both nuisances consistently, but at slower rates, vs that I could pick 2 parametric models (without looking at data) & happen to get one exactly correct
edwardhkennedy.bsky.social
Hm not sure I agree with this logic…

To me the beautiful thing about the DR estimator is you can get away with estimating both nuisances at slower rates (as long as the product of their errors is smaller than 1/sqrt(n))

This opens the door to using much more flexible methods - random forests, lasso, ensembles, etc etc
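[A sketch of the standard bound behind this claim, written for the ATE-style (AIPW) case with propensity score pi and outcome regression mu; notation is mine, not from the thread:]

```latex
% Second-order (product) bias bound for the doubly robust / AIPW estimator:
\bigl|\, \mathbb{E}\bigl[\hat\psi_{\mathrm{dr}} - \psi \mid \text{nuisances}\bigr] \,\bigr|
  \;\lesssim\; \|\hat\pi - \pi\| \;\, \|\hat\mu - \mu\|
% so root-n inference needs only the *product* of nuisance errors to be
% o(n^{-1/2}); e.g., both nuisances converging faster than n^{-1/4} suffices,
% which flexible ML methods can plausibly achieve.
```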
edwardhkennedy.bsky.social
"Randomized trials should be used to answer any causal question that can be so studied...

But the reality is that observational methods are used everyday to answer pressing causal questions that cannot be studied in randomized trials."

- Jamie Robins, 2002
tinyurl.com/4yuxfxes
tinyurl.com/zncp39mr
Reposted by Edward H. Kennedy
instrumenthull.bsky.social
What's the best paper you read this year?
edwardhkennedy.bsky.social
Here's the recent paper!

bsky.app/profile/edwa...
edwardhkennedy.bsky.social
In this paper we consider incremental effects of continuous exposures:

arxiv.org/abs/2409.11967

i.e., soft interventions on cts treatments like dose, duration, frequency

it turns out exponential tilts preserve all nice properties of incremental effects with binary trt (arxiv.org/abs/1704.00211)
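[For intuition, a generic exponential tilt of the conditional treatment density looks like this; a sketch only, since the paper's exact intervention and parametrization may differ:]

```latex
% Exponentially tilted treatment density under intervention parameter delta:
q_\delta(a \mid x) \;=\; \frac{e^{\delta a}\, \pi(a \mid x)}{\int e^{\delta t}\, \pi(t \mid x)\, dt}
% delta = 0 recovers the observational density, and q_delta shares the support
% of pi(a | x), so no positivity condition beyond the observed data is needed.
```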
Reposted by Edward H. Kennedy
idiaz.bsky.social
Thank you Alec for leading this project, I learned a lot! This paper has a very useful study of what contrasts are feasible in situations with many treatments and positivity violations, including necessary assumptions and efficient one-step estimators. Check it out!
alecmcclean.bsky.social
New-ish paper alert! arxiv.org/abs/2410.13522
 
We tackle the challenge of comparing multiple treatments when some subjects have zero prob. of receiving certain treatments. Eg, provider profiling: comparing hospitals (the “treatments”) for patient outcomes. Positivity violations are everywhere.
Fair comparisons of causal parameters with many treatments and positivity violations
Comparing outcomes across treatments is essential in medicine and public policy. To do so, researchers typically estimate a set of parameters, possibly counterfactual, with each targeting a different ...
Reposted by Edward H. Kennedy
gautamkamath.com
Found slides by Ankur Moitra (presented at a TCS For All event) on "How to do theoretical research." Full of great advice!

My favourite: "Find the easiest problem you can't solve. The more embarrassing, the better!"

Slides: drive.google.com/file/d/15VaT...
TCS For All: sigact.org/tcsforall/
Reposted by Edward H. Kennedy
alecmcclean.bsky.social
@bonv.bsky.social presented this at NYU this week -- terrific work with an excellent presentation (no surprise there)! I found the connections to higher-order estimators and the orthogonalizing property of the U-stat kernel fascinating & illuminating.
edwardhkennedy.bsky.social
Should we use structure-agnostic (arxiv.org/abs/2305.04116) or smooth (arxiv.org/pdf/1512.02174) models for causal inference?

Why not both?

Here we propose novel hybrid smooth+agnostic model, give minimax rates, & new optimal methods

arxiv.org/pdf/2405.08525

-> fast rates under weaker conditions
Reposted by Edward H. Kennedy
idiaz.bsky.social
I see renewed discussion on #statsky about the interpretation of confidence intervals. I will leave here this quote from Larry Wasserman's All of Statistics, which I love. Controlling one's lifetime proportion of studies with an interval that does not contain the parameter is surely desirable!