Thomas Steinke
@stein.ke
4.1K followers 680 following 350 posts
Computer science, math, machine learning, (differential) privacy Researcher at Google DeepMind Kiwi🇳🇿 in California🇺🇸 http://stein.ke/
Pinned
stein.ke
I'm going to slowly repost my math notes from the other site🐦 here🦋; they're the only thing I posted over there that I think may have some long-term value & be worth not deleting.

These started out as notes for myself, but people seem to appreciate them. 😅

I'll keep track of all of them in this thread.
Reposted by Thomas Steinke
aaroth.bsky.social
The FORC 2026 call for papers is out! responsiblecomputing.org/forc-2026-ca... Two reviewing cycles with two deadlines: Nov 11 and Feb 17. If you haven't been, FORC is a great venue for theoretical work in "responsible AI" --- fairness, privacy, social choice, CS&Law, explainability, etc.
FORC 2026: Call for Papers
The 7th annual Symposium on Foundations of Responsible Computing (FORC) will be held on June 3-5, 2026 at Harvard University. Brief summary for those who are familiar with past editions (prior to 2…
responsiblecomputing.org
stein.ke
IMHO, the best analog to the AI bubble is the dotcom bubble. Yes, the internet proved to be economically transformative, but there was still a bubble. Companies made a lot of money in the end, but not necessarily the ones people expected -- e.g., see Cisco:
[Plot of Cisco's stock price from 1990-2025: huge growth around 2000, then a crash, followed by slow regrowth.]
stein.ke
The final piece of the puzzle is how do you choose the subsamples? Here's where having a relative who knows combinatorics was helpful. 😁 Basically, the subsets should form a covering design. And the minimal size of a covering design parameterizes the tradeoff.
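To make the covering-design condition concrete, here's a tiny brute-force check. This is my own illustration of the property as I read it from the thread (not code from the paper): every set of t indices must be avoided by at least one subsample, i.e., the subsample complements cover all t-subsets.

```python
from itertools import combinations

def avoids_every_t_subset(n, t, subsamples):
    # True iff, for every t-subset of indices, some subsample is
    # disjoint from it -- equivalently, the complements of the
    # subsamples form a covering design over t-subsets.
    return all(
        any(not set(ts) & set(s) for s in subsamples)
        for ts in combinations(range(n), t)
    )

n, t = 6, 2
exhaustive = list(combinations(range(n), n - t))  # all (n-t)-subsets
partition = [(0, 1, 2), (3, 4, 5)]                # disjoint parts
print(avoids_every_t_subset(n, t, exhaustive))    # True, but C(6,4)=15 sets
print(avoids_every_t_subset(n, t, partition))     # False: {0, 3} hits both parts
```

The smaller a covering design you can find, the fewer oracle calls you need -- which is the tradeoff the post describes.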
stein.ke
OK, so can we get the best of both worlds? or at least trade off between the cost of privacy in terms of accuracy/data and the number of subsamples we need to evaluate on?

That's what we address in our new paper. The answer is yes, but the tradeoff is quite steep (and we have a lower bound).
differentialprivacy.org
Privately Estimating Black-Box Statistics

Günter F. Steinke, Thomas Steinke

http://arxiv.org/abs/2510.00322

Standard techniques for differentially private estimation, such as Laplace or Gaussian noise addition, require guaranteed bounds on the sensitivity of the estimator in question. But such sensitivity bounds are often large or simply unknown. Thus we seek differentially private methods that can be applied to arbitrary black-box functions. A handful of such techniques exist, but all are either inefficient in their use of data or require evaluating the function on exponentially many inputs. In this work we present a scheme that trades off between statistical efficiency (i.e., how much data is needed) and oracle efficiency (i.e., the number of evaluations). We also present lower bounds showing the near-optimality of our scheme.
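For reference, the "standard technique" the abstract contrasts against is noise addition calibrated to a known sensitivity bound. A minimal sketch of the classic Laplace mechanism (standard textbook construction, not code from the paper):

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon,
                      rng=np.random.default_rng()):
    # Classic eps-DP release: Laplace noise with
    # scale = (global sensitivity) / eps.
    return value + rng.laplace(0.0, sensitivity / epsilon)

# A mean of n values known to lie in [0, 1] has sensitivity 1/n:
data = np.array([0.2, 0.7, 0.4, 0.9])
print(laplace_mechanism(data.mean(), sensitivity=1 / len(data), epsilon=1.0))
```

The paper's starting point is precisely that such a sensitivity bound is often unavailable, which is what forces you into black-box methods.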
stein.ke
In this paper we showed that we can do this kind of versatile black-box estimation with only an *additive* cost of privacy, where sample-and-aggregate suffers a *multiplicative* cost. But this uses exponentially many subsamples - i.e., all subsets of size n-t instead of a partition into t parts.
differentialprivacy.org
Privately Evaluating Untrusted Black-Box Functions
Ephraim Linder, Sofya Raskhodnikova, Adam Smith, Thomas Steinke
http://arxiv.org/abs/2503.19268
We provide tools for sharing sensitive data when the data curator doesn't know in advance what questions an (untrusted) analyst might ask about the data. The analyst can specify a program that they want the curator to run on the dataset. We model the program as a black-box function $f$. We study differentially private algorithms, called privacy wrappers, that, given black-box access to a real-valued function $f$ and a sensitive dataset $x$, output an accurate approximation to $f(x)$. The dataset $x$ is modeled as a finite subset of a possibly infinite set $U$, in which each entry represents data of one individual. A privacy wrapper calls $f$ on the dataset $x$ and on some subsets of $x$ and returns either an approximation to $f(x)$ or a nonresponse symbol $\perp$. The wrapper may also use additional information (that is, parameters) provided by the analyst, but differential privacy is required for all values of these parameters. Correct setting of these parameters will ensure better accuracy of the wrapper. The bottleneck in the running time of our wrappers is the number of calls to $f$, which we refer to as queries. Our goal is to design wrappers with high accuracy and low query complexity. We introduce a novel setting, the automated sensitivity detection setting, where the analyst supplies the black-box function $f$ and the intended (finite) range of $f$. In the previously considered setting, the claimed sensitivity bound setting, the analyst supplies additional parameters that describe the sensitivity of $f$. We design privacy wrappers for both settings and show that our wrappers are nearly optimal in terms of accuracy, locality (i.e., the depth of the local neighborhood of the dataset $x$ they explore), and query complexity. In the claimed sensitivity bound setting, we provide the first accuracy guarantees that have n…
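The "exponentially many subsamples" in the post above is easy to quantify: evaluating on all subsets of size n−t means one oracle call per t-subset removed, i.e., C(n, t) calls. A quick back-of-envelope comparison (my numbers, for illustration):

```python
from math import comb

n, t = 1000, 10
print(f"all (n-t)-subsets:      {comb(n, t):.3e} evaluations")  # ~2.6e23
print(f"partition into t parts: {t} evaluations")
```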
stein.ke
So the obvious question is: *can sample-and-aggregate be made more data-efficient?* 🤔
E.g., instead of partitioning the dataset, can we use overlapping subsamples?
Unfortunately, using standard aggregation methods, this doesn't work (because overlapping subsamples means higher sensitivity).
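A back-of-envelope way to see the sensitivity problem (my illustration, not a claim from the thread):

```python
# With k disjoint parts and per-part estimates clipped to [0, 1], the mean
# of the estimates has sensitivity 1/k: one datapoint touches one part.
# If every datapoint instead appears in r overlapping subsamples, changing
# it can move r of the estimates, so the sensitivity becomes r/k and the
# Laplace noise scale grows by a factor of r -- wiping out the gain.
k, r, eps = 10, 5, 1.0
print("noise scale, partition:", (1 / k) / eps)   # 0.1
print("noise scale, overlap:  ", (r / k) / eps)   # 0.5
```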
stein.ke
...it's inefficient in terms of data. You only get accuracy from evaluating on a subsample & each subsample gets treated like one datapoint for the privacy analysis. In terms of sample complexity, the cost of privacy is multiplicative, when it _should_ be additive.

But it can be practical e.g.:
Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data
Some machine learning applications involve training data that is sensitive, such as the medical histories of patients in a clinical trial. A model may inadvertently and implicitly store some of its tr...
arxiv.org
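To illustrate the multiplicative-vs-additive point with toy numbers (mine, not the paper's):

```python
# Suppose a non-private estimator needs n0 samples for the target accuracy.
# Sample-and-aggregate hands each evaluation only an eps fraction of the
# data, so you need n ~ n0/eps samples overall: a multiplicative cost.
# An additive cost of privacy would instead look like n ~ n0 + c/eps.
n0, eps, c = 10_000, 0.1, 100
print("multiplicative cost:", n0 / eps)      # 100,000 samples
print("additive cost:      ", n0 + c / eps)  # 11,000 samples
```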
stein.ke
First, what is sample-and-aggregate?

It's a generic method for differentially private estimation. Partition your data into 1/ε subsamples; evaluate your function on each subsample; then combine the values in an ε-differentially private manner.

It's flexible - it works for *any* function - but...
cs-people.bu.edu
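To make the recipe concrete, here's a minimal sketch; it's my own illustration of the generic scheme (the clipped noisy mean is a crude stand-in for the aggregators typically used, such as a DP median):

```python
import numpy as np
from math import ceil

def sample_and_aggregate(data, f, epsilon, lo, hi, k=None,
                         rng=np.random.default_rng()):
    # Each datapoint lands in exactly one part, so an eps-DP
    # aggregation of the k per-part estimates is eps-DP overall.
    k = k or ceil(1 / epsilon)
    parts = np.array_split(np.asarray(data), k)
    vals = np.clip([f(p) for p in parts], lo, hi)
    # Changing one datapoint moves one clipped value, so the mean
    # of the k values has sensitivity (hi - lo) / k.
    return vals.mean() + rng.laplace(0.0, (hi - lo) / (k * epsilon))

# e.g. privately estimate a mean known to lie in [0, 1]:
rng = np.random.default_rng(0)
data = rng.uniform(size=1000)
print(sample_and_aggregate(data, np.mean, epsilon=0.5, lo=0.0, hi=1.0))
```

Note the flexibility: f can be *any* function of a subsample, which is exactly the "works for any function" point above.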
stein.ke
I'm excited to share this paper.

It answers a question that has bugged me for a long time: Can sample-and-aggregate be made more data-efficient? The answer is yes, but at a steep price in computational efficiency. See 🧵 for more details.

Also, it was a fun opportunity to add a new coauthor. 😁
differentialprivacy.org
Privately Estimating Black-Box Statistics

Günter F. Steinke, Thomas Steinke

http://arxiv.org/abs/2510.00322
stein.ke
The fourth deadliest war in history killed 20-30 million people (more than WW1) and was a rebellion led by a man claiming to be the brother of Jesus Christ, and you've never even heard of it. 🤯
stein.ke
Apps need to ask for permission to access the camera/microphone. The same should apply to the speaker. Some apps should never make a sound. 🔇
stein.ke
Yeah, why not?
I was taught "non-decreasing" and mostly never questioned it, but really "non-strictly increasing" seems better.
stein.ke
Sure, of course it can matter, but do we need totally different terms? Wouldn't it make more sense to add a qualifier (strictly increasing vs. non-strictly increasing)?
stein.ke
Why do we use "non-decreasing" as a synonym for increasing? The negation is confusing. E.g., the sine function is not decreasing. Is the distinction between strictly and non-strictly increasing meaningful?
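For readers missing the context, the notions being conflated, written out under one common convention:

```latex
\begin{align*}
\text{non-decreasing (weakly increasing):} &\quad \forall x \le y,\ f(x) \le f(y)\\
\text{strictly increasing:} &\quad \forall x < y,\ f(x) < f(y)\\
\text{``not decreasing'' (mere negation):} &\quad \exists\, x < y,\ f(x) < f(y)
\end{align*}
```

Sine satisfies the third but not the first, which is exactly the confusion the post is pointing at.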
stein.ke
On my way to work I saw a billboard that just said "caffeine" – as if that shit needs to advertise.
stein.ke
I just had to enter my birthdate using an interface that required me to click once for every month of my life. This is offensive on multiple levels.
stein.ke
🤦🤦🤦 this is not how two-factor authentication works 🤦🤦🤦🤦
[Screenshot of an app]
Code Verification 
Call customer service at 1-866-422-0306 (outside the U.S. call 1-210-677-0065) to retrieve your One-time identification Code.
Reposted by Thomas Steinke
focs2025.bsky.social
To reduce barriers to attendance, #FOCS2025 will try to facilitate childcare for attendees during the conference.

Information, how to register your interest, and how to apply for financial support: focs.computer.org/2025/childca...

Deadline (for the latter): ⏰ September 19th (AoE)
Childcare Support – FOCS 2025
focs.computer.org
stein.ke
Not _good_ code, evidently.
stein.ke
Today I wrote some code in which print("failed successfully") made perfect sense. 🙃
stein.ke
It's probably just laziness. Email-based authentication is easy to implement, cheaper than SMS, & more reliable than passwords etc. It offloads security to your email provider.
stein.ke
The year is 2100. The verb "to coldplay" means to inadvertently confess by acting guilty when caught doing something otherwise innocuous. No one remembers the etymology.
stein.ke
Ban in-app browsers.
stein.ke
Amazon informs me that Prime day is expected to occur between July 8 and July 11. Does anyone know what the confidence level for that interval is? Is it 95%? When will we get more precise measurements?