Alexander Terenin
@avt.im
5.3K followers 570 following 170 posts
Decision-making under uncertainty, machine learning theory, artificial intelligence · anti-ideological · Assistant Research Professor, Cornell https://avt.im/ · https://scholar.google.com/citations?user=EGKYdiwAAAAJ&sortby=pubdate
avt.im
Check out the work at: arxiv.org/abs/2502.14790

And, again, shoutout to my amazing coauthor Jeff Negrea! Working together has been a great pleasure!

Stay tuned for follow-up: we've been working on using this viewpoint to understand other correlated perturbation-based algorithms.
Bayesian Algorithms for Adversarial Online Learning: from Finite to Infinite Action Spaces
We develop a form of Thompson sampling for online learning under full feedback - also known as prediction with expert advice - where the learner's prior is defined over the space of an adversary's future...
arxiv.org
avt.im
So this sums up the work! If you followed along, thanks for the interest!

I think you'd agree that "Bayesian Algorithms for Adversarial Online Learning: from Finite to Infinite Action Spaces" is a much better title than before. The old one was much harder to pronounce.
avt.im
The Bayesian viewpoint proves useful for developing this analysis.

It allows us to guess what a good prior will be, and suggests ways to use probability as a tool to prove the algorithm works.
avt.im
We prove that the Bayesian approach works in this setting too.

To achieve this, we develop a new probabilistic analysis of correlated Gaussian follow-the-perturbed-leader algorithms, of which ours is a special case.

This has been an open challenge in the area.
avt.im
The second one is where X = [0,1]^d and Y is the space of bounded Lipschitz functions.

Here, you can't use a prior with independence across actions. You need to share information between actions.

We do this using a Gaussian process prior with correlations between actions.
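
As a rough illustration of how a correlated perturbation over the action space might be drawn - this is just a sketch, and the squared-exponential kernel, lengthscale, and one-dimensional grid of candidate actions are illustrative choices, not the paper's construction:

```python
import numpy as np

rng = np.random.default_rng(0)

# Candidate actions: a grid over X = [0, 1] (d = 1 to keep the sketch small)
x = np.linspace(0.0, 1.0, 100)

# Covariance kernel inducing correlations between nearby actions
lengthscale = 0.1
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / lengthscale**2)

# One draw of a correlated Gaussian perturbation over the action space;
# the small jitter is only for numerical stability of the Cholesky factor
chol = np.linalg.cholesky(K + 1e-6 * np.eye(len(x)))
perturbation = chol @ rng.normal(size=len(x))

# Nearby actions receive similar perturbation values, so information is
# shared between actions instead of treating each action independently.
```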
avt.im
The first one is the classical discrete setting where standard algorithms such as exponential weights are studied.

You can use a Gaussian prior which is independent across actions.
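
Concretely - and this is my shorthand reading, not necessarily the paper's exact algorithm - posterior sampling with an independent Gaussian prior ends up looking like Gaussian follow-the-perturbed-leader: perturb each action's cumulative reward with independent Gaussian noise and play the argmax. A minimal sketch, with a placeholder noise scale:

```python
import numpy as np

def sample_action(cumulative_rewards, noise_scale, rng):
    """cumulative_rewards: length-|X| array with entries sum_{s < t} y_s(x).
    Perturb each action independently, then play the best perturbed action."""
    noise = noise_scale * rng.normal(size=cumulative_rewards.shape)
    return int((cumulative_rewards + noise).argmax())
```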
avt.im
Okay, so we now know what "Bayesian Algorithms for Adversarial Online Learning" are.

What about "from Finite to Infinite Action Spaces"?

This covers the two settings we show the aforementioned results in.
avt.im
This approach appears to not make any sense: the Bayesian model is completely fake.

We're pretending to know a distribution for how the adversary will act in the future.

But, in reality, they can do anything.

And yet... we show that this works!
avt.im
We show that this game secretly has a natural Bayesian strategy - one we prove is strong.

What's the strategy?

It's really simple:
- Place a prior distribution over what the adversary will do in the future
- Condition on what the adversary has done so far
- Sample from the posterior
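
In code, one way to read this recipe (a sketch only - the i.i.d. Gaussian prior and the placeholder adversary below are stand-ins, not the paper's actual construction):

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_actions = 100, 5
past_rewards = np.zeros((0, n_actions))  # observed y_1, ..., y_{t-1}

for t in range(T):
    # Prior over the adversary's future reward functions y_t, ..., y_T:
    # here simply i.i.d. Gaussian per round and per action (illustrative).
    future = rng.normal(size=(T - t, n_actions))

    # Condition on what the adversary has done: past rounds are observed,
    # so only the remaining rounds are random under the posterior.
    sampled_game = np.vstack([past_rewards, future])

    # Sample from the posterior, then act as if the sample were the truth.
    x_t = int(sampled_game.sum(axis=0).argmax())

    # Adversary reveals y_t (full feedback); a random adversary stands in here.
    y_t = rng.uniform(-1.0, 1.0, size=n_actions)
    past_rewards = np.vstack([past_rewards, y_t])
```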
avt.im
One observation about adversarial online learning is that it appears to have nothing to do with Bayesian learning.

There is a two-player zero-sum game, not a joint probability distribution.

So you can't just solve it by applying Bayes' Rule. Or can you?
avt.im
Okay, so now we understand what "Adversarial Online Learning" is.

We propose "Bayesian Algorithms" for this.

What does that mean? Let's unpack.
avt.im
Online learning is therefore a good model for learning to explore by taking random actions.

This is in contrast to other approaches to resolving explore-exploit tradeoffs, such as upper confidence bounds, which produce purely deterministic strategies.
avt.im
So that's our setting. Why's it interesting?

Because many other hard decision problems can be reduced to online learning, including certain forms of reinforcement learning (via decision-estimation coefficients), equilibrium computation (via no-regret dynamics), and others.
avt.im
Specifically, their goal is to minimize regret

R(p,q) = E_{x_t~p_t, y_t~q_t}[ \sup_{x\in X} \sum_{t=1}^T y_t(x) - \sum_{t=1}^T y_t(x_t) ].

Meaning, the learner compares the sum of their rewards y_t(x_t) with the sum of y_t(x) for the best possible single non-time-dependent x.
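
For a finite action space, the quantity inside the expectation can be computed like this (a sketch; the array layout and names are mine):

```python
import numpy as np

def realized_regret(rewards, actions):
    """rewards: (T, |X|) array with rewards[t, x] = y_t(x).
    actions: length-T integer array of the learner's sampled actions x_t."""
    best_fixed = rewards.sum(axis=0).max()                     # sup_x sum_t y_t(x)
    learner = rewards[np.arange(len(actions)), actions].sum()  # sum_t y_t(x_t)
    return best_fixed - learner
```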
avt.im
The learner's goal is to achieve the highest rewards possible. But at each time, the adversary can choose a different reward function.

So why is this game not impossible?

Because the learner only compares how well they do with the *sum* of the adversary's rewards - that is, against the best single fixed action in hindsight, not against the best action at every round.
avt.im
"Adversarial Online Learning" refers to a two-player zero-sum repeated game between a learner and adversary.

At each time point:
- The learner chooses a distribution of predictions p_t over an action space X.
- The adversary chooses a reward function y_t : X -> R.
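
To make the setup concrete, here's a minimal sketch of the protocol loop - the uniform learner and the random bounded adversary are placeholder choices, not anything from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_actions = 100, 5  # horizon and |X|, illustrative sizes

total_reward = 0.0
for t in range(T):
    # Learner chooses a distribution p_t over the action space X
    p_t = np.full(n_actions, 1.0 / n_actions)  # placeholder: uniform play
    x_t = rng.choice(n_actions, p=p_t)

    # Adversary chooses a reward function y_t : X -> R
    # (represented here as one reward value per action)
    y_t = rng.uniform(-1.0, 1.0, size=n_actions)

    # Full feedback: the learner observes the entire reward function y_t
    total_reward += y_t[x_t]
```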
avt.im
Paper update: our recent work on Thompson sampling has a shiny new - and I hope much better - name!

This new name does a much better job of emphasizing what we actually do.

Joint work with Jeff Negrea. Thread below!
avt.im
At ICML, we're presenting a paper on uncertainty-aware surface reconstruction!

Compared to previous approaches, we are able to completely remove the need for recursive linear solves for reconstruction and interpolation, using geometric GP machinery.

Check it out!
avt.im
This week’s virtual seminar on Bayesian Decision-making and Uncertainty is happening now!

Noémie Jaquier (KTH Royal Institute of Technology)
On Riemannian Latent Variable Models and Pullback Metrics

Livestream link: www.youtube.com/watch?v=61Be...
avt.im
We obtain a very simple condition which relates the prior covariance kernel to the adversary's function class, in an easy-to-verify way that works in the bounded Lipschitz case.

I am very interested in extensions to more-general smoothness classes, and have ideas. Stay tuned!
avt.im
This stands in contrast with prior arguments, which are linear-algebraic in flavor, involve bounding certain matrix norms by certain traces, and essentially require independence in order to give sharp rates.