Bruno Mlodozeniec
@brunokm.bsky.social
PhD in Deep Learning at Cambridge. Previously Microsoft Research AI resident & researcher at Qualcomm. I want to find the key to generalisation.
brunokm.bsky.social
For example: for even moderately sized datasets, a trained diffusion model's marginal probability distribution stays essentially the same irrespective of which examples were removed from the training data, potentially rendering the influence-function question vacuous.
brunokm.bsky.social
We also point out several empirical challenges to the use of influence functions in diffusion models.
brunokm.bsky.social
In our paper, we empirically show that the choice of GGN and K-FAC approximation is crucial to the performance of influence functions, and that following our recommended design principles leads to better-performing approximations.
brunokm.bsky.social
Influence functions require the training loss Hessian matrix. Typically, a K-FAC approximation to a Generalised Gauss-Newton (GGN) matrix is used instead of the Hessian. However, it's not immediately obvious which GGN and K-FAC approximations to use in the diffusion setting.
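A toy sketch of the K-FAC idea for a single linear layer (all names and shapes here are illustrative, not the paper's implementation): the layer's GGN/Fisher block is approximated by a Kronecker product A ⊗ G, where A is the uncentred covariance of the layer's inputs and G that of the backpropagated output gradients.

```python
import numpy as np

# Toy K-FAC sketch for one linear layer (illustrative, not the paper's code).
rng = np.random.default_rng(1)
n, d_in, d_out = 64, 3, 2
a = rng.normal(size=(n, d_in))    # layer inputs over a batch
g = rng.normal(size=(n, d_out))   # backpropagated output-side gradients

A = a.T @ a / n                   # input factor,    (d_in x d_in)
G = g.T @ g / n                   # gradient factor, (d_out x d_out)

# K-FAC approximation to the layer's GGN block, (d_in*d_out x d_in*d_out).
kfac_block = np.kron(A, G)

# The payoff: (A ⊗ G)^{-1} = A^{-1} ⊗ G^{-1}, so inverting costs
# O(d_in^3 + d_out^3) instead of O((d_in * d_out)^3) for the full block.
kfac_inv = np.kron(np.linalg.inv(A), np.linalg.inv(G))
```

The cheap Kronecker-factored inverse is exactly what makes influence functions tractable at scale, since they need Hessian-inverse-vector products.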
brunokm.bsky.social
Influence functions are already being used in deep learning, from classification and regression through to autoregressive LLMs. What's the challenge in adapting them to the diffusion setting?
brunokm.bsky.social
• Identifying and removing data responsible for undesirable behaviours (e.g. generating explicit content)
• Data valuation (how much did each training datapoint contribute towards generating the samples my users pay me for?)
brunokm.bsky.social
Answering how a model's behaviour changes upon removing training datapoints could help with:
• Quantifying impact of copyrighted data on a given sample (how much less likely is it that the model would generate this image if not for the works of a given artist?)
brunokm.bsky.social
Influence functions attempt to answer: how would the model's behaviour (e.g. the probability of generating an image) change if the model were trained from scratch with some training datapoints removed?

They give an approximate answer, but without actually retraining the model.
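The classic influence-function estimate (for models generally, not this paper's specific method) approximates the effect of removing a training point z_j on a test measurement as grad(z_test)ᵀ H⁻¹ grad(z_j), where H is the Hessian of the training loss at the trained parameters. A minimal toy sketch, with all quantities randomly generated for illustration:

```python
import numpy as np

# Toy influence-function sketch: I(z_j, z_test) ≈ g_test^T H^{-1} g_train.
# Everything below is synthetic and illustrative only.
rng = np.random.default_rng(0)
d = 5                                # number of parameters (toy)
H = rng.normal(size=(d, d))
H = H @ H.T + np.eye(d)              # a positive-definite stand-in "Hessian"
grad_test = rng.normal(size=d)       # gradient of the measurement at z_test
grad_train = rng.normal(size=d)      # gradient of the training loss at z_j

# Solve H x = grad_train rather than forming H^{-1} explicitly.
influence = grad_test @ np.linalg.solve(H, grad_train)
```

In practice H is far too large to materialise, which is why approximations such as K-FAC to a GGN matrix are used in its place.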
brunokm.bsky.social
How do you identify training data responsible for an image generated by your diffusion model? How could you quantify how much copyrighted works influenced the image?

In our ICLR oral paper, we show how to approach such questions scalably with influence functions.
brunokm.bsky.social
It’s an awesome piece of work, with impressive performance for a surprisingly small budget
brunokm.bsky.social
Rich Turner, with other members of our group, recently published a paper on Aardvark — end-to-end weather prediction with deep learning — in Nature, and it was just featured in The Guardian and Financial Times!

www.theguardian.com/technology/2...
AI-driven weather prediction breakthrough reported
Researchers say Aardvark Weather uses thousands of times less computing power and is much faster than current systems
brunokm.bsky.social
Myself, James, and Shreyas will be at NeurIPS presenting this work. Come chat to us if you’re interested!
jamesallingham.bsky.social
I'll be at NeurIPS next week, presenting our work "A Generative Model of Symmetry Transformations." In it, we propose a symmetry-aware generative model that discovers which (approximate) symmetries are present in a dataset and can be leveraged to improve data efficiency.

🧵⬇️
brunokm.bsky.social
Diffusion models are so ubiquitous, but it's difficult to find an introduction that is concise, simple and comprehensive.

My supervisor Rich Turner (with me & some other students) has written an introduction to diffusion models that fills this gap:

arxiv.org/abs/2402.04384
Denoising Diffusion Probabilistic Models in Six Simple Steps
Denoising Diffusion Probabilistic Models (DDPMs) are a very popular class of deep generative model that have been successfully applied to a diverse range of problems including image and video generati...