Lightnews — Scholar-powered news

Karsten Roth

@confusezius.bsky.social

💫 After four PhD years on all things multimodal, pre- and post-training, I’m super excited for a new research chapter at Google DeepMind 🇨🇭!

Biggest thanks to @zeynepakata.bsky.social and Oriol Vinyals for all the guidance, support, and incredibly eventful and defining research years ♥️!

August 4, 2025 at 2:59 PM

Karsten Roth

@confusezius.bsky.social

How does lifelong knowledge editing currently hold up in the real world? Fun new work probing where we are at these days with injecting new knowledge into LLMs!

Lukas Thede @lukasthede.bsky.social · Apr 8

🧠 Keeping LLMs factually up to date is a common motivation for knowledge editing.

But what would it actually take to support this in practice at the scale and speed the real world demands?

We explore this question and really push the limits of lifelong knowledge editing in the wild.
👇

April 8, 2025 at 7:59 PM

Reposted by Karsten Roth

Zeynep Akata

@zeynepakata.bsky.social

📄 Disentangled Representation Learning with the Gromov-Monge Gap

with Théo Uscidda, Luca Eyring, @confusezius.bsky.social, Fabian J Theis, Marco Cuturi

📄 Decoupling Angles and Strength in Low-rank Adaptation

with Massimo Bini, Leander Girrbach

January 24, 2025 at 8:02 PM

Reposted by Karsten Roth

Zeynep Akata

@zeynepakata.bsky.social

Our EML team has 4 #ICLR25 Papers accepted! I am proud of my students and grateful to be a part of many successful collaborations. More details will appear on our website (www.eml-munich.de) but here are the snapshots.

EML MunichEML MunichMenu

Explainable Machine Learning Munich

www.eml-munich.de

January 24, 2025 at 8:02 PM

Reposted by Karsten Roth

Alfredo Canziani

@alfcnz.bsky.social

The Practitioner's Guide to Continual Multimodal Pretraining @dziadzio.bsky.social @confusezius.bsky.social @vishaalurao.bsky.social @bayesiankitten.bsky.social

December 12, 2024 at 2:20 AM

Reposted by Karsten Roth

Luca Eyring

@lucaeyring.bsky.social

Can we enhance the performance of T2I models without any fine-tuning?

We show that with our ReNO, Reward-based Noise Optimization, one-step models consistently surpass the performance of all current open-source Text-to-Image models within the computational budget of 20-50 sec!
#NeurIPS2024

December 11, 2024 at 11:05 PM

Karsten Roth

@confusezius.bsky.social

How far can you push model merging over time, as more experts and options to model-merge arise?

We comprehensively and systematically investigate this in our new work, check it out!

Sebastian Dziadzio @dziadzio.bsky.social · Dec 11

📄 New Paper: "How to Merge Your Multimodal Models Over Time?"

arxiv.org/abs/2412.06712

Model merging assumes all finetuned models are available at once. But what if they need to be created over time?

We study Temporal Model Merging through the TIME framework to find out!

🧵

How to Merge Your Multimodal Models Over Time?

Model merging combines multiple expert models - finetuned from a base foundation model on diverse tasks and domains - into a single, more capable model. However, most existing model merging approaches...

arxiv.org

December 11, 2024 at 7:46 PM

Karsten Roth

@confusezius.bsky.social

😵‍💫 Continually pretraining large multimodal models to keep them up-to-date all-the-time is tough, covering everything from adapters, merging, meta-scheduling to data design and more!

So I'm really happy to present our large-scale study at #NeurIPS2024!

Come drop by to talk about all that and more!

December 10, 2024 at 4:42 PM

Reposted by Karsten Roth

ELLIS

@ellis.eu

🎉 Congratulations to our newly accepted ELLIS Fellows & Scholars in 2024! Top researchers in #MachineLearning join the network to advance science & mentor the next generation. #ELLISforEurope #AI

🌍 Know someone on the list? bit.ly/3ZJd9Cz
Tag them in a reply with congratulations.

161 outstanding machine learning researchers accepted as new ELLIS Fellows & Scholars

The ELLIS mission is to create a diverse European network that promotes research excellence and advances breakthroughs in AI, as well as a pan-European PhD program to educate the next generation of AI...

bit.ly

December 9, 2024 at 2:45 PM

Reposted by Karsten Roth

Vishaal Udandarao

@vishaalurao.bsky.social

🚀New Paper: Active Data Curation Effectively Distills Multimodal Models
arxiv.org/abs/2411.18674

Smol models are all the rage these days & knowledge distillation (KD) is key for model compression!

We show how data curation can effectively distill to yield SoTA FLOP-efficient {C/Sig}LIPs!!
🧵👇

December 2, 2024 at 5:59 PM

Reposted by Karsten Roth

Dima Damen

@dimadamen.bsky.social

Read our paper:
Context-Aware Multimodal Pretraining

Now on ArXiv

Can you turn vision-language models into strong any-shot models?

Go beyond zero-shot performance in SigLixP (x for context)

Read @confusezius.bsky.social thread below…

And follow Karsten … a rising star!

Karsten Roth @confusezius.bsky.social · Nov 28

🤔 Can you turn your vision-language model from a great zero-shot model into a great-at-any-shot generalist?

Turns out you can, and here is how: arxiv.org/abs/2411.15099

Really excited to this work on multimodal pretraining for my first bluesky entry!

🧵 A short and hopefully informative thread:

November 28, 2024 at 5:03 PM

Reposted by Karsten Roth

Alfredo Canziani

@alfcnz.bsky.social

Beautiful paper! 😍😍😍

Captions go above the tables, but otherwise aesthetically very pleasing.

Karsten Roth @confusezius.bsky.social · Nov 28

🤔 Can you turn your vision-language model from a great zero-shot model into a great-at-any-shot generalist?

Turns out you can, and here is how: arxiv.org/abs/2411.15099

Really excited to this work on multimodal pretraining for my first bluesky entry!

🧵 A short and hopefully informative thread:

November 29, 2024 at 12:39 AM

Reposted by Karsten Roth

Olivier Hénaff

@olivierhenaff.bsky.social

More than zero-shot generalization, few-shot *adaptation* is critical for many applications.

We find simple changes to multimodal pretraining are sufficient to yield outsized gains on a wide range of few-shot tasks.

Congratulations @confusezius.bsky.social on a very successful internship!

Karsten Roth @confusezius.bsky.social · Nov 28

🤔 Can you turn your vision-language model from a great zero-shot model into a great-at-any-shot generalist?

Turns out you can, and here is how: arxiv.org/abs/2411.15099

Really excited to this work on multimodal pretraining for my first bluesky entry!

🧵 A short and hopefully informative thread:

November 28, 2024 at 2:47 PM

Reposted by Karsten Roth

Ivana Balazevic

@ibalazevic.bsky.social

We maintain strong zero-shot transfer of CLIP / SigLIP across model size and data scale, while achieving up to 4x few-shot sample efficiency and up to +16% performance gains!

Fun project with @confusezius.bsky.social, @zeynepakata.bsky.social, @dimadamen.bsky.social and
@olivierhenaff.bsky.social.

Karsten Roth @confusezius.bsky.social · Nov 28

🤔 Can you turn your vision-language model from a great zero-shot model into a great-at-any-shot generalist?

Turns out you can, and here is how: arxiv.org/abs/2411.15099

Really excited to this work on multimodal pretraining for my first bluesky entry!

🧵 A short and hopefully informative thread:

November 28, 2024 at 2:43 PM

Karsten Roth

@confusezius.bsky.social

🤔 Can you turn your vision-language model from a great zero-shot model into a great-at-any-shot generalist?

Turns out you can, and here is how: arxiv.org/abs/2411.15099

Really excited to this work on multimodal pretraining for my first bluesky entry!

🧵 A short and hopefully informative thread:

November 28, 2024 at 2:33 PM

Reposted by Karsten Roth

ELLIS

@ellis.eu

Hi 👋 We're glad to be here on @bsky.app and looking forward to engaging in this community. But first, learn a little more about us...

#ELLISforEurope #AI #ML #CrossBorderCollab #PhD

November 21, 2024 at 10:37 AM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news