Lightnews — Scholar-powered news

ExplainableML @eml-munich.bsky.social · Aug 4

6/
Disentanglement of Correlated Factors via Hausdorff Factorized Support (ICLR 2023)
@confusezius.bsky.social, Mark Ibrahim, @zeynepakata.bsky.social, Pascal Vincent, Diane Bouchacourt
[Paper]: arxiv.org/abs/2210.07347
[Code]: github.com/facebookrese...

Disentanglement of Correlated Factors via Hausdorff Factorized Support

A grand goal in deep learning research is to learn representations capable of generalizing across distribution shifts. Disentanglement is one promising direction aimed at aligning a model's representa...

ExplainableML @eml-munich.bsky.social · Aug 4

5/
Waffling around for Performance: Visual Classification with Random Words and Broad Concepts (ICCV 2023)
@confusezius.bsky.social*, Jae Myung Kim*, @askoepke.bsky.social, Cordelia Schmid, @zeynepakata.bsky.social
[Paper]: arxiv.org/abs/2306.07282
[Code]: github.com/ExplainableM...

Waffling around for Performance: Visual Classification with Random Words and Broad Concepts

The visual classification performance of vision-language models such as CLIP has been shown to benefit from additional semantic knowledge from large language models (LLMs) such as GPT-3. In particular...

ExplainableML @eml-munich.bsky.social · Aug 4

4/
Vision-by-Language for Training-Free Composed Image Retrieval (ICLR 2024)
@shyamgopal.bsky.social*, @confusezius.bsky.social*, Massimiliano Mancini, @zeynepakata.bsky.social
[Paper]: arxiv.org/abs/2310.09291
[Code]: github.com/ExplainableM...

Vision-by-Language for Training-Free Compositional Image Retrieval

Given an image and a target modification (e.g., an image of the Eiffel tower and the text "without people and at night-time"), Compositional Image Retrieval (CIR) aims to retrieve the relevant target im...

ExplainableML @eml-munich.bsky.social · Aug 4

3/
Fantastic Gains and Where to Find Them (ICLR 2024 Spotlight)
@confusezius.bsky.social*, @lukasthede.bsky.social*, @askoepke.bsky.social, Oriol Vinyals, @olivierhenaff.bsky.social, @zeynepakata.bsky.social
[Paper]: arxiv.org/abs/2310.17653
[Code]: github.com/ExplainableM...

Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model

Training deep networks requires various design decisions regarding for instance their architecture, data augmentation, or optimization. In this work, we find these training variations to result in net...

ExplainableML @eml-munich.bsky.social · Aug 4

[Paper]: arxiv.org/abs/2408.14471
[Code]: github.com/ExplainableM...

A Practitioner's Guide to Continual Multimodal Pretraining

Multimodal foundation models serve numerous applications at the intersection of vision and language. Still, despite being pretrained on extensive data, they become outdated over time. To keep models u...

ExplainableML @eml-munich.bsky.social · Aug 4

2/
A Practitioner's Guide to Continual Multimodal Pretraining (NeurIPS 2024)
@confusezius.bsky.social*, @vishaalurao.bsky.social*, @sbdzdz.bsky.social, Ameya Prabhu, Mehdi Cherti, Oriol Vinyals, @olivierhenaff.bsky.social, @samuelalbanie.bsky.social, Matthias Bethge, @zeynepakata.bsky.social

ExplainableML @eml-munich.bsky.social · Aug 4

1/
Context-Aware Multimodal Pretraining (CVPR 2025 Highlight)
@confusezius.bsky.social, @zeynepakata.bsky.social, @dimadamen.bsky.social, @ibalazevic.bsky.social, @olivierhenaff.bsky.social
[Paper]: arxiv.org/abs/2411.15099

Context-Aware Multimodal Pretraining

Large-scale multimodal representation learning successfully optimizes for zero-shot transfer at test time. Yet the standard pretraining paradigm (contrastive learning on large amounts of image-text da...

ExplainableML @eml-munich.bsky.social · Aug 4

During his PhD, Karsten interned at Meta AI and Google DeepMind, working on generalization in representation learning and large-scale multimodal pretraining techniques.

👇 Check out his selected publications at top-tier conferences such as NeurIPS, ICLR, CVPR, and ICCV:

ExplainableML @eml-munich.bsky.social · Aug 4

🦾 (Multimodal) model pretraining
🧠 Model generalization, reuse and transferability research
🎆 Continual (multimodal) training of such models.

ExplainableML @eml-munich.bsky.social · Aug 4

Karsten has been an ELLIS and IMPRS-IS PhD student since May 2021, supervised by both @zeynepakata.bsky.social and Oriol Vinyals. His research has centered on the robust and effective deployment of (large) neural networks in the real world, with a particular focus on:

ExplainableML @eml-munich.bsky.social · Aug 4

🎓PhD Spotlight: Karsten Roth

Celebrate @confusezius.bsky.social, who defended his PhD on June 24th summa cum laude!

🏁 His next stop: Google DeepMind in Zurich!

Join us in celebrating Karsten's achievements and wishing him the best for his future endeavors! 🥳

Reposted by ExplainableML

simonroschmann.bsky.social @simonroschmann.bsky.social · Jul 3

This project was a collaboration between @eml-munich.bsky.social and Huawei Paris Noah’s Ark Lab. Thank you to my collaborators @qbouniot.bsky.social, Vasilii Feofanov, Ievgen Redko, and particularly to my advisor @zeynepakata.bsky.social for guiding me through my first PhD project!
