Lightnews — Scholar-powered news

Reposted by Antonia Wüst

Martin Trapp @trappmartin.bsky.social · 19d

Unfortunately, our #NeurIPS submission was not accepted, with review scores of (5,4,4,3). But because I think it’s an excellent paper, I decided to share it anyway.

We show how to efficiently apply Bayesian learning in VLMs, improve calibration, and do active learning. Cool stuff!

📝 arxiv.org/abs/2412.06014

Post-hoc Probabilistic Vision-Language Models

Vision-language models (VLMs), such as CLIP and SigLIP, have found remarkable success in classification, retrieval, and generative tasks. For this, VLMs deterministically map images and text descripti...

arxiv.org

Antonia Wüst @toniwuest.bsky.social · Aug 20

And last but not least: the spirals are still spinning, each in their own direction 🌀

Antonia Wüst @toniwuest.bsky.social · Aug 20

💻 We also added a demo of the evaluation to our GitHub repo! Check it out here: github.com/ml-research/...

bongard-in-wonderland/demo.ipynb at main · ml-research/bongard-in-wonderland

github.com

Antonia Wüst @toniwuest.bsky.social · Aug 20

📊 Updated results are also on our webpage!
Link: ml-research.github.io/bongard-in-w...
Curious to hear - should we evaluate other models too? 🤖

Bongard in Wonderland

ml-research.github.io

Antonia Wüst @toniwuest.bsky.social · Aug 20

🔎 Importantly, Task 2 continues to expose an inconsistency: the model solved 64 problems in Task 1, but when given the ground-truth options in Task 2, it correctly classified the individual images of only 34 problems.

Antonia Wüst @toniwuest.bsky.social · Aug 20

🤔 Surprisingly, even some easy problems like BP8 remain unsolved…

Antonia Wüst @toniwuest.bsky.social · Aug 20

Can the new GPT-5 model finally solve Bongard Problems? 👉 Not quite yet!
Using our ICML Bongard in Wonderland setup, it solved 64/100 problems - the best score so far! 📈
However, some issues still persist ⬇️

Reposted by Antonia Wüst

wolfstammer.bsky.social @wolfstammer.bsky.social · Jul 7

Can concept-based models handle complex, object-rich images? We think so! Meet Object-Centric Concept Bottlenecks (OCB), which add object-awareness to interpretable AI. Led by David Steinmann w/ @toniwuest.bsky.social & @kerstingaiml.bsky.social.
📄 arxiv.org/abs/2505.244...
#AI #XAI #NeSy #CBM #ML

Antonia Wüst @toniwuest.bsky.social · Jul 12

I'll be at #ICML2025 next week presenting our recent work on VLMs and Bongard Problems! Feel free to reach out, happy to have a chat ☺️

Antonia Wüst @toniwuest.bsky.social · May 2

Joint work with my amazing co-authors @philosotim.bsky.social, Lukas Helff, @ingaibs.bsky.social, @wolfstammer.bsky.social, @devendradhami.bsky.social, @c-rothkopf.bsky.social, and @kerstingaiml.bsky.social! ✨

Antonia Wüst @toniwuest.bsky.social · May 2

We also identified 10 particularly challenging Bongard Problems that none of the models could solve under any setting. The challenge remains wide open!
3 examples of the challenging BPs:

Antonia Wüst @toniwuest.bsky.social · May 2

Interestingly, success in solving the BPs (Open Question) doesn't translate to correctly categorizing individual images 👉 the sets of BPs solved in each task are not the same!
This suggests that getting the right final answer doesn’t always mean genuine understanding 🤔

Antonia Wüst @toniwuest.bsky.social · May 2

Our evaluation shows the top-performing model (o1) solved 43 out of 100 problems, with the others trailing far behind. There’s still a long way to go for current AI models!

Antonia Wüst @toniwuest.bsky.social · May 2

Excited to share that our paper got accepted at #ICML2025!! 🎉

We challenge Vision-Language Models like OpenAI’s o1 with Bongard problems, classic visual reasoning challenges, and uncover surprising shortcomings.

Check out the paper: arxiv.org/abs/2410.19546
& read more below 👇
