Jiaang Li
@jiaangli.bsky.social
1.1K followers 85 following 17 posts
PhD student at University of Copenhagen @belongielab.org | #nlp #computervision | ELLIS student @ellis.eu 🌐 https://jiaangli.github.io/
jiaangli.bsky.social
Feel free to reach out and chat with Xinyi on July 18th in Vancouver at #ICML2025!
xinyichen2024.bsky.social
Excited to present at the #ICML2025 World Models Workshop!
📅 July 18, 15:45–17:00
🧠 What if Othello-Playing Language Models Could See?
We show that visual grounding improves both move prediction and the model's internal structure. ♟️
Reposted by Jiaang Li
serge.belongie.com
Would you present your next NeurIPS paper in Europe instead of traveling to San Diego (US) if this were an option? Søren Hauberg (DTU) and I would love to hear your answer through this poll: (1/6)
NeurIPS participation in Europe
We seek to understand if there is interest in being able to attend NeurIPS in Europe, i.e. without travelling to San Diego, US. In the following, assume that it is possible to present accepted papers ...
docs.google.com
Reposted by Jiaang Li
sloeschcke.bsky.social
Check out our new preprint 𝐓𝐞𝐧𝐬𝐨𝐫𝐆𝐑𝐚𝐃.
We use a robust decomposition of the gradient tensors into low-rank + sparse parts to reduce optimizer memory for Neural Operators by up to 𝟕𝟓%, while matching the performance of Adam, even on turbulent Navier–Stokes (Re 10e5).
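For readers curious how a low-rank + sparse split of a gradient tensor can shrink optimizer state, here is a minimal PyTorch sketch. It is not the TensorGRAD implementation; the rank `r` and sparsity fraction `k` are illustrative choices.

```python
# Minimal sketch (not the authors' code): split a 2-D gradient slice into a
# rank-r component plus a sparse residual, the "low-rank + sparse" idea.
import torch

def lowrank_plus_sparse(grad: torch.Tensor, r: int = 4, k: float = 0.01):
    """Return (U, S, Vh, sparse) approximating `grad`."""
    U, S, Vh = torch.linalg.svd(grad, full_matrices=False)
    U, S, Vh = U[:, :r], S[:r], Vh[:r, :]           # rank-r component
    residual = grad - (U * S) @ Vh                  # what the low-rank part misses
    n_keep = max(1, int(k * residual.numel()))      # keep only the largest entries
    thresh = residual.abs().flatten().topk(n_keep).values.min()
    sparse = (residual.abs() >= thresh) * residual  # sparse correction term
    return U, S, Vh, sparse.to_sparse()

# The optimizer state would then track the factors and the sparse tensor,
# which is much smaller than the dense statistics Adam keeps per weight matrix.
```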
Reposted by Jiaang Li
aicentre.dk
PhD student Jiaang Li and his collaborators share insights into the cultural understanding of vision-language models 👇
jiaangli.bsky.social
🚀New Preprint🚀
Can Multimodal Retrieval Enhance Cultural Awareness in Vision-Language Models?

Excited to introduce RAVENEA, a new benchmark aimed at evaluating cultural understanding in VLMs through RAG.
arxiv.org/abs/2505.14462

More details:👇
Reposted by Jiaang Li
srishtiy.bsky.social
I am excited to announce our latest work 🎉 "Cultural Evaluations of Vision-Language Models Have a Lot to Learn from Cultural Theory". We review recent works on culture in VLMs and argue for deeper grounding in cultural theory to enable more inclusive evaluations.

Paper 🔗: arxiv.org/pdf/2505.22793
Paper title: "Cultural Evaluations of Vision-Language Models Have a Lot to Learn from Cultural Theory"
jiaangli.bsky.social
Great collaboration with @yfyuan01.bsky.social @wenyan62.bsky.social @aliannejadi.bsky.social @danielhers.bsky.social , Anders Søgaard, Ivan Vulić, Wenxuan Zhang, Paul Liang, Yang Deng, @serge.belongie.com
jiaangli.bsky.social
📊Our experiments demonstrate that even lightweight VLMs, when augmented with culturally relevant retrievals, outperform their non-augmented counterparts and even surpass the next larger model tier, achieving at least a 3.2% improvement in cVQA and 6.2% in cIC.
jiaangli.bsky.social
🛠Culture-Aware Contrastive Learning

We propose Culture-aware Contrastive (CAC) Learning, a supervised learning framework compatible with both CLIP and SigLIP architectures. Fine-tuning with CAC can help models better capture culturally significant content.
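As a rough illustration (not the paper's exact objective), culture-aware contrastive fine-tuning can be sketched as a CLIP-style symmetric InfoNCE loss in which each image is paired with its culturally relevant text; the encoder outputs and names below are assumptions.

```python
# Minimal sketch: contrastive loss over (image, culture-relevant document) pairs.
# `image_emb` / `text_emb` are assumed to come from a CLIP- or SigLIP-style encoder.
import torch
import torch.nn.functional as F

def culture_contrastive_loss(image_emb, text_emb, temperature: float = 0.07):
    """Symmetric InfoNCE: matched image/document pairs are the positives."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2
```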
jiaangli.bsky.social
📚 Dataset Construction
RAVENEA integrates 1,800+ images, 2,000+ culture-related questions, 500+ human captions, and 10,000+ human-ranked Wikipedia documents to support two key tasks:

🎯Culture-focused Visual Question Answering (cVQA)
📝Culture-informed Image Captioning (cIC)
jiaangli.bsky.social
🚀New Preprint🚀
Can Multimodal Retrieval Enhance Cultural Awareness in Vision-Language Models?

Excited to introduce RAVENEA, a new benchmark aimed at evaluating cultural understanding in VLMs through RAG.
arxiv.org/abs/2505.14462

More details:👇
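For readers unfamiliar with the setup, retrieval augmentation here roughly means fetching the top-k culture-relevant Wikipedia passages for an image/question and prepending them to the VLM prompt. The sketch below is only illustrative; `query_emb`, `doc_embeddings`, and the prompt format are placeholders, not the benchmark's actual pipeline.

```python
# Illustrative retrieval-augmented VQA step (not the RAVENEA code).
import numpy as np

def retrieve_context(query_emb: np.ndarray, doc_embeddings: np.ndarray,
                     documents: list[str], k: int = 3) -> list[str]:
    """Return the k documents whose embeddings score highest against the query."""
    scores = doc_embeddings @ query_emb        # dot-product scores (cosine if rows normalized)
    top = np.argsort(-scores)[:k]
    return [documents[i] for i in top]

def build_prompt(question: str, context: list[str]) -> str:
    """Prepend retrieved passages to the question before querying the VLM."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{ctx}\n\nQuestion: {question}\nAnswer:"
```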
jiaangli.bsky.social
Super cool! Incidentally, in our previous project, we also found that linear alignment between embedding spaces from two modalities is viable — and the alignment improves as LLMs scale.
bsky.app/profile/jiaa...
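A minimal sketch of what such a linear alignment can look like, assuming row-paired concept embeddings of equal dimensionality from a language model (X) and a vision model (Y); this is an illustration, not the project's code.

```python
# Fit an orthogonal map between LM and VM embedding spaces (Procrustes) and
# score it by nearest-neighbour retrieval of the paired concepts.
import numpy as np

def procrustes_align(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Orthogonal W minimizing ||X @ W - Y||_F for row-paired X (LM) and Y (VM)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def retrieval_accuracy(X: np.ndarray, Y: np.ndarray, W: np.ndarray) -> float:
    """Fraction of mapped LM embeddings whose nearest vision embedding is their pair."""
    sims = (X @ W) @ Y.T                       # similarity scores (rows assumed normalized)
    return float((sims.argmax(axis=1) == np.arange(len(X))).mean())
```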
jiaangli.bsky.social
🤔Do Vision and Language Models Share Concepts? 🚀
We present an empirical evaluation and find that language models partially converge towards representations isomorphic to those of vision models. #EMNLP

📃 direct.mit.edu/tacl/article...
Reposted by Jiaang Li
yfyuan01.bsky.social
I won’t be attending #ICLR in person this year 😢. But feel free to check out our paper ‘Revisiting the Othello World Model Hypothesis’ with Anders Søgaard, accepted at the ICLR World Models Workshop!
Paper link arxiv.org/abs/2503.04421
Revisiting the Othello World Model Hypothesis
Li et al. (2023) used the Othello board game as a test case for the ability of GPT-2 to induce world models, and were followed up by Nanda et al. (2023b). We briefly discuss the original experiments, ...
arxiv.org
Reposted by Jiaang Li
zhaochongan.bsky.social
Thrilled to announce that "Multimodality Helps Few-shot 3D Point Cloud Semantic Segmentation" has been accepted as a Spotlight (top 5%) at #ICLR2025!

Our model MM-FSS leverages 3D, 2D, & text modalities for robust few-shot 3D segmentation—all without extra labeling cost. 🤩

arxiv.org/pdf/2410.22489

More details👇
Reposted by Jiaang Li
chengzu-li.bsky.social
Forget just thinking in words.

🔔Our New Preprint:
🚀 New Era of Multimodal Reasoning🚨
🔍 Imagine While Reasoning in Space with MVoT

Multimodal Visualization-of-Thought (MVoT) revolutionizes reasoning by generating visual "thoughts" that transform how AI thinks, reasons, and explains itself.
Reposted by Jiaang Li
nicolang.bsky.social
FGVC12 Workshop is coming to #CVPR 2025 in Nashville!

Are you working on fine-grained visual problems?
This year we have two peer-reviewed paper tracks:
i) 8-page CVPR Workshop proceedings
ii) 4-page non-archival extended abstracts
CALL FOR PAPERS: sites.google.com/view/fgvc12/...
Reposted by Jiaang Li
serge.belongie.com
Here’s a short film produced by the Danish Royal Academy of Sciences, showcasing the WineSensed 🍷 project of Þóranna Bender et al. thoranna.github.io/learning_to_...
VidenSkaber | Min AI forstår mig ikke ("My AI doesn't understand me") - Professor Serge Belongie
YouTube video by Videnskabernes Selskab
youtu.be
Reposted by Jiaang Li
belongielab.org
From San Diego to New York to Copenhagen, wishing you Happy Holidays!🎄
Reposted by Jiaang Li
belongielab.org
With @neuripsconf.bsky.social right around the corner, we’re excited to be presenting our work soon! Here’s an overview

(1/5)
Reposted by Jiaang Li
belongielab.org
Here’s a starter pack with members of our lab who have joined Bluesky
Belongie Lab
Join the conversation
go.bsky.app
jiaangli.bsky.social
🚀Takeaway:

1. The representation spaces of LMs and VMs grow more similar, if only partially, with model size.
2. Concepts with lower frequency, polysemy, and dispersion can be easier to align.
3. Shared concepts between LMs and VMs might extend beyond nouns.

🧵(7/8)
#NLP #NLProc
jiaangli.bsky.social
🌱We then discuss the implications of our findings:
- the LM understanding debate
- the study of emergent properties
- philosophy

🧵(6/8)