Martin Ziqiao Ma
@marstin.bsky.social
160 followers 62 following 40 posts
phd(UMich); ex({MIT_IBM_Watson, Adobe, Amazon}); Make the community better @ACLMentorship @GrowAILikeChild Herborium Lover, Fortune Teller, Pokémon Trainer, Szechuan Cuisine Chef. https://mars-tin.github.io
marstin.bsky.social
Regrettably can’t attend #COLM2025 due to deadlines, but Jane and Joyce will be presenting our work. :)

Jane is an exceptional undergraduate researcher and a great collaborator! Go meet her at COLM if you’re curious about her work on mechanistic interpretability, multimodality, & pragmatics!
marstin.bsky.social
Vision-Language Models are not yet pragmatically optimal.

We identify 3 key failures of pragmatic competence in referring expression generation with VLMs: (1) failing to refer uniquely to the referent, (2) including excessive or irrelevant information, and (3) misaligning with human pragmatic preferences.
Reposted by Martin Ziqiao Ma
fredashi.bsky.social
🚀 ACL ARR is looking for a Co-CTO to join me in leading our amazing tech team and driving the future of our workflow. If you’re interested or know someone who might be, let’s connect!

RTs & recommendations appreciated.
aclrollingreview.bsky.social
🚨 ARR is looking for a volunteer Co-CTO to help improve tech infrastructure!
🛠️ Preferred:
• 5+ years in NLP research
• Git, CLI tools, Python, and basic HTML
• 2-year role, overlapping with current Co-CTO
Interested? DM @fredashi.bsky.social or email [email protected]
#ARR #ACL #NLProc
marstin.bsky.social
Unfortunately, I’ll be missing #ACL2025NLP this year — but here are a few things I’m excited about! 👇
marstin.bsky.social
📣 Excited to announce SpaVLE: #NeurIPS2025 Workshop on Space in Vision, Language, and Embodied AI!

Join us in San Diego to push the frontiers of spatial understanding and reasoning across CV, NLP, and robotics!

👉 space-in-vision-language-embodied-ai.github.io
Reposted by Martin Ziqiao Ma
hokin.bsky.social
#CoreCognition #LLM #multimodal #GrowAI We spent 3 years curating 1,503 classic experiments spanning 12 core concepts in human cognitive development, then evaluated 230 MLLMs with 11 different prompts, 5 times each, collecting over 3.8 million inference data points.

A thread (1/n) - #ICML2025
Reposted by Martin Ziqiao Ma
hokin.bsky.social
New Paper Alert ‼️ Current VLMs completely fail human gaze understanding 🙀 and scaling does NOT help ‼️

However, humans, from an extremely early age 🧒, are extremely sensitive to other people's gaze 🙄 👀

No mentors, no labs, only pre-doc students, 111 VLMs, and we did it 😎
Reposted by Martin Ziqiao Ma
jhucompsci.bsky.social
& @tianminshu.bsky.social (+ @marstin.bsky.social, @zhitinghu.bsky.social, ‪@lianhui.bsky.social & more) will present “SimWorld: A World Simulator for Scaling Photorealistic Multi-Agent Interactions,” an @unrealengine.bsky.social-based sim that generates unlimited/diverse urban environments: (13/14)
SimWorld
SimWorld: A World Simulator for Scaling Photorealistic Multi-Agent Interactions
simworld-cvpr2025.maitrix.org
marstin.bsky.social
At Albuquerque Now :)
marstin.bsky.social
See you at #NAACL2025! I will talk about grounded lexicon acquisition and scaling mechanistically grounded vision language models. Happy to chat if you are around :)
fredashi.bsky.social
On my way to NAACL✈️! If you're also there and interested in grounding, don't miss our tutorial on "Learning Language through Grounding"!
Mark your calendar: May 3rd, 14:00-17:30, Ballroom A.

Another exciting collaboration with @marstin.bsky.social @kordjamshidi.bsky.social, Jiayuan, and Joyce!
marstin.bsky.social
We introduce RefOI, a new dataset of 1.5k objects, each with 3 written and 2 spoken human-produced referring expressions. We also release RefOI-TLHF, a large dataset of token-level human feedback for 10.6k referring expressions.

👀https://vlm-reg.github.io/
📄https://arxiv.org/abs/2504.16060
VLMs Are Not Pragmatically Competent in Referring Expression Generation
VLMs fail to refer like humans. Our study reveals widespread pragmatic issues in GPT-4o, LLaVA, and others, showing how their expressions often violate Gricean maxims.
vlm-reg.github.io
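For concreteness, here is a minimal, hypothetical sketch of how one RefOI record and one RefOI-TLHF record could be represented in Python; the field names and encodings are illustrative assumptions, not the released schema.

```python
from dataclasses import dataclass, field

@dataclass
class RefOIRecord:
    """Hypothetical layout of one RefOI entry (field names are illustrative,
    not the dataset's actual schema). Each of the 1.5k objects comes with
    3 written and 2 spoken human-produced referring expressions."""
    image_id: str                     # image containing the target object
    object_id: str                    # the intended referent in that image
    written_expressions: list[str] = field(default_factory=list)  # 3 per object
    spoken_expressions: list[str] = field(default_factory=list)   # 2 per object (transcribed)

@dataclass
class TLHFRecord:
    """Hypothetical layout of one RefOI-TLHF entry: a referring expression
    paired with token-level human feedback (10.6k expressions in total)."""
    expression_tokens: list[str]      # tokenized referring expression
    token_feedback: list[int]         # assumed encoding: 1 = keep, 0 = flagged by annotators
```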
marstin.bsky.social
Vision-Language Models are not yet pragmatically optimal.

We identify 3 key failures of pragmatic competence in referring expression generation with VLMs: (1) failing to refer uniquely to the referent, (2) including excessive or irrelevant information, and (3) misaligning with human pragmatic preferences.
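As a hedged illustration of how failures (1) and (2) can be operationalized, here is a minimal sketch: a listener-based uniqueness check and a token-flag proxy for over-informativeness. The helpers `listener_pick` and `flagged_tokens` are assumed stand-ins, not the paper's actual evaluation protocol.

```python
def refers_uniquely(expression, candidate_objects, target, listener_pick):
    """Sketch of a listener-based check for failure (1).

    An expression counts as pragmatically adequate here only if a listener,
    given the expression and all candidate objects in the scene, resolves it
    to the intended target. `listener_pick(expression, candidates)` is a
    hypothetical helper returning the chosen candidate."""
    return listener_pick(expression, candidate_objects) == target

def is_overinformative(expression_tokens, flagged_tokens):
    """Sketch of a proxy for failure (2): any token annotators mark as
    excessive or irrelevant makes the expression over-informative."""
    return any(tok in flagged_tokens for tok in expression_tokens)
```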
marstin.bsky.social
🔹 ICLR BiAlign Workshop:
We’re hosting the Bidirectional Human-AI Alignment Workshop (BiAlign).
🗓 Apr 28, (Garnet 216–214)

Website: bialign-workshop.github.io

I’ll join remotely — huge thanks to @huashen.bsky.social for leading this!
marstin.bsky.social
🔹 ICLR Oral Paper:
Do Vision-Language Models Represent Space and How?

🗓 Oral: Apr 25, 3:42–3:54 a.m. (Session 4C)
🗓 Poster: Thu, Apr 24, 10 p.m.–12:30 a.m. (Hall 3 + 2B, #212)

Website: spatial-comfort.github.io

Big thanks to @fredashi.bsky.social for presenting on site!
marstin.bsky.social
I won’t be attending #ICLR2025 in person since #NAACL2025 follows right after, but here are a few things I’m excited about (all times in EDT) ⬇️
marstin.bsky.social
🎉 Out of these, 72 papers were accepted, including 5 tiny papers. 10 papers were selected for oral presentations: 2 at CHI and 8 at ICLR. Award winners will be announced during the workshop!
marstin.bsky.social
📬 We received over 100 submissions, each reviewed by 2–4 expert reviewers, with ethical assessments included when appropriate. Our program committee features leading researchers in NLP, RL, HCI, ML, and AI/ML Ethics, carefully selected based on scholarly merit and expertise.
marstin.bsky.social
🙏 Special thanks to Tammy Masterson, Technical Partnerships Lead at the AI Security Institute, who will be joining us as a panelist.
marstin.bsky.social
🙏 We are grateful to our gold sponsors, Prolific and Layer 6 AI of TD Bank Group, for their generous support in funding paper awards and travel grants.
marstin.bsky.social
#ICLR2025 and #CHI2025 are just around the corner!

We warmly invite you to join us at our ICLR Workshop and CHI SIG on Bidirectional Human-AI Alignment (Bi-Align), a space for rigorous and reflective conversations about alignment research.
Reposted by Martin Ziqiao Ma
aclmentorship.bsky.social
📢 Join us for the ACL Mentorship Session
@naaclmeeting.bsky.social #NAACL2025

Mentors:
@amuuueller.bsky.social
@fredashi.bsky.social
• Jiayuan Mao
@marstin.bsky.social
• Oana Ignat
• Weijia Shi
@zhijingjin.bsky.social
marstin.bsky.social
Meet VEGGIE 🥦

VEGGIE is an instructional video generative model trained solely with a diffusion loss, designed for both video concept grounding and instruction-based editing. It handles diverse video editing tasks effectively through pixel-level grounded training in a multi-task learning setup. ⬇️
shoubin.bsky.social
Introducing VEGGIE 🥦—a unified, end-to-end, and versatile instructional video generative model.

VEGGIE supports 8 skills, from object addition/removal/changing, and stylization to concept grounding/reasoning. It exceeds SoTA and shows 0-shot multimodal instructional & in-context video editing.