Come chat! 🎤 I'll be presenting this work at #CogSci2025:
📍 Poster: P1-B-8
🗓️ Session: Poster Session 1
🧠 Title: “Spot the Ball: Evaluating Visual Causal Inference in VLMs under Occlusion”
We also built:
✅ An inpainting-based image-generation pipeline (sketch below)
✅ A public demo where you can test your own visual inference skills
✅ A dataset of 3,000+ labeled soccer images for future work
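The thread doesn't spell out the pipeline, so here is a minimal sketch of what an inpainting-based ball-removal step could look like, assuming Hugging Face diffusers; the checkpoint, mask convention, and prompt are illustrative assumptions, not the paper's actual setup:

```python
# Minimal sketch of an inpainting step that removes the ball from a frame.
# Assumptions (not from the paper): diffusers' StableDiffusionInpaintPipeline,
# a binary mask that is white over the ball, and a generic background prompt.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

frame = Image.open("frame.png").convert("RGB").resize((512, 512))
mask = Image.open("ball_mask.png").convert("L").resize((512, 512))  # white = region to fill

result = pipe(
    prompt="empty grass pitch, no ball",  # hypothetical prompt
    image=frame,
    mask_image=mask,
).images[0]
result.save("frame_ball_removed.png")
```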
Results: Humans outperform all models, even with chain-of-thought scaffolding. GPT-4o gets closer when given explicit pose and gaze cues, but still falls short in many cases.
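For a sense of what chain-of-thought scaffolding with pose/gaze cues might look like in practice, here is a hedged sketch against the OpenAI chat completions API; only the model name (gpt-4o) comes from the thread, and the prompt wording and grid labels are hypothetical:

```python
# Hedged sketch of a chain-of-thought query with pose/gaze cues; the actual
# prompts used in the paper are not in this thread and are assumed here.
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

with open("masked_frame.png", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": (
                "The ball has been removed from this image. "
                "Step 1: describe each player's body pose. "
                "Step 2: describe where the players are looking. "
                "Step 3: using those cues, name the grid cell (rows A-F, "
                "columns 1-10) that most likely contains the hidden ball."
            )},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```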
Each image is mapped onto a 6×10 grid, turning localization into a 60-class classification problem. We benchmark humans and models (GPT-4o, Gemini, LLaMA, Qwen) on soccer, basketball, and volleyball.
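As a concrete illustration, the pixel-to-grid-cell mapping could be as simple as the sketch below; the frame size and row-major indexing are assumptions, not the paper's stated convention:

```python
# Minimal sketch: map a ground-truth ball location (in pixels) to one of
# 60 classes on a 6x10 grid. Row-major indexing is an assumption.
GRID_ROWS, GRID_COLS = 6, 10

def ball_to_class(x: float, y: float, width: int, height: int) -> int:
    """Return the grid-cell class index (0..59) for pixel coordinates (x, y)."""
    col = min(int(x / width * GRID_COLS), GRID_COLS - 1)
    row = min(int(y / height * GRID_ROWS), GRID_ROWS - 1)
    return row * GRID_COLS + col

# Example: a ball at (960, 300) in a 1920x1080 frame.
print(ball_to_class(960, 300, 1920, 1080))  # row 1, col 5 -> class 15
```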
In high-stakes, real-world scenes, humans infer what's missing, a skill crucial in driving, robotics, and sports. We isolate this ability in a simple but rich task: spot the masked ball from a single frame.
The Spot the Ball game has been around for decades.
🗓️ It began in the UK in the 1970s as a popular newspaper contest.
👥 At its peak, over 3 million people played weekly.
Players had to guess where the ball had been removed from a photo, just as our benchmark asks today.
🧠⚽ Spot the ball! A new benchmark for visual scene understanding.
We ask: can people and models locate a hidden ball in sports images using only visual context and reasoning?
🕹️ Try the task: v0-new-project-9b5vt6k9ugb.vercel.app #CogSci2025