Noah Snavely
@snavely.bsky.social
1.8K followers 240 following 110 posts
3D vision fanatic http://snavely.io
Reposted by Noah Snavely
andreasgeiger.bsky.social
#TTT3R: 3D Reconstruction as Test-Time Training
TTT3R offers a simple state update rule to enhance length generalization for #CUT3R — No fine-tuning required!
🔗Page: rover-xingyu.github.io/TTT3R
We rebuilt @taylorswift13’s "22" live at the 2013 Billboard Music Awards - in 3D!
Reposted by Noah Snavely
nate-burgdorfer.bsky.social
We present a new approach to inference-time scene optimization, which we name Radiant Triangle Soup (RTS): www.arxiv.org/abs/2505.23642. Also check out really great concurrent work from Held et al. @janheld.bsky.social, Triangle Splatting: arxiv.org/abs/2505.19175
Reposted by Noah Snavely
shiryginosar.bsky.social
🧠How “old” is your model?

Put it to the test with the KiVA Challenge: a new benchmark for abstract visual reasoning, grounded in real developmental data from children and adults.

🏆 Prizes:
🥇$1K to the top model
🥈🥉$500
📅 Deadline: 10/7/25
🔗 kiva-challenge.github.io
@iccv.bsky.social
KiVA Challenge @ ICCV 2025
kiva-challenge.github.io
snavely.bsky.social
(ChatGPT claims that this piece is Twinkle Twinkle Little Star, while Gemini says it is Do-Re-Mi.)
snavely.bsky.social
ChatGPT and Gemini both seem to struggle with sheet music. They both insist that this excerpt is in D major (2 sharps), and resist any attempt to tell them that there are 3 sharps in the key signature. I think this is really cool and interesting!
Reposted by Noah Snavely
shiryginosar.bsky.social
Think LMMs can reason like a 3-year-old?

Think again!

Our Kid-Inspired Visual Analogies benchmark reveals where young children still win: ey242.github.io/kiva.github....

Catch our #ICLR2025 poster today to see where models still fall short!

Thurs. April 24
3-5:30 pm
Halls 3 + 2B #312
Reposted by Noah Snavely
ericzzj.bsky.social
Dynamic Camera Poses and Where to Find Them

Chris Rockwell, @jtung.bsky.social, Tsung-Yi Lin, Ming-Yu Liu, David F. Fouhey, Chen-Hsuan Lin

tl;dr: a large-scale dataset of dynamic Internet videos annotated with camera poses

arxiv.org/abs/2504.17788
Reposted by Noah Snavely
redfairy2002.bsky.social
1/6 🔍➡️ How to transform standard videos into immersive 360° panoramas? We've designed a new AI system for video-to-360° panorama generation!

Our key insight: large-scale data is crucial for robust panoramic synthesis across diverse scenes.
Reposted by Noah Snavely
linyijin.bsky.social
We have released the Stereo4D dataset! Explore the real-world dynamic 3D tracks: github.com/Stereo4d/ste...
snavely.bsky.social
This is really nice work on visual discovery from @boyangdeng.bsky.social!
boyangdeng.bsky.social
Curious about how cities have changed in the past decade? We use MLLMs to analyse 40 million Street View images to answer this. Did you know that "'juice shops' became a thing in NYC" and "miles of overpasses were painted BLUE in SF"? More at → boyangdeng.com/visual-chronicles (vid ↓ w/ 🔊)
Reposted by Noah Snavely
carldoersch.bsky.social
We're very excited to introduce TAPNext: a model that sets a new state of the art for Tracking Any Point in videos, by formulating the task as Next Token Prediction. For more, see: tap-next.github.io
Reposted by Noah Snavely
jonbarron.bsky.social
A thread of thoughts on radiance fields, from my keynote at 3DV:

Radiance fields have had 3 distinct generations. First was NeRF: just posenc and a tiny MLP. This was slow to train but worked really well, and it was unusually compressed: the NeRF was smaller than the images.
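For context, a minimal sketch (assuming NumPy) of that first-generation recipe: a sinusoidal positional encoding ("posenc") feeding a small fully connected network. The function names, layer sizes, and random weights below are illustrative placeholders, not the actual NeRF implementation.

```python
import numpy as np

def positional_encoding(x, num_freqs=10):
    """Map each coordinate to sines and cosines at exponentially spaced
    frequencies, in the style of NeRF's positional encoding ("posenc")."""
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi        # frequencies 2^k * pi
    scaled = x[..., None] * freqs                        # (..., D, num_freqs)
    enc = np.concatenate([np.sin(scaled), np.cos(scaled)], axis=-1)
    return enc.reshape(*x.shape[:-1], -1)                # (..., D * 2 * num_freqs)

def tiny_mlp(features, weights):
    """A small fully connected network: ReLU hidden layers, linear output
    (the real model predicts density and color from such features)."""
    h = features
    for W, b in weights[:-1]:
        h = np.maximum(h @ W + b, 0.0)
    W, b = weights[-1]
    return h @ W + b

# Toy example: encode one 3D point and push it through a random 2-layer MLP.
rng = np.random.default_rng(0)
point = np.array([0.1, -0.4, 0.7])                       # (x, y, z)
feat = positional_encoding(point, num_freqs=10)          # 3 * 2 * 10 = 60 dims
weights = [
    (0.05 * rng.normal(size=(60, 256)), np.zeros(256)),
    (0.05 * rng.normal(size=(256, 4)), np.zeros(4)),     # 4 outputs: density + RGB
]
print(tiny_mlp(feat, weights))
```

Training fits the MLP weights so that volume-rendered colors match the input images, which is the slow part the thread refers to.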
Reposted by Noah Snavely
informor.bsky.social
Fifth Ave jammed #handsoff
Reposted by Noah Snavely
haian-jin.bsky.social
🚀 We’ve just released the code and checkpoints for our #ICLR2025 Oral paper: "LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias".

Check it out below 👇

🔗 Code: github.com/haian-jin/LVSM
📄 Paper: arxiv.org/abs/2410.17242
🌐 Project Page: haian-jin.github.io/projects/LVSM/
snavely.bsky.social
This is really cool work!
anandbhattad.bsky.social
[1/10] Is scene understanding solved?

Models today can label pixels and detect objects with high accuracy. But does that mean they truly understand scenes?

Super excited to share our new paper and a new task in computer vision: Visual Jenga!

📄 arxiv.org/abs/2503.21770
🔗 visualjenga.github.io
Reposted by Noah Snavely
cornelltech.bsky.social
#Backslash at #CornellTech, dedicated to advancing new works of art and technology that escape convention, has announced Mimi Ọnụọha as its first Backslash Fellow: tech.cornell.edu/news/mimi-on...

“This work feels like a marked evolution for me personally,” said Ọnụọha.

@snavely.bsky.social
snavely.bsky.social
Very nice! Is this a thing that happens each night at the hotel?
snavely.bsky.social
This is really bad!
Reposted by Noah Snavely
akanazawa.bsky.social
Exciting news! MegaSAM code is out🔥 & the updated Shape of Motion results with MegaSAM are really impressive! A year ago I didn't think we could make any progress on these videos: shape-of-motion.github.io/results.html
Huge congrats to everyone involved and the community 🎉
snavely.bsky.social
Very interesting! The guy who loves singing through a megaphone comes to mind, but I think he came later.
snavely.bsky.social
The Dispossessed is an interesting choice! I didn't know it had a big influence.
snavely.bsky.social
Very interesting -- thank you!
snavely.bsky.social
I think Qianqian et al.'s work is really cool! The problem of modeling state within a 3D reasoning system is quite interesting.

(And I believe it's pronounced "cuter".)
qianqianwang.bsky.social
Late to post, but excited to introduce CUT3R!

An online 3D reasoning framework for many 3D tasks directly from just RGB. For static or dynamic scenes. Video or image collections, all in one!

Project Page: cut3r.github.io
Code and Model: github.com/CUT3R/CUT3R