Phillip Isola
@phillipisola.bsky.social
5.4K followers · 89 following · 63 posts
Associate Professor in EECS at MIT. Neural nets, generative models, representation learning, computer vision, robotics, cog sci, AI. https://web.mit.edu/phillipi/
phillipisola.bsky.social
This work is with an amazing team including @sophielwang.bsky.social, @thisismyhat.bsky.social, Sharut Gupta, @shobsund.bsky.social, Chenyu Wang, and Stefanie Jegelka.

9/9
phillipisola.bsky.social
More broadly, I think confusion has been created by forming hard distinctions between different modalities, especially between text and sensory data. These distinctions can obscure commonalities. We take the rhetorical stance of erasing the distinctions, and seeing where this leads.

8/9
phillipisola.bsky.social
This work was partially inspired by Ilya Sutskever's talk here: www.youtube.com/watch?v=AKMu...

If you concatenate datasets, the model “should” figure out all the synergies and cross-modal relationships, then exploit them to make better inferences. We now have some evidence this can happen.

7/9
[Link card: “An Observation on Generalization”, YouTube video by the Simons Institute for the Theory of Computing, www.youtube.com]
phillipisola.bsky.social
Suppose you have separate datasets X, Y, Z, without known correspondences.

We do the simplest thing: just train a model (e.g., a next-token predictor) on all elements of the concatenated dataset [X,Y,Z].

You end up with a better model of dataset X than if you had trained on X alone!

6/9
[Image: architecture for the unpaired multimodal learner]
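A minimal sketch of the recipe this post describes: train a single next-token predictor on the plain concatenation of unpaired datasets. Everything concrete here (the toy token data, the tiny GRU model, the hyperparameters) is an illustrative assumption, not the paper's actual setup.

```python
# Sketch: one next-token predictor trained on the concatenation of
# unpaired datasets X, Y, Z, with no correspondence information.
import torch
import torch.nn as nn

torch.manual_seed(0)
VOCAB, SEQ_LEN = 256, 32

def toy_dataset(n_seqs, offset):
    # Stand-in for one modality's token sequences (e.g., tokenized
    # text, image patches, or audio frames mapped into a shared vocab).
    return torch.randint(offset, offset + 64, (n_seqs, SEQ_LEN))

X = toy_dataset(500, 0)    # "text"
Y = toy_dataset(500, 64)   # "images"
Z = toy_dataset(500, 128)  # "audio"

# The whole trick: just concatenate the datasets and shuffle.
data = torch.cat([X, Y, Z], dim=0)
data = data[torch.randperm(len(data))]

class NextTokenModel(nn.Module):
    def __init__(self, vocab=VOCAB, d=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        self.rnn = nn.GRU(d, d, batch_first=True)  # tiny causal stand-in for a transformer
        self.head = nn.Linear(d, vocab)

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)

model = NextTokenModel()
opt = torch.optim.Adam(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

for step in range(200):
    batch = data[torch.randint(len(data), (32,))]
    logits = model(batch[:, :-1])  # predict token t+1 from tokens <= t
    loss = loss_fn(logits.reshape(-1, VOCAB), batch[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()

# The claimed effect would show up as lower held-out loss on X for this
# jointly trained model than for an otherwise identical X-only baseline.
```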
phillipisola.bsky.social
In “Better Together: Leveraging Unpaired Multimodal Data for Stronger Unimodal Models,” we study a question I’ve wanted to make progress on for years: can you learn useful multimodal representations from *unpaired* data?

5/9
[Image: diagram showing paired vs. unpaired data]
phillipisola.bsky.social
In short: you can “just ask” an LLM to act (a bit) like an image model or an audio model.

This tells us that LLMs know more about the sensory world than we might suspect; you just have to find ways to elicit the knowledge.

4/9
phillipisola.bsky.social
In “Words That Make Language Models Perceive,” we find that if you ask an LLM to “imagine seeing,” its text representations become more like how a vision system would represent that same scene.

If you ask it to “imagine hearing,” its representation becomes more like that of an auditory model.

3/9
[Image: diagram showing how prompts can steer an LLM toward kernel structure that better matches that of sensory encoders]
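One plausible way to quantify the effect this post describes: compare the kernel (pairwise-similarity) structure of the LLM's text embeddings, with and without the "imagine seeing" prefix, against a vision encoder's embeddings of matching images. Linear CKA as the alignment metric and the random placeholder embeddings below are my assumptions for illustration; the paper's actual metric and models may differ.

```python
# Sketch: measure whether a prompt prefix shifts an LLM's embedding
# geometry toward that of a vision encoder, using linear CKA.
import numpy as np

def linear_cka(A, B):
    """Linear centered kernel alignment between embedding matrices of
    shape (n_items, dim); higher means more similar geometry."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    num = np.linalg.norm(A.T @ B, "fro") ** 2
    den = np.linalg.norm(A.T @ A, "fro") * np.linalg.norm(B.T @ B, "fro")
    return num / den

# Placeholders: in practice these would be hidden states from an LLM for
# N scene descriptions (plain vs. prefixed with "Imagine seeing ...") and
# a vision encoder's embeddings of the N corresponding images.
rng = np.random.default_rng(0)
n, d_llm, d_vis = 100, 768, 512
vision_emb = rng.standard_normal((n, d_vis))
plain_emb = rng.standard_normal((n, d_llm))
# Fake "prompted" embeddings that share structure with the vision ones,
# standing in for the shift the prompt is claimed to induce.
prompted_emb = (vision_emb @ rng.standard_normal((d_vis, d_llm))
                + 0.5 * rng.standard_normal((n, d_llm)))

print("plain    vs vision CKA:", linear_cka(plain_emb, vision_emb))
print("prompted vs vision CKA:", linear_cka(prompted_emb, vision_emb))
# The prompted score coming out higher is the pattern the post describes.
```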
phillipisola.bsky.social
For context, this work stems from the idea that all data modalities (images, sounds, text, etc.) are views of the same underlying world, and that treating them as such is useful.

We are interested in identifying commonalities between different models and modalities, and providing unifications.

2/9
[Image: Platonic representation diagram]
phillipisola.bsky.social
Over the past year, my lab has been working on fleshing out theory + applications of the Platonic Representation Hypothesis.

Today I want to share two new works on this topic:

Eliciting higher alignment: arxiv.org/abs/2510.02425
Unpaired learning of unified reps: arxiv.org/abs/2510.08492

1/9
phillipisola.bsky.social
Oh I think you are right about the review process at least. Sometimes it rewards the inverse of my metric: a fancy new technique that doesn't actually achieve any new result / understanding :)
phillipisola.bsky.social
I think papers like that are great! One of my personal metrics for paper quality is: delta in capability / delta in technique. A paper that only changes one parameter and achieves much better results should get a best paper award by this metric :)
phillipisola.bsky.social
Interesting reaction from ChatGPT to the HHS mRNA memo. It finds it so implausible that it thinks it's fake. From the perspective of a ~2024(?) trained model, 2025 policies are so absurd as to be unbelievable...

chatgpt.com/share/689364...
phillipisola.bsky.social
Unless it turns out that capable intelligence is actually not so simple!
phillipisola.bsky.social
Yeah, it helps me to consider that much of the history of science has been about finding a simpler-than-expected explanation of something that previously seemed magical: life (evolution), motion of the planets (law of gravitation), etc. Now those are among our most celebrated discoveries.
phillipisola.bsky.social
Of course, personally, I think we need not shy away from this possibility. Maybe intelligence is simpler than we thought, and there's a beauty in that too.
phillipisola.bsky.social
I think part of it is that people might be overestimating the complexity of intelligence, and it's hard not to.

How weird it would be if an LLM (a Markov chain!) could explain "thinking".

It feels like it makes us less special, like Copernicus placing the sun at the center, rather than the Earth.
phillipisola.bsky.social
I enjoy your posts! I hope you keep at it.
phillipisola.bsky.social
One reason is that GT (ground truth) may be finite or, yes, wrong. A regression model fit to GT can potentially generalize beyond the GT and correct errors.

I like to think of this as: the data is a bad model of the world.
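A toy numerical illustration of that point: fit a least-squares line to noisy ground-truth labels, and the fitted model's predictions land closer to the true function than the labels it was trained on. The data-generating setup here is invented for the example.

```python
# Sketch: a model fit to noisy GT labels can be more accurate about the
# underlying "world" than the GT labels themselves.
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-3, 3, n)
true_y = 2.0 * x + 1.0                      # the actual "world"
gt_labels = true_y + rng.normal(0, 1.0, n)  # finite, noisy ground truth

# Least-squares linear fit to the noisy labels.
X = np.stack([x, np.ones(n)], axis=1)
w, *_ = np.linalg.lstsq(X, gt_labels, rcond=None)
pred = X @ w

print("label error vs world:", np.mean((gt_labels - true_y) ** 2))
print("model error vs world:", np.mean((pred - true_y) ** 2))
# The fit averages out label noise, so its error against the true
# function is lower: it generalizes beyond, and corrects, its own GT.
```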
phillipisola.bsky.social
To me it’s more that there exists some sci fi that predicted pretty well each of the things we are seeing, although no sci fi got them all right. But that still just seems incredible given that what happened is an infinitesimal point in the space of all possibilities…
phillipisola.bsky.social
Yeah and relatedly it’s odd to me when people dismiss future predictions as “pure science fiction” as if that means they are wrong or unlikely. The overwhelming feeling I have when reading old sci fi is how accurately it often predicted what came to pass.
phillipisola.bsky.social
I agree, maybe we need to qualify novelty wrt the audience: like novelty(x | me), novelty(x | biologists), novelty(x | 6th graders), novelty(x | world’s top expert), etc. All are useful. Some are more like “research” others are more like “teaching”, all are quite related.
phillipisola.bsky.social
Ah weird, yeah you are probably right for CS but for the natural sciences I think “novelty” often means “novel finding” rather than “novel method”, as in we discovered something new about the world. I like that definition more! Agree that most papers need not have any new algorithm to be worthwhile.
phillipisola.bsky.social
I’m more in the “pro novelty” camp but I think maybe it’s because I see novelty differently. I think, for example, that showing that known method X solves open problem Y is hugely novel. For me novelty is basically: did I learn something new and important from this work.
Reposted by Phillip Isola
cvprconference.bsky.social
#CVPR2025 provided coaching for all orals. Do you think the talks were improved compared to last year?

* Better than last year
* About the same
* Worse than last year

Share your thoughts in the thread!