Greta Tuckute
@gretatuckute.bsky.social
870 followers 260 following 76 posts
Studying language in biological brains and artificial ones at the Kempner Institute at Harvard University. www.tuckute.com
gretatuckute.bsky.social
Check out @mryskina.bsky.social's talk and poster at COLM on Tuesday—we present a method to identify 'semantically consistent' brain regions (responding to concepts across modalities) and show that more semantically consistent brain regions are better predicted by LLMs.
mryskina.bsky.social
Interested in language models, brains, and concepts? Check out our COLM 2025 🔦 Spotlight paper!

(And if you’re at COLM, come hear about it on Tuesday – sessions Spotlight 2 & Poster 2)!
Paper title: Language models align with brain regions that represent concepts across modalities.
Authors: Maria Ryskina, Greta Tuckute, Alexander Fung, Ashley Malkin, Evelina Fedorenko.
Affiliations: Maria is affiliated with the Vector Institute for AI, but the work was done at MIT. All other authors are affiliated with MIT. 
Email address: maria.ryskina@vectorinstitute.ai.
Reposted by Greta Tuckute
samnastase.bsky.social
I'm recruiting PhD students to join my new lab in Fall 2026! The Shared Minds Lab at @usc.edu will combine deep learning and ecological human neuroscience to better understand how we communicate our thoughts from one brain to another.
Reposted by Greta Tuckute
kmahowald.bsky.social
Do you want to use AI models to understand human language?

Are you fascinated by whether linguistic representations are lurking in LLMs?

Are you in need of a richer model of spatial words across languages?

Consider UT Austin for all your Computational Linguistics Ph.D. needs!

mahowak.github.io
Reposted by Greta Tuckute
cmuscience.bsky.social
Elizabeth Lee, a first-year Ph.D. student in Neural Computation, has been awarded CMU’s 2025 Sutherland-Merlino Fellowship. Her work bridges neuroscience and machine learning, and she’s passionate about advancing STEM access for underrepresented groups.
www.cmu.edu/mcs/news-eve...
Elizabeth Lee smiles at the camera.
Reposted by Greta Tuckute
neurotaha.bsky.social
🚨 Paper alert:
To appear at the DBM NeurIPS Workshop

LITcoder: A General-Purpose Library for Building and Comparing Encoding Models

📄 arxiv: arxiv.org/abs/2509.091...
🔗 project: litcoder-brain.github.io
Reposted by Greta Tuckute
bkhmsi.bsky.social
Now that the ICLR deadline is behind us, happy to share that From Language to Cognition has been accepted as an Oral at #EMNLP2025! 🎉

Looking forward to seeing many of you in Suzhou 🇨🇳
bkhmsi.bsky.social
🚨 New Preprint!!

LLMs trained on next-word prediction (NWP) show high alignment with brain recordings. But what drives this alignment—linguistic structure or world knowledge? And how does this alignment evolve during training? Our new paper explores these questions. 👇🧵
Reposted by Greta Tuckute
hsmall.bsky.social
Excited to share new work with @hleemasson.bsky.social, Ericka Wodka, Stewart Mostofsky and @lisik.bsky.social! We investigated how simultaneous vision and language signals are combined in the brain using naturalistic+controlled fMRI. Read the paper here: osf.io/b5p4n
1/n
Reposted by Greta Tuckute
isabelpapad.bsky.social
Are there conceptual directions in VLMs that transcend modality? Check out our COLM oral spotlight 🔦 paper! We use SAEs to analyze the multimodality of linear concepts in VLMs

with @chloesu07.bsky.social, @thomasfel.bsky.social, @shamkakade.bsky.social and Stephanie Gil
arxiv.org/abs/2504.11695
Reposted by Greta Tuckute
dyamins.bsky.social
Here is our best thinking about how to make world models. I would apologize for it being a massive 40-page behemoth, but it's worth reading. arxiv.org/pdf/2509.09737
Reposted by Greta Tuckute
nsaphra.bsky.social
I thought I wouldn’t be one of those academics super into outreach talks, but I just put together something about understanding LLMs for laypeople and I get to talk about results that I don’t really focus on in any of my technical talks! It’s actually really cool. I made this lil takeaway slide
Reposted by Greta Tuckute
mdhk.net
✨ Do self-supervised speech models learn to encode language-specific linguistic features from their training data, or only more language-general acoustic correlates?

At #Interspeech2025 we presented our new Wav2Vec2-NL model and SSL-NL evaluation dataset to test this!

📄 arxiv.org/abs/2506.00981

⬇️
Interspeech paper title: What do self-supervised speech models know about Dutch? Analyzing advantages of language-specific pre-training

Authors: Marianne de Heer Kloots, Hosein Mohebbi, Charlotte Pouw, Gaofei Shen, Willem Zuidema, Martijn Bentum
Reposted by Greta Tuckute
euripsconf.bsky.social
So, what is #EurIPS anyway? 🤔

EurIPS is a community-driven conference taking place in Copenhagen, Denmark, endorsed by @neuripsconf.bsky.social and @nordicair.bsky.social, and co-developed with @ellis.eu, where you can additionally present your NeurIPS papers.
Reposted by Greta Tuckute
mdhk.net
Had such a great time presenting our tutorial on Interpretability Techniques for Speech Models at #Interspeech2025! 🔍

For anyone looking for an introduction to the topic, we've now uploaded all materials to the website: interpretingdl.github.io/speech-inter...
Reposted by Greta Tuckute
david-g-clark.bsky.social
Wanted to share a new version (much cleaner!) of a preprint on how connectivity structure shapes collective dynamics in nonlinear RNNs. Neural circuits have highly non-iid connectivity (e.g., rapidly decaying singular values, structured singular-vector overlaps), unlike classical random RNN models.
Connectivity structure and dynamics of nonlinear recurrent neural networks
Studies of the dynamics of nonlinear recurrent neural networks often assume independent and identically distributed couplings, but large-scale connectomics data indicate that biological neural circuit...
arxiv.org
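For a concrete picture of what "non-iid connectivity with rapidly decaying singular values" means, here is a toy construction (an illustration only, not the preprint's model): an i.i.d. Gaussian coupling matrix has a broad, flat bulk of singular values, whereas a matrix built with a power-law spectrum concentrates its structure in a few modes.

```python
# Toy illustration (not from the preprint): contrast i.i.d. Gaussian couplings
# with a structured coupling matrix whose singular values decay rapidly.
import numpy as np

rng = np.random.default_rng(0)
n = 500
g = 1.5  # coupling strength

# Classical random-RNN assumption: i.i.d. Gaussian entries with variance g^2 / n.
J_iid = rng.normal(0.0, g / np.sqrt(n), size=(n, n))

# Structured alternative: random orthogonal singular vectors, power-law spectrum.
U, _ = np.linalg.qr(rng.normal(size=(n, n)))
V, _ = np.linalg.qr(rng.normal(size=(n, n)))
s = g / (1 + np.arange(n)) ** 1.5  # rapidly decaying singular values
J_struct = U @ np.diag(s) @ V.T

for name, J in [("iid", J_iid), ("structured", J_struct)]:
    sv = np.linalg.svd(J, compute_uv=False)
    print(f"{name:>10}: top-5 singular values {np.round(sv[:5], 3)}")
```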
Reposted by Greta Tuckute
eringrant.me
I’m recruiting committee members for the Technical Program Committee at #CCN2026.

Please apply if you want to help make submission, review & selection of contributed work (Extended Abstracts & Proceedings) more useful for everyone! 🌐

Helps to have: programming/communications/editorial experience.
gretatuckute.bsky.social
We hope that AuriStream will serve as a task-performant model system for studying how language structure is learned from speech.

The Interspeech paper sets the stage—more work building on this idea coming soon! And as always, please feel free to get in touch with comments etc.!
gretatuckute.bsky.social
3️⃣ Temporally fine-grained → 5ms tokens preserve acoustic detail (e.g. speaker identity).
4️⃣ Unified → AuriStream learns strong speech representations and generates plausible continuations—bridging representation learning and sequence modeling in the audio domain.
gretatuckute.bsky.social
4 key advantages of AuriStream:

1️⃣ Causal → allows the study of speech/language processing as it unfolds in real time.
2️⃣ Inspectable → predictions can naturally be decoded into the cochleagram/audio, enabling visualization and interpretation.
gretatuckute.bsky.social
Examples: audio before the red line = ground-truth prompt; after = AuriStream’s prediction, visualized in the time-frequency cochleagram space.

AuriStream shows that causal prediction over short audio chunks (cochlear tokens) is enough to generate meaningful sentence continuations!
gretatuckute.bsky.social
Complementing its strong representational capabilities, AuriStream learns short- and long-range speech statistics: it completes phonemes and common words at short scales and generates diverse continuations at longer scales, as evidenced by the qualitative examples below.
gretatuckute.bsky.social
We demonstrate that:

🔹 AuriStream embeddings capture information about phoneme identity, word identity, and lexical semantics.
🔹 AuriStream embeddings serve as a strong backbone for downstream audio tasks on the SUPERB benchmark, such as ASR and intent classification.
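To make "capture information about phoneme identity" concrete, here is a minimal linear-probe sketch. The embeddings, labels, and classifier below are placeholders standing in for frozen AuriStream features, not the paper's actual evaluation pipeline.

```python
# Minimal probing sketch (placeholder data, not the paper's setup): train a linear
# classifier on frozen embeddings to read out phoneme identity.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(5000, 256))        # stand-in for frozen model embeddings
phoneme_labels = rng.integers(0, 40, size=5000)  # stand-in labels, ~40 phoneme classes

X_tr, X_te, y_tr, y_te = train_test_split(
    embeddings, phoneme_labels, test_size=0.2, random_state=0)

probe = LogisticRegression(max_iter=1000)        # linear readout on frozen features
probe.fit(X_tr, y_tr)
print("phoneme probe accuracy:", probe.score(X_te, y_te))
```

Word identity and lexical semantics can be probed the same way, swapping in the relevant labels.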
gretatuckute.bsky.social
We present a two-stage framework, loosely inspired by the human auditory hierarchy:

1️⃣ WavCoch: a small model that transforms raw audio into a cochlea-like time-frequency representation, from which we extract discrete “cochlear tokens”.
2️⃣ AuriStream: an autoregressive model over the cochlear tokens.
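A minimal PyTorch sketch of the two-stage idea, to make the pipeline concrete. This is not the released code: class names, codebook size, layer sizes, and the 5 ms / 64-channel cochleagram shape are all illustrative assumptions, and in practice the stage-1 tokenizer would be trained (e.g., as a vector-quantized autoencoder) before the autoregressive stage.

```python
# Illustrative two-stage sketch (assumed names and sizes, not the released models).
import torch
import torch.nn as nn

class CochlearTokenizer(nn.Module):
    """Stage 1 (WavCoch-like, hypothetical): map each cochleagram frame to the
    nearest entry of a codebook, yielding a discrete 'cochlear token'. A real
    system would learn this codebook (e.g., vector quantization); argmin here
    is just a non-differentiable nearest-neighbour lookup for illustration."""
    def __init__(self, n_freq=64, codebook_size=1024, d=128):
        super().__init__()
        self.encode = nn.Linear(n_freq, d)
        self.codebook = nn.Embedding(codebook_size, d)

    def forward(self, cochleagram):                       # (batch, time, n_freq)
        z = self.encode(cochleagram)                      # (batch, time, d)
        codes = self.codebook.weight.expand(z.shape[0], -1, -1)
        dists = torch.cdist(z, codes)                     # (batch, time, codebook_size)
        return dists.argmin(-1)                           # token ids, (batch, time)

class CausalTokenLM(nn.Module):
    """Stage 2 (AuriStream-like, hypothetical): causal transformer that predicts
    the next cochlear token from all previous ones."""
    def __init__(self, codebook_size=1024, d=256, n_layers=4, n_heads=4, max_len=2048):
        super().__init__()
        self.tok = nn.Embedding(codebook_size, d)
        self.pos = nn.Embedding(max_len, d)
        layer = nn.TransformerEncoderLayer(d, n_heads, dim_feedforward=4 * d, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d, codebook_size)

    def forward(self, tokens):                            # (batch, time)
        t = tokens.shape[1]
        x = self.tok(tokens) + self.pos(torch.arange(t, device=tokens.device))
        # additive causal mask: -inf strictly above the diagonal
        mask = torch.triu(torch.full((t, t), float("-inf"), device=tokens.device), diagonal=1)
        h = self.blocks(x, mask=mask)                     # causal self-attention
        return self.head(h)                               # next-token logits

# Toy usage: 1 s of a 64-channel cochleagram at 5 ms frames -> 200 tokens.
cochleagram = torch.randn(1, 200, 64)
tokens = CochlearTokenizer()(cochleagram)
logits = CausalTokenLM()(tokens)
loss = nn.functional.cross_entropy(                       # next-token prediction loss
    logits[:, :-1].reshape(-1, 1024), tokens[:, 1:].reshape(-1))
```

The point of the sketch is the division of labor: a small tokenizer turns continuous audio into discrete cochlear tokens, and all sequence modeling happens causally over those tokens with a plain next-token loss.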
gretatuckute.bsky.social
Many prior speech-based models rely on heuristics such as:
🔹 Global clustering of the embedding space
🔹 Non-causal objectives
🔹 Fixed-duration “language” units
...

To our knowledge, no high-performing, open-source audio model avoids such constraints; AuriStream is built to fill that gap.