Sasha Boguraev @ COLM
@sashaboguraev.bsky.social
130 followers 240 following 20 posts
Compling PhD student @UT_Linguistics | prev. CS, Math, Comp. Cognitive Sci @cornell
Reposted by Sasha Boguraev @ COLM
kmahowald.bsky.social
UT Austin Linguistics is hiring in computational linguistics!

Asst or Assoc.

We have a thriving group sites.utexas.edu/compling/ and a long proud history in the space. (For instance, fun fact, Jeff Elman was a UT Austin Linguistics Ph.D.)

faculty.utexas.edu/career/170793

🤘
UT Austin Computational Linguistics Research Group – Humans processing computers processing humans processing language
sites.utexas.edu
sashaboguraev.bsky.social
I will be giving a short talk on this work at the COLM Interplay workshop on Friday (also to appear at EMNLP)!

Will be in Montreal all week and excited to chat about LM interpretability + its interaction with human cognition and ling theory.
sashaboguraev.bsky.social
A key hypothesis in the history of linguistics is that different constructions share underlying structure. We take advantage of recent advances in mechanistic interpretability to test this hypothesis in Language Models.

New work with @kmahowald.bsky.social and @cgpotts.bsky.social!

🧵👇!
Reposted by Sasha Boguraev @ COLM
kanishka.bsky.social
The compling group at UT Austin (sites.utexas.edu/compling/) is looking for PhD students!

Come join me, @kmahowald.bsky.social, and @jessyjli.bsky.social as we tackle interesting research questions at the intersection of ling, cogsci, and ai!

Some topics I am particularly interested in:
Picture of the UT Tower with "UT Austin Computational Linguistics" written in bigger font, and "Humans processing computers processing humans processing language" in smaller font
sashaboguraev.bsky.social
No worries! Was just in NYC and figured it worth an ask. Thanks for the pointer.

Separately, would be great to catch up next time I’m around!
sashaboguraev.bsky.social
Wholeheartedly pledging my allegiance to any and all other airlines
sashaboguraev.bsky.social
Breaking my years-long vow to never fly American Airlines just to be met with a 6 hr delay and 5am arrival back home 🫠
sashaboguraev.bsky.social
But surely there is important novelty in answering both of those questions? Building a novel system/entity and generating a novel proof must inherently involve some new ideas, by virtue of those questions not having been answered before.

I’m not sure I buy the idea that novelty has to be technical.
sashaboguraev.bsky.social
We believe this work shows how mechanistic analyses can provide novel insights into syntactic structures, making good on the promise that studying LLMs can advance linguistics by helping us develop linguistically interesting hypotheses!

📄: arxiv.org/abs/2505.16002
Causal Interventions Reveal Shared Structure Across English Filler-Gap Constructions
Large Language Models (LLMs) have emerged as powerful sources of evidence for linguists seeking to develop theories of syntax. In this paper, we argue that causal interpretability methods, applied to ...
arxiv.org
sashaboguraev.bsky.social
In our last experiment, we probe whether the mechanisms used to process single-clause variants of these constructions generalize to the matrix and embedded clauses of our multi-clause variants. However, we find little evidence of this transfer across our constructions.
sashaboguraev.bsky.social
This raises the question: what drives constructions to take on these roles? We find that a combination of frequency and linguistic similarity is responsible. Namely, less frequent constructions reuse the mechanisms LMs have developed for more frequent, linguistically similar constructions!
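(A hedged sketch of how such a frequency + similarity analysis could look: regress transfer success on the training construction's log frequency and a train/eval similarity score. The numbers, predictors, and use of scikit-learn below are my own illustrative assumptions, not the paper's analysis.)

```python
# Illustrative only: regress (made-up) transfer accuracies on log frequency
# of the training construction and a train/eval linguistic-similarity score.
import numpy as np
from sklearn.linear_model import LinearRegression

# Each row: [log frequency of train construction, similarity(train, eval)]
X = np.array([[8.1, 0.9],
              [5.2, 0.8],
              [7.4, 0.2],
              [4.0, 0.3]])
y = np.array([0.85, 0.70, 0.40, 0.20])  # invented transfer accuracies

model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)  # positive coefficients would be consistent
                                      # with a frequency + similarity account
```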
sashaboguraev.bsky.social
We then dive deeper, training interventions on individual constructions and evaluating them across all others, allowing us to build generalization networks. Network analysis reveals clear roles — some constructions act as sources, others as sinks.
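(To make the source/sink idea concrete, here is a toy sketch, with invented transfer scores and an arbitrary threshold, of building a directed generalization network and reading roles off out- vs. in-degree. Not the paper's actual network analysis.)

```python
# Toy generalization network: edge A -> B if an intervention trained on A
# transfers to B above a threshold; the scores below are invented.
import networkx as nx

transfer_scores = {
    "cleft": {"pseudocleft": 0.8, "matrix_wh": 0.3},
    "pseudocleft": {"cleft": 0.2},
    "matrix_wh": {"cleft": 0.7, "pseudocleft": 0.6},
}

G = nx.DiGraph()
THRESHOLD = 0.5
for src, targets in transfer_scores.items():
    for tgt, score in targets.items():
        if score >= THRESHOLD:
            G.add_edge(src, tgt, weight=score)

for node in G.nodes:
    # More outgoing than incoming transfer -> "source"; otherwise -> "sink".
    role = "source" if G.out_degree(node) > G.in_degree(node) else "sink"
    print(node, role)
```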
sashaboguraev.bsky.social
We first train interventions on n-1 constructions and test on all, including the held-out one.

Across all positions, we find above-chance transfer of mechanisms with significant positive transfer when the evaluated construction is in the train set, and when the train and eval animacy match.
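(The held-out protocol, sketched as code. The construction labels and the `train_intervention` / `eval_intervention` stubs are placeholders I made up to show the loop structure, not the paper's implementation.)

```python
# Leave-one-out sketch: train interventions on n-1 constructions, evaluate on all n.
import random

def train_intervention(constructions):               # placeholder stub
    return {"trained_on": tuple(constructions)}

def eval_intervention(intervention, construction):   # placeholder stub
    return random.random()                            # stands in for a causal-effect score

CONSTRUCTIONS = ["emb_wh_a", "emb_wh_b", "matrix_wh", "rrc",
                 "cleft", "pseudocleft", "topicalization"]

transfer = {}
for held_out in CONSTRUCTIONS:
    train_set = [c for c in CONSTRUCTIONS if c != held_out]
    iv = train_intervention(train_set)
    # Evaluate on every construction, including the held-out one.
    transfer[held_out] = {c: eval_intervention(iv, c) for c in CONSTRUCTIONS}
```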
sashaboguraev.bsky.social
We use DAS to train interventions, localizing the processing mechanisms specific to given sets of filler-gaps. We then take these trained interventions and evaluate them on other filler-gaps. Any observed causal effect then suggests shared mechanisms across the constructions.
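(For readers who want the mechanics: a minimal PyTorch sketch of a DAS-style interchange intervention, assuming hidden states for a base and a source input have already been extracted at the aligned token position. The class, the subspace size k, and the training note are illustrative assumptions, not the paper's code.)

```python
# Minimal DAS-style interchange intervention sketch (illustrative, not the paper's code).
import torch
import torch.nn as nn

class DASIntervention(nn.Module):
    def __init__(self, hidden_dim: int, k: int):
        super().__init__()
        # Learnable orthogonal "rotation" of the hidden space.
        self.rotation = nn.utils.parametrizations.orthogonal(
            nn.Linear(hidden_dim, hidden_dim, bias=False)
        )
        self.k = k  # size of the subspace to swap

    def forward(self, base_h: torch.Tensor, source_h: torch.Tensor) -> torch.Tensor:
        R = self.rotation.weight              # (d, d), orthogonal
        base_rot = base_h @ R.T               # rotate base hidden state
        source_rot = source_h @ R.T           # rotate source hidden state
        # Swap the first k rotated coordinates (the candidate causal subspace).
        mixed = torch.cat([source_rot[..., :self.k], base_rot[..., self.k:]], dim=-1)
        return mixed @ R                      # rotate back to model space

# In training, the intervened state would be patched into the LM's forward pass
# and the rotation optimized so the output matches the counterfactual label;
# evaluating a trained intervention on a *different* construction then tests
# whether the same subspace carries filler-gap information there too.
```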
sashaboguraev.bsky.social
Our investigation focuses on 7 filler–gap constructions: 2 classes of embedded wh-questions, matrix-level wh-questions, restrictive relative clauses, clefts, pseudoclefts, & topicalization. For each construction, we make 4 templates split by animacy of the extraction and number of embedded clauses.
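(A toy sketch of the 2x2 template design, crossing animacy of the extracted element with clause count. The sentences and function below are invented for illustration; they are not the paper's stimuli.)

```python
# Toy 2x2 template builder: animacy x number of clauses (invented examples).
from itertools import product

def make_templates(construction: str) -> dict:
    templates = {}
    for animacy, clauses in product(["animate", "inanimate"], ["single", "multi"]):
        filler = "who" if animacy == "animate" else "what"
        bridge = "the teacher said " if clauses == "multi" else ""
        if construction == "embedded_wh":  # only one construction shown here
            templates[(animacy, clauses)] = f"I know {filler} {bridge}the student saw __."
    return templates

print(make_templates("embedded_wh"))
# e.g. ('animate', 'multi') -> "I know who the teacher said the student saw __."
```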
sashaboguraev.bsky.social
A key hypothesis in the history of linguistics is that different constructions share underlying structure. We take advantage of recent advances in mechanistic interpretability to test this hypothesis in Language Models.

New work with @kmahowald.bsky.social and @cgpotts.bsky.social!

🧵👇!
Reposted by Sasha Boguraev @ COLM
qyao.bsky.social
LMs learn argument-based preferences for dative constructions (preferring recipient first when it’s shorter), consistent with humans. Is this from memorizing preferences in training? New paper w/ @kanishka.bsky.social, @weissweiler.bsky.social, @kmahowald.bsky.social

arxiv.org/abs/2503.20850
Examples from direct and prepositional object datives with short-first and long-first word orders:
DO (long first): She gave the boy who signed up for class and was excited it.
PO (short first): She gave it to the boy who signed up for class and was excited.
DO (short first): She gave him the book that everyone was excited to read.
PO (long first): She gave the book that everyone was excited to read to him.
Reposted by Sasha Boguraev @ COLM
siyuansong.bsky.social
New preprint w/ @jennhu.bsky.social @kmahowald.bsky.social : Can LLMs introspect about their knowledge of language?
Across models and domains, we did not find evidence that LLMs have privileged access to their own predictions. 🧵(1/8)
sashaboguraev.bsky.social
Do you have any thoughts on whether these a) emerged naturally during the RL phase of training (rather than being specifically engineered to encourage more generation, or being an artifact of some other post-training phase) and, if so, b) actually represent backtracking in the search?
sashaboguraev.bsky.social
I'm curious as to what you think of the explicit backtracking in the reasoning models' chains of thought? I agree that much of the CoT feels odd and unfaithful, but also there's something that feels very easily anthropomorphizable in the various “oh wait”s and “now I see”s.
sashaboguraev.bsky.social
Been spending some time over break making my way through the Bayesian Models of Cognition book. Great read.
sashaboguraev.bsky.social
Notoriously finicky BC weather celebrating the last day of #NeurIPS2024 with a rainbow across the harbor
sashaboguraev.bsky.social
I'll be presenting a position piece (arxiv.org/abs/2409.17005) on what cognitive science and linguistics can bring to the Math + AI field tomorrow from 11:00–12:30 and 4:00–5:00 at the (aptly named) #NeurIPS2024 MathAI Workshop in West Meeting Room 118-120. Come say hi and hear about my work!
Models Can and Should Embrace the Communicative Nature of Human-Generated Math
Math is constructed by people for people: just as natural language corpora reflect not just propositions but the communicative goals of language users, the math data that models are trained on reflect...
arxiv.org
Reposted by Sasha Boguraev @ COLM
kmahowald.bsky.social
In Vancouver for #NeurIPS2024 workshops! At Math-AI tomorrow @sashaboguraev.bsky.social is presenting our experiment-infused position piece on the communicative nature of math and why that matters for AI: arxiv.org/pdf/2409.17005. Say hi!

Will be better than the Panthers' 4-0 loss to the Canucks.
sashaboguraev.bsky.social
I’m at NeurIPS all week! On 12/14 I’ll present on viewing math as a communicative activity at the MathAI workshop. Meanwhile, I’d love to chat about this work or more broadly about communicative framing in NLP, including emergent communication paradigms, language games and more. Please reach out!