Manya Wadhwa
@manyawadhwa.bsky.social
180 followers 170 following 17 posts
PhD at UTCS | #NLP https://manyawadhwa.github.io/
Pinned
manyawadhwa.bsky.social
Evaluating language model responses on open-ended tasks is hard! 🤔

We introduce EvalAgent, a framework that identifies nuanced and diverse criteria 📋✍️.

EvalAgent identifies 👩‍🏫🎓 expert advice on the web that implicitly addresses the user’s prompt 🧵👇
Reposted by Manya Wadhwa
kmahowald.bsky.social
UT Austin Linguistics is hiring in computational linguistics!

Asst or Assoc.

We have a thriving group (sites.utexas.edu/compling/) and a long, proud history in the space. (For instance, fun fact, Jeff Elman was a UT Austin Linguistics Ph.D.)

faculty.utexas.edu/career/170793

🤘
UT Austin Computational Linguistics Research Group – Humans processing computers processing humans processing language
sites.utexas.edu
Reposted by Manya Wadhwa
gregdnlp.bsky.social
Find my students and collaborators at COLM this week!

Tuesday morning: @juand-r.bsky.social and @ramyanamuduri.bsky.social's papers (find them if you missed it!)

Wednesday pm: @manyawadhwa.bsky.social 's EvalAgent

Thursday am: @anirudhkhatry.bsky.social 's CRUST-Bench oral spotlight + poster
manyawadhwa.bsky.social
Unfortunately I won't be at #COLM2025 this week, but please check out our work being presented by my collaborators/advisors!

If you are interested in evals of open-ended tasks/creativity, please reach out and we can schedule a chat! :)
gregdnlp.bsky.social
Find my students and collaborators at COLM this week!

Tuesday morning: @juand-r.bsky.social and @ramyanamuduri.bsky.social's papers (find them if you missed it!)

Wednesday pm: @manyawadhwa.bsky.social 's EvalAgent

Thursday am: @anirudhkhatry.bsky.social 's CRUST-Bench oral spotlight + poster
Reposted by Manya Wadhwa
juand-r.bsky.social
Excited to present this at #COLM2025 tomorrow! (Tuesday, 11:00 AM poster session)
juand-r.bsky.social
One of the ways that LLMs can be inconsistent is the "generator-validator gap," where LLMs deem their own answers incorrect.

🎯 We demonstrate that ranking-based discriminator training can significantly reduce this gap, and improvements on one task often generalize to others!

🧵👇
A visualization of the generator-validator gap, where the LM likelihoods for the generator and discriminator forms of questions are poorly correlated. Aligning the validator and generator rankings can fix it!
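A minimal sketch of the idea in the figure (illustrative only, not the paper's code): quantify the gap as the (lack of) rank correlation between the scores an LM assigns to candidate answers in generator form versus validator form. The log-probabilities below are made-up numbers.

```python
# Illustrative sketch, not the paper's code: the generator-validator gap as
# poor rank correlation between generator-form and validator-form scores.
from scipy.stats import spearmanr

# Made-up log-probabilities for four candidate answers to one question.
generator_logprobs = [-1.2, -2.5, -0.8, -3.1]  # log P(answer | question)
validator_logprobs = [-2.0, -0.9, -1.5, -0.4]  # log P("Yes" | "Is this answer correct?")

rho, _ = spearmanr(generator_logprobs, validator_logprobs)
print(f"rank correlation = {rho:.2f}")  # low or negative => large gap
```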
Reposted by Manya Wadhwa
markar.bsky.social
Come talk with us today about the evaluation of long-form multilingual generation at the second #COLM2025 poster session

📍4:30–6:30 PM / Room 710 – Poster #8
Reposted by Manya Wadhwa
cmalaviya.bsky.social
Ever wondered what makes language models generate overly verbose, vague, or sycophantic responses?

Our new paper investigates these and other idiosyncratic biases in preference models, and presents a simple post-training recipe to mitigate them! Thread below 🧵↓
Reposted by Manya Wadhwa
esteng.bsky.social
Extremely excited to announce that I will be joining
@utaustin.bsky.social Computer Science in August 2025 as an Assistant Professor! 🎉
UT Austin campus
Reposted by Manya Wadhwa
vishakhpk.bsky.social
What does it mean for #LLM output to be novel?
In work w/ johnchen6.bsky.social, Jane Pan, Valerie Chen and He He, we argue it needs to be both original and high quality. While prompting tricks trade one for the other, better models (scaling/post-training) can shift the novelty frontier 🧵
Reposted by Manya Wadhwa
juand-r.bsky.social
How do language models organize concepts and their properties? Do they use taxonomies to infer new properties, or infer based on concept similarities? Apparently, both!

🌟 New paper with my fantastic collaborators @amuuueller.bsky.social and @kanishka.bsky.social
Title: "Characterizing the Role of Similarity in the Property Inferences of Language Models"
Authors: Juan Diego Rodriguez, Aaron Mueller, Kanishka Misra

Left figure: "Given that dogs are daxable, is it true that corgis are daxable?" A language model could answer this either using taxonomic relations, illustrated by a taxonomy dog-corgi, dog-mutt, canine-wolf, etc., or by similarity relations (dogs are more similar to corgis than cats, wolves or shar peis).

Right figure: illustration of the causal model (and an example intervention) for distributed alignment search (DAS), which we used to find a subspace in the network responsible for property inheritance behavior. The bottom nodes are "property", "premise concept (A)" and "conclusion concept (B)", the middle nodes are "A has property P", "B is a kind of A", and the top node is "B has property P".
Reposted by Manya Wadhwa
kanishka.bsky.social
If you are at #NAACL2025 @naaclmeeting.bsky.social catch @juand-r.bsky.social presenting our poster on the interplay between similarity and category membership in the property inferences of LMs @ Poster Session 1 on Wednesday!

Or if you're at home like me, read our paper: arxiv.org/abs/2410.22590
juand-r.bsky.social
How do language models organize concepts and their properties? Do they use taxonomies to infer new properties, or infer based on concept similarities? Apparently, both!

🌟 New paper with my fantastic collaborators @amuuueller.bsky.social and @kanishka.bsky.social
Title: "Characterizing the Role of Similarity in the Property Inferences of Language Models"
Authors: Juan Diego Rodriguez, Aaron Mueller, Kanishka Misra

Left figure: "Given that dogs are daxable, is it true that corgis are daxable?" A language model could answer this either using taxonomic relations, illustrated by a taxonomy dog-corgi, dog-mutt, canine-wolf, etc., or by similarity relations (dogs are more similar to corgis than cats, wolves or shar peis).

Right figure: illustration of the causal model (and an example intervention) for distributed alignment search (DAS), which we used to find a subspace in the network responsible for property inheritance behavior. The bottom nodes are "property", "premise concept (A)" and "conclusion concept (B)", the middle nodes are "A has property P", "B is a kind of A", and the top node is "B has property P".
Reposted by Manya Wadhwa
anirudhkhatry.bsky.social
🚀Meet CRUST-Bench, a dataset for C-to-Rust transpilation of full codebases 🛠️
A dataset of 100 real-world C repositories across various domains, each paired with:
🦀 Handwritten safe Rust interfaces.
🧪 Rust test cases to validate correctness.
🧵[1/6]
manyawadhwa.bsky.social
📝 Read the full paper: arxiv.org/pdf/2504.15219
💻 You can also use our system to generate criteria: github.com/ManyaWadhwa/...
Also check out our 🎛️ UI to explore generated criteria + source URLs!
manyawadhwa.bsky.social
Why do we need this? If you’ve used an LLM to draft a paper intro, research talk, or blog post, you’ve likely noticed that while the facts are correct, something feels off. What might be missing are the subtle cues and unspoken expectations. EvalAgent helps uncover and address those hidden layers! 🔮
manyawadhwa.bsky.social
EvalAgent (EA-Web) criteria are often non-obvious to humans and not easily met by LLMs out of the box, making them valuable for evaluation. We also show that the criteria generated by EvalAgent are highly actionable (results in paper)!
manyawadhwa.bsky.social
We test criteria generated by EvalAgent across 9 datasets, from creative writing to technical reports, and compare against criteria generated by 2 other systems!

Results? We show that the criteria generated by EvalAgent (EA-Web) are 🎯 highly specific and 💭 implicit.
manyawadhwa.bsky.social
For example, EvalAgent generates the following criteria for the academic talk prompt:

The response should have:
🪄 A compelling opening/motivation
🧠 A clear research question that it answers
🏁 A strong conclusion that restates findings
manyawadhwa.bsky.social
EvalAgent emulates how a human would seek advice: 🔍 searching for things like “how to write a compelling talk”, reading expert tips from blogs and academic websites, and aggregating them into specific, useful evaluation criteria.
manyawadhwa.bsky.social
Take the prompt "Help me draft an academic talk on coffee intake vs research productivity." We know the output should be factual. But how do we identify less obvious features that are not in the prompt, like the structure of the talk? That’s where EvalAgent steps in!
manyawadhwa.bsky.social
EvalAgent uncovers criteria for evaluating LLM responses on open-ended tasks by:

📌 Decomposing the user prompt into key conceptual queries
🌐 Searching the web for expert advice and summarizing it
📋 Aggregating web-retrieved information into specific and actionable evaluation criteria
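A rough sketch of that three-stage loop (illustrative pseudocode, not the actual EvalAgent implementation; `llm` and `web_search` are hypothetical callables standing in for a language model and a search API):

```python
# Illustrative sketch of the decompose -> search -> aggregate loop described above.
# Not the actual EvalAgent code; `llm(prompt) -> str` and `web_search(query) -> list[str]`
# are hypothetical stand-ins for a language model and a web-search API.
def evalagent_sketch(user_prompt: str, llm, web_search, n_queries: int = 5) -> str:
    # 1) Decompose the user prompt into key conceptual queries.
    queries = llm(
        f"List {n_queries} web-search queries asking for expert advice on how to "
        f"do this task well:\n{user_prompt}"
    ).splitlines()

    # 2) Search the web for expert advice and summarize each retrieved page.
    advice = []
    for query in queries:
        for page in web_search(query):
            advice.append(llm(f"Summarize the advice in this page relevant to '{query}':\n\n{page}"))

    # 3) Aggregate web-retrieved advice into specific, actionable evaluation criteria.
    return llm(
        "Aggregate the following advice into a deduplicated list of specific, "
        "checkable evaluation criteria for the original task:\n" + "\n".join(advice)
    )
```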
manyawadhwa.bsky.social
Evaluating language model responses on open-ended tasks is hard! 🤔

We introduce EvalAgent, a framework that identifies nuanced and diverse criteria 📋✍️.

EvalAgent identifies 👩‍🏫🎓 expert advice on the web that implicitly addresses the user’s prompt 🧵👇
Reposted by Manya Wadhwa
juand-r.bsky.social
One of the ways that LLMs can be inconsistent is the "generator-validator gap," where LLMs deem their own answers incorrect.

🎯 We demonstrate that ranking-based discriminator training can significantly reduce this gap, and improvements on one task often generalize to others!

🧵👇
A visualization of the generator-validator gap, where the LM likelihoods for the generator and discriminator forms of questions are poorly correlated. Aligning the validator and generator rankings can fix it!
Reposted by Manya Wadhwa
juand-r.bsky.social
1.) [NAACL 25] @kanishka.bsky.social, @amuuueller.bsky.social and I delve into how language models do property inheritance using behavioral and mechanistic analyses.
Thank you, Kanishka and Aaron. I could not have hoped for better collaborators! arxiv.org/abs/2410.22590

[👇 bsky.app/profile/juan...
Reposted by Manya Wadhwa
victorwang37.bsky.social
LLM judges have become ubiquitous, but valuable signal is often ignored at inference.

We analyze design decisions for leveraging judgment distributions from LLM-as-a-judge: 🧵

(w/ Michael J.Q. Zhang, @eunsol.bsky.social)
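One common way to use the judge's full judgment distribution rather than a single sampled verdict (a sketch of the general idea, not necessarily the design choices the paper analyzes): read the judge's token probabilities over the rating scale and take an expectation.

```python
# Sketch only: take the expected rating under the judge's distribution over
# rating tokens, instead of keeping just the argmax/sampled rating.
# The log-probabilities below are made-up; in practice they would come from
# the judge model's logprobs over the tokens "1".."5".
import math

rating_logprobs = {"1": -4.1, "2": -2.7, "3": -1.3, "4": -0.6, "5": -2.0}

probs = {r: math.exp(lp) for r, lp in rating_logprobs.items()}
z = sum(probs.values())
expected_rating = sum(int(r) * p / z for r, p in probs.items())

print("argmax rating:", max(probs, key=probs.get))  # discards distributional signal
print(f"expected rating: {expected_rating:.2f}")    # uses the full distribution
```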
Reposted by Manya Wadhwa
jessyjli.bsky.social
Do you want to know what information LLMs prioritize in text synthesis tasks? Here's a short 🧵 about our new paper, led by Jan Trienes: an interpretable framework for salience analysis in LLMs.

First of all, information salience is a fuzzy concept. So how can we even measure it? (1/6)