Lightnews — Scholar-powered news

Reposted by Siva Reddy

Gaurav Kamath

@grvkamath.bsky.social

Our new paper in #PNAS (bit.ly/4fcWfma) presents a surprising finding—when words change meaning, older speakers rapidly adopt the new usage; inter-generational differences are often minor.

w/ Michelle Yang, ‪@sivareddyg.bsky.social‬ , @msonderegger.bsky.social‬ and @dallascard.bsky.social‬👇(1/12)

July 29, 2025 at 12:06 PM

Siva Reddy

@sivareddyg.bsky.social

Age doesn't matter to pick up new word usages. The pronunciation may sound odd across generations but not the semantics 👴👵👨👩

Dallas Card @dallascard.bsky.social · Jul 29

I am delighted to share our new #PNAS paper, with @grvkamath.bsky.social @msonderegger.bsky.social and @sivareddyg.bsky.social, on whether age matters for the adoption of new meanings. That is, as words change meaning, does the rate of adoption vary across generations? www.pnas.org/doi/epdf/10....

July 29, 2025 at 4:52 PM

Reposted by Siva Reddy

VLMs4All - CVPR 2025 Workshop

@vlms4all.bsky.social

🗓️ Save the date! It's official: The VLMs4All Workshop at #CVPR2025 will be held on June 12th!

Get ready for a full day of speakers, posters, and a panel discussion on making VLMs more geo-diverse and culturally aware 🌐

Check out the schedule below!

June 6, 2025 at 9:19 AM

Siva Reddy

@sivareddyg.bsky.social

Incredibly proud of my students @adadtur.bsky.social and Gaurav Kamath for winning a SAC award at #NAACL2025 for their work on assessing how LLMs model constituent shifts.

May 1, 2025 at 3:11 PM

Reposted by Siva Reddy

Benno Krojer

@bennokrojer.bsky.social

Great work from labmates on LLMs vs humans regarding linguistic preferences: You know when a sentence kind of feels off e.g. "I met at the park the man". So in what ways do LLMs follow these human intuitions?

Mila - Institut québécois d'IA @mila-quebec.bsky.social · May 1

Congratulations to Mila members @adadtur.bsky.social , Gaurav Kamath and @sivareddyg.bsky.social for their SAC award at NAACL! Check out Ada's talk in Session I: Oral/Poster 6. Paper: arxiv.org/abs/2502.05670

May 1, 2025 at 3:04 PM

Siva Reddy

@sivareddyg.bsky.social

List of #SafetyGuaranteedLLMs talks on Monday Apr 14 2025 PDT. Speakers @rogergrosse.bsky.social Boaz Barak, Ethan Perez, Georgios Piliouras

April 14, 2025 at 5:44 AM

Siva Reddy

@sivareddyg.bsky.social

The most exciting event on LLM safety is happening this week at @simonsinstitute.bsky.social with many excellent speakers. Organized by @yoshuabengio.bsky.social et al. Join us in person or virtual. In collaboration with @ivado.bsky.social. More details here:

simons.berkeley.edu/workshops/sa...

April 14, 2025 at 5:41 AM

Reposted by Siva Reddy

Simons Institute for the Theory of Computing

@simonsinstitute.bsky.social

Though in-person registration is now full, you can still register to view the private livestream for next week's workshop on Safety-Guaranteed LLMs, co-organized with @ivado.bsky.social. We'll be posting live here as well.

simons.berkeley.edu/workshops/sa...

April 11, 2025 at 4:43 AM

Reposted by Siva Reddy

Benno Krojer

@bennokrojer.bsky.social

Never been part of a project like this before - it was a very rewarding+unique experience!

Everyone in the lab contributed different chapters and it was much more exploratory than your average phd project.

My chapter studied R1's reasoning on "image generation/editing" (via ASCII) 🧵👇

1/N

Sara Vera Marjanovic @saravera.bsky.social · Apr 1

Models like DeepSeek-R1 🐋 mark a fundamental shift in how LLMs approach complex problems. In our preprint on R1 Thoughtology, we study R1’s reasoning chains across a variety of tasks; investigating its capabilities, limitations, and behaviour.
🔗: mcgill-nlp.github.io/thoughtology/

A circular diagram with a blue whale icon at the center. The diagram shows 8 interconnected research areas around LLM reasoning represented as colored rectangular boxes arranged in a circular pattern. The areas include: §3 Analysis of Reasoning Chains (central cloud), §4 Scaling of Thoughts (discussing thought length and performance metrics), §5 Long Context Evaluation (focusing on information recall), §6 Faithfulness to Context (examining question answering accuracy), §7 Safety Evaluation (assessing harmful content generation and jailbreak resistance), §8 Language & Culture (exploring moral reasoning and language effects), §9 Relation to Human Processing (comparing cognitive processes), §10 Visual Reasoning (covering ASCII generation capabilities), and §11 Following Token Budget (investigating direct prompting techniques). Arrows connect the sections in a clockwise flow, suggesting an iterative research methodology.

April 1, 2025 at 9:19 PM

Siva Reddy

@sivareddyg.bsky.social

I will be giving a talk about this work @SimonsInstitute tomorrow (Apr 2nd 3PM PT). Join us, both in-person or virtually.

simons.berkeley.edu/workshops/fu...

April 1, 2025 at 8:16 PM

Siva Reddy

@sivareddyg.bsky.social

Introducing the DeepSeek-R1 Thoughtology -- the most comprehensive study of R1 reasoning chains/thoughts ✨. Probably everything you need to know about R1 thoughts. If we missed something, please let us know.

Sara Vera Marjanovic @saravera.bsky.social · Apr 1

Models like DeepSeek-R1 🐋 mark a fundamental shift in how LLMs approach complex problems. In our preprint on R1 Thoughtology, we study R1’s reasoning chains across a variety of tasks; investigating its capabilities, limitations, and behaviour.
🔗: mcgill-nlp.github.io/thoughtology/

April 1, 2025 at 8:12 PM

Reposted by Siva Reddy

Conference on Language Modeling

@colmweb.org

A bit of a mess around the conflict of COLM with the ARR (and to lesser degree ICML) reviews release. We feel this is creating a lot of pressure and uncertainty. So, we are pushing our deadlines:

Abstracts due March 22 AoE (+48hr)
Full papers due March 28 AoE (+24hr)

Plz RT 🙏

March 20, 2025 at 6:20 PM

Reposted by Siva Reddy

Benno Krojer

@bennokrojer.bsky.social

As someone who has tried to make even basic image editing work in my research (e.g. "move cup to left of table"):
Gemini's new editing capabilities are seriously impressive!

Playing around with it is quite fun...
Edit 1: "edit the image to contain 3 more people"

March 18, 2025 at 3:48 PM

Siva Reddy

@sivareddyg.bsky.social

Why do LLMs have a hard time aligning, while humans are better at it? 🌟The answer lies in the lack of a societal alignment framework for LLMs 🌍.

Incredible effort by @karstanczak.bsky.social in pulling views from multiple disciplines and experts in these fields.

arxiv.org/abs/2503.00069

Karolina Stańczak @karstanczak.bsky.social · Mar 4

📢New Paper Alert!🚀

Human alignment balances social expectations, economic incentives, and legal frameworks. What if LLM alignment worked the same way?🤔

Our latest work explores how social, economic, and contractual alignment can address incomplete contracts in LLM alignment🧵

March 4, 2025 at 5:22 PM

Siva Reddy

@sivareddyg.bsky.social

How to Get Your LLM to Generate Challenging
Problems for Evaluation? 🤔 Check out our CHASE recipe. A highly relevant problem given that most human-curated datasets are crushed within days.

Arkil Patel @arkil.bsky.social · Feb 21

Presenting ✨ 𝐂𝐇𝐀𝐒𝐄: 𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐧𝐠 𝐜𝐡𝐚𝐥𝐥𝐞𝐧𝐠𝐢𝐧𝐠 𝐬𝐲𝐧𝐭𝐡𝐞𝐭𝐢𝐜 𝐝𝐚𝐭𝐚 𝐟𝐨𝐫 𝐞𝐯𝐚𝐥𝐮𝐚𝐭𝐢𝐨𝐧 ✨

Work w/ fantastic advisors Dima Bahdanau and @sivareddyg.bsky.social

Thread 🧵:

February 21, 2025 at 6:53 PM

Reposted by Siva Reddy

Benno Krojer

@bennokrojer.bsky.social

Finally it's handy that all my twitter posts got migrated here to bsky:

I'll be presenting AURORA at @neuripsconf.bsky.social on Wednesday!

Come by to discuss text-guided editing (and why imo it is more interesting than image generation), world modeling, evals and vision-and-language reasoning

Benno Krojer @bennokrojer.bsky.social · Sep 26

AURORA 🌌 is now accepted as a Spotlight at NeurIPS 🥂

We wondered if a model can do *controlled* video generation but in a *single* step?

So we built a dataset+model for “taking actions” on images via editing, or what you could call single-step controlled video gen

Benno Krojer @bennokrojer.bsky.social · Jul 9

Did you miss the recent Auroras? No problem! ✨🎆

Super excited to share AURORA, a *general* image editing model + high-quality data that improves where prev work fails the most:
Performing *action or movement* edits, i.e. a kind of world model setup

Insights/Details ⬇️

December 8, 2024 at 6:13 PM

Siva Reddy

@sivareddyg.bsky.social

Congratulations
@andreasmadsen.bsky.social
on successfully defending your PhD ⚔️ 🎉🎉 Grateful to you for stretching my interests into interpretability and engaging me with exciting deas. Good luck with your mission on building faithfully interpretable models.

Andreas Madsen @andreasmadsen.bsky.social · Nov 28

I’m thrilled to share that I’ve finished my Ph.D. at Mila and Polytechnique Montreal. For the last 4.5 years, I have worked on creating new faithfulness-centric paradigms for NLP Interpretability. Read my vision for the future of interpretability in our new position paper: arxiv.org/abs/2405.05386

Interpretability Needs a New Paradigm

Interpretability is the study of explaining models in understandable terms to humans. At present, interpretability is divided into two paradigms: the intrinsic paradigm, which believes that only model...

arxiv.org

November 29, 2024 at 6:25 PM

Reposted by Siva Reddy

Apoorv Khandelwal

@apoorvkh.com

“Turn” a decoder into an encoder with LLM2Vec (github.com/McGill-NLP/l...). Seen at COLM 2024 :)

If you want the naive, training-free / model-agnostic approach: their related work section says it is most common to using the final token’s last hidden state.

GitHub - McGill-NLP/llm2vec: Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders' - McGill-NLP/llm2vec

github.com

November 26, 2024 at 1:37 AM

Siva Reddy

@sivareddyg.bsky.social

Stages of #ICLR reviewing:
Stage 1: 😍 I hope I learn something new
Stage 2: 🤗 I hope I am constructive enough while being critical. Submits review
Stage 3: 🤯 Receives 5 page response + revision with many new pages
Stage 4: 😱 Crap, how do I get out of this?
Stage 5: 😵‍💫 What year is it?

November 26, 2024 at 5:08 AM

Reposted by Siva Reddy

Ofir Press

@ofirpress.bsky.social

I wrote some thoughts on how to build good LM benchmarks: ofir.io/How-to-Build...

How to Build Good Language Modeling Benchmarks

Building benchmarks is important because they shine a spotlight on the weaknesses of existing language models and so can guide the community on how to improve them.

ofir.io

November 25, 2024 at 9:54 PM

Reposted by Siva Reddy

Niclas Overby Ⓝ

@overby.me

@sivareddyg.bsky.social Which platforms? Maybe consider @buffer.com

November 24, 2024 at 1:40 AM

Siva Reddy

@sivareddyg.bsky.social

It's beautiful to start from scratch sometimes 😇

November 24, 2024 at 1:28 AM

Reposted by Siva Reddy

Orion Weller

@orionweller.bsky.social

Creating a 🦋 starter pack for people working in IR/RAG: go.bsky.app/88ULgwY

I can’t seem to find everyone though, help definitely appreciated to fill this out (DM or comment)!

November 23, 2024 at 9:19 PM