Nishant Subramani @ ACL
@nsubramani23.bsky.social
1.4K followers 510 following 23 posts
PhD student @CMU LTI - working on model #interpretability, student researcher @google; prev predoc @ai2; intern @MSFT nishantsubramani.github.io
Pinned
nsubramani23.bsky.social
👏🏽 Intro

💼 PhD student @ltiatcmu.bsky.social

📜 My research is in model interpretability 🔎, understanding the internals of LLMs to build more controllable and trustworthy systems

🫵🏽 If you're interested in better understanding language technology or model interpretability, let's connect!
nsubramani23.bsky.social
At @colmweb.org all week 🥯🍁! Presenting 3 mechinterp + actionable interp papers at @interplay-workshop.bsky.social

1. BERTology in the Modern World w/ @bearseascape.bsky.social
2. MICE for CATs
3. LLM Microscope w/ Jiarui Liu, Jivitesh Jain, @monadiab77.bsky.social

Reach out to chat! #COLM2025
nsubramani23.bsky.social
Excited to be attending NEMI in Boston today to present 🐁 MICE for CATs: Model-Internal Confidence Estimation for Calibrating Agents with Tools, and to co-moderate the model steering and control roundtable! Come find me to connect and chat about steering and actionable interp.
nsubramani23.bsky.social
At #ACL2025 in Vienna 🇦🇹 till next Saturday! Love to chat about anything #interpretability 🔎, understanding model internals 🔬, and finding yummy vegan food 🥬
nsubramani23.bsky.social
At #ICML2025 🇨🇦 till Sunday! Love to chat about #interpretability, understanding model internals, and finding yummy vegan food in Vancouver 🥬🍜
Reposted by Nishant Subramani @ ACL
🚨New #interpretability paper with @nsubramani23.bsky.social: 🕵️ Model Internal Sleuthing: Finding Lexical Identity and Inflectional Morphology in Modern Language Models
nsubramani23.bsky.social
🚨 Check out our new #interpretability paper: 🕵🏽 Model Internal Sleuthing led by the amazing @bearseascape.bsky.social who is an undergrad at @scsatcmu.bsky.social @ltiatcmu.bsky.social
nsubramani23.bsky.social
Excited to announce that I started at @googleresearch.bsky.social on the cloud team as a student researcher last month, working with Hamid Palangi on actionable #interpretability 🔍 to build better tool-using #agents ⚒️🤖
nsubramani23.bsky.social
Presenting this today at the poster session at #NAACL2025!

Come chat about interpretability, trustworthiness, and tool-using agents!

🗓️ - Thursday May 1st (today)
📍 - Hall 3
🕑 - 2:00–3:30pm
nsubramani23.bsky.social
🚀 Excited to share a new interp+agents paper: 🐭🐱 MICE for CATs: Model-Internal Confidence Estimation for Calibrating Agents with Tools appearing at #NAACL2025

This was work done @msftresearch.bsky.social last summer with Jason Eisner, Justin Svegliato, Ben Van Durme, Yu Su, and Sam Thomson

1/🧵
nsubramani23.bsky.social
At #NAACL2025 🌵till Sunday! Love to chat about interpretability, understanding model internals, and finding vegan food 🥬
nsubramani23.bsky.social
Come to our poster in Albuquerque on Thursday, 2:00–3:30pm, in the interpretability & analysis section!

Paper: aclanthology.org/2025.naacl-l...
Code (coming soon): github.com/microsoft/mi...

🧵/🧵
nsubramani23.bsky.social
MICE 🐭:
🎯 - significantly beats baselines on expected tool-calling utility, especially in high-risk scenarios
✅ - matches the expected calibration error of baselines
✅ - is sample-efficient
✅ - generalizes zero-shot to unseen tools

5/🧵
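Expected calibration error (ECE), mentioned above, bins predictions by confidence and averages the gap between each bin's mean confidence and its empirical accuracy. A minimal sketch of the standard equal-width-bin version (not necessarily the paper's exact variant):

```python
def expected_calibration_error(confs, correct, n_bins=10):
    """ECE: bin predictions by confidence, then average the
    |accuracy - confidence| gap, weighted by bin size."""
    bins = [[] for _ in range(n_bins)]
    for c, y in zip(confs, correct):
        idx = min(int(c * n_bins), n_bins - 1)  # clamp c == 1.0 into last bin
        bins[idx].append((c, y))
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(y for _, y in b) / len(b)
        ece += len(b) / len(confs) * abs(accuracy - avg_conf)
    return ece
```

Note this is exactly why calibration alone isn't sufficient: a predictor that always outputs the base rate gets ECE near zero while being useless for deciding individual tool calls.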
nsubramani23.bsky.social
Calibration is not sufficient: both an oracle and a model that just predicts the base rate are perfectly calibrated🤦🏽‍♂️

We develop a new metric, expected tool-calling utility 🛠️, to measure the utility of deciding whether or not to execute a tool call via a confidence score!

4/🧵
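The thread doesn't give the metric's formula, but the decision rule it scores can be sketched: execute a tool call only when the expected utility of executing, under confidence `conf`, beats abstaining. The utility values below are hypothetical placeholders, not numbers from the paper:

```python
def should_execute(conf, u_correct=1.0, u_wrong=-5.0, u_abstain=0.0):
    """Execute iff the expected utility of calling the tool exceeds
    the utility of abstaining. A large negative u_wrong models a
    high-risk tool (e.g. one with irreversible side effects)."""
    expected = conf * u_correct + (1 - conf) * u_wrong
    return expected > u_abstain
```

With these placeholder utilities the break-even confidence is 5/6, so a well-calibrated confidence score directly drives how often the agent acts versus defers.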
nsubramani23.bsky.social
We propose 🐭 MICE to better assess confidence when calling tools:

1️⃣ decode from each intermediate layer of an LM
2️⃣ compute similarity scores between each layer’s generation and the final output
3️⃣ train a probabilistic classifier on these features

3/🧵
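A toy sketch of the three steps above, using difflib string similarity as a stand-in for the paper's similarity scores and hand-written strings in place of real early-exit decoding — everything here is illustrative, not the actual implementation:

```python
from difflib import SequenceMatcher

def layer_similarity_features(layer_generations, final_output):
    """Step 2: one similarity score per intermediate-layer generation,
    comparing each against the model's final output."""
    return [SequenceMatcher(None, gen, final_output).ratio()
            for gen in layer_generations]

# Step 1 stand-in: pretend these strings were decoded from
# successive intermediate layers of the LM.
layer_gens = ["get weather", "weather(city)", "weather(city='Vienna')"]
final_out = "weather(city='Vienna')"

features = layer_similarity_features(layer_gens, final_out)
# Step 3 would train a probabilistic classifier (e.g. logistic
# regression) on these per-layer features to predict whether the
# generated tool call is correct, yielding a confidence score.
```

The intuition: if early layers already "agree" with the final output, the model is likely more certain of the call than its raw token probabilities suggest.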
nsubramani23.bsky.social
1️⃣ Tool-using agents need to be useful and safe as they take actions in the world
2️⃣ Language models are poorly calibrated

🤔 Can we use model internals to better calibrate language models to make tool-using agents safer and more useful?

2/🧵
nsubramani23.bsky.social
1) I'm working on using intermediate model generations from LLMs to calibrate tool-using agents ⚒️🤖 better than the output probabilities themselves do! Turns out you can 🥳

2) There's gotta be a nice geometric understanding of what's going on within LLMs when we tune them 🤔
lastpositivist.bsky.social
Bluesky academics, let's get to know each other! Quote this & tell me: 1) a project you are working on & 2) an odd idea/theory you aren't working on but keep thinking about

1. I came to hate my work and thinking so don't do it anymore.
2.
etvpod.bsky.social
Bluesky academics, let's get to know each other! Quote this & tell me: 1) a project you are working on & 2) an odd idea/theory you aren't working on but keep thinking about

1. Convincing everyone that everything is luck, all the way down.

2. LLMs can reason and understand in the external sense.