Lightnews — Scholar-powered news

Tal Haklay

@talhaklay.bsky.social

56 followers 330 following 28 posts

NLP | Interpretability | PhD student at the Technion


Pinned

Tal Haklay @talhaklay.bsky.social · Mar 6

1/13 LLM circuits tell us where the computation happens inside the model—but the computation varies by token position, a key detail often ignored!
We propose a method to automatically find position-aware circuits, improving faithfulness while keeping circuits compact. 🧵👇

Tal Haklay

@talhaklay.bsky.social

Our paper "Position-Aware Automatic Circuit Discovery" got accepted to ACL! 🎉

Huge thanks to my collaborators🙏
@hadasorgad.bsky.social
@davidbau.bsky.social
@amuuueller.bsky.social
@boknilev.bsky.social

See you in Vienna! 🇦🇹 #ACL2025 @aclmeeting.bsky.social

May 22, 2025 at 8:11 AM

Reposted by Tal Haklay

Actionable Interpretability Workshop ICML2025

@actinterp.bsky.social

🚨 We're looking for more reviewers for the workshop!
📆 Review period: May 24-June 7

If you're passionate about making interpretability useful and want to help shape the conversation, we'd love your input.

💡🔍 Self-nominate here:
docs.google.com/forms/d/e/1F...

May 20, 2025 at 12:05 AM

Tal Haklay

@talhaklay.bsky.social

We knew many of you wanted to submit to our Actionable Interpretability workshop, but we didn’t expect to crash Overleaf! 😏🍃

Only 5 days left ⏰!
Got a paper accepted to ICML that fits our theme?
Submit it to our conference track!
👉 @actinterp.bsky.social

May 14, 2025 at 1:04 PM

Reposted by Tal Haklay

Aaron Mueller

@amuuueller.bsky.social

This was a huge collaboration with many great folks! If you get a chance, be sure to talk to Atticus Geiger, @sarah-nlp.bsky.social, @danaarad.bsky.social, Iván Arcuschin, @adambelfki.bsky.social, @yiksiu.bsky.social, Jaden Fiotto-Kaufmann, @talhaklay.bsky.social, @michaelwhanna.bsky.social, ...

April 23, 2025 at 6:15 PM

Tal Haklay

@talhaklay.bsky.social

🚨 Call for Papers is Out!

The First Workshop on Actionable Interpretability will be held at ICML 2025 in Vancouver!

📅 Submission Deadline: May 9
Follow us >> @ActInterp

🧠Topics of interest include: 👇

April 7, 2025 at 1:51 PM

Tal Haklay

@talhaklay.bsky.social

Amazing news: our workshop was accepted to ICML 2025!

Interpretability research sheds light on how models work—but too often, those insights don’t translate into actions that improve them.
Our workshop aims to challenge the interpretability community to go further.

Mor Geva @megamor2.bsky.social · Mar 31

🎉 Our Actionable Interpretability workshop has been accepted to #ICML2025! 🎉
> Follow @actinterp.bsky.social
> Website actionable-interpretability.github.io

@talhaklay.bsky.social @anja.re @mariusmosbach.bsky.social @sarah-nlp.bsky.social @iftenney.bsky.social

Paper submission deadline: May 9th!

March 31, 2025 at 6:29 PM

Tal Haklay

@talhaklay.bsky.social

March 6, 2025 at 10:15 PM

Reposted by Tal Haklay

Martin Tutek

@mtutek.bsky.social

🚨🚨 New preprint 🚨🚨

Ever wonder whether verbalized CoTs correspond to the internal reasoning process of the model?

We propose a novel parametric faithfulness approach, which erases information contained in CoT steps from the model parameters to assess CoT faithfulness.

arxiv.org/abs/2502.14829

Measuring Faithfulness of Chains of Thought by Unlearning Reasoning Steps

When prompted to think step-by-step, language models (LMs) produce a chain of thought (CoT), a sequence of reasoning steps that the model supposedly used to produce its prediction. However, despite mu...


February 21, 2025 at 12:43 PM
