Hiba Ahsan
@hibaahsan.bsky.social
PhD student @ Northeastern University, Clinical NLP https://hibaahsan.github.io/ she/her
Reposted by Hiba Ahsan
On the Good Fight podcast with substack.com/@yaschamounk, I give a quick but careful primer on how modern AI works.

I also chat about our responsibility as machine learning scientists, and what we need to fix to get AI right.

Take a listen and reshare:

www.persuasion.community/p/david-bau
David Bau on How Artificial Intelligence Works
Yascha Mounk and David Bau delve into the “black box” of AI.
Reposted by Hiba Ahsan
Who is going to be at #COLM2025?

I want to draw your attention to a COLM paper by my student @sfeucht.bsky.social that has totally changed the way I think and teach about LLM representations. The work is worth knowing.

And you can meet Sheridan at COLM, Oct 7!
bsky.app/profile/sfe...
Reposted by Hiba Ahsan
sfeucht.bsky.social
[📄] Are LLMs mindless token-shifters, or do they build meaningful representations of language? We study how LLMs copy text in-context, and physically separate out two types of induction heads: token heads, which copy literal tokens, and concept heads, which copy word meanings.
Reposted by Hiba Ahsan
chantalsh.bsky.social
I'm searching for some comp/ling experts to provide a precise definition of “slop” as it refers to text (see: corp.oup.com/word-of-the-...)

I put together a google form that should take no longer than 10 minutes to complete: forms.gle/oWxsCScW3dJU...
If you can help, I'd appreciate your input! 🙏
Oxford Word of the Year 2024 - Oxford University Press
The Oxford Word of the Year 2024 is 'brain rot'. Discover more about the winner, our shortlist, and 20 years of words that reflect the world.
hibaahsan.bsky.social
4. Finally, we look at how such interventions can be used to detect implicit biases in clinical tasks. We mechanistically control gender/race and find that Olmo considers females to be at higher risk of depression than males, and Black patients to be at higher risk than white patients.
hibaahsan.bsky.social
3. Race is more complicated. We find multiple patches and are able to intervene to a degree.
hibaahsan.bsky.social
2. These patches generalize to non-clinical domains!
hibaahsan.bsky.social
1. We perform activation patching in the context of clinical vignette generation and find that gender information is highly localized. Patching MLP activations in a single layer consistently alters patient gender.
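For readers new to activation patching, here is a minimal sketch of the idea on a toy residual MLP stack. It is not the paper's code and not a real LLM: the model, the inputs, and every name below (`make_toy_model`, `run`, `patching_effects`) are hypothetical and exist only for illustration. The sketch caches each layer's MLP output from a "source" run, overwrites one layer's MLP output in a "target" run, and measures how far the final state moves.

```python
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def relu(v):
    return [max(0.0, a) for a in v]


def make_toy_model(n_layers=4, d_model=6, d_hidden=12, seed=0):
    """Random toy weights; a stand-in for a real model (assumption)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]

    return [(mat(d_hidden, d_model), mat(d_model, d_hidden)) for _ in range(n_layers)]


def run(model, x, patch=None):
    """Forward pass through the residual stack.

    patch: optional (layer_index, mlp_output) that overwrites the MLP
    output of that one layer. Returns (final state, per-layer MLP outputs).
    """
    h = list(x)
    cache = []
    for i, (W_in, W_out) in enumerate(model):
        mlp_out = matvec(W_out, relu(matvec(W_in, h)))
        if patch is not None and patch[0] == i:
            mlp_out = list(patch[1])  # the intervention
        cache.append(mlp_out)
        h = [a + b for a, b in zip(h, mlp_out)]  # residual connection
    return h, cache


def patching_effects(model, source_x, target_x):
    """Patch each layer in turn; return the L2 shift of the final state."""
    _, source_cache = run(model, source_x)
    clean, _ = run(model, target_x)
    effects = []
    for layer in range(len(model)):
        patched, _ = run(model, target_x, patch=(layer, source_cache[layer]))
        effects.append(sum((p - c) ** 2 for p, c in zip(patched, clean)) ** 0.5)
    return effects


model = make_toy_model()
# Two toy inputs that differ in a single feature (a stand-in for one attribute).
source_x = [1.0, 0.0, 0.5, -0.5, 0.2, 0.0]
target_x = [-1.0, 0.0, 0.5, -0.5, 0.2, 0.0]
effects = patching_effects(model, source_x, target_x)
for layer, effect in enumerate(effects):
    print(f"layer {layer}: final-state shift {effect:.3f}")
```

In a real model the same loop would typically be written with forward hooks on the MLP modules, and the effect would be measured on the generated text or on output probabilities rather than on a raw vector distance.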
hibaahsan.bsky.social
LLMs are known to perpetuate social biases in clinical tasks. Can we locate and intervene upon LLM activations that encode patient demographics like gender and race? 🧵

Work w/ @arnabsensharma.bsky.social, @silvioamir.bsky.social, @davidbau.bsky.social, @byron.bsky.social

arxiv.org/abs/2502.13319