Stella Li
@stellali.bsky.social
1.3K followers 200 following 51 posts
PhD student @uwnlp.bsky.social @uwcse.bsky.social | visiting researcher @MetaAI | previously @jhuclsp.bsky.social https://stellalisy.com
Pinned
stellali.bsky.social
WHY do you prefer one thing over another?

Reward models treat preference as a black box😶‍🌫️ but human brains🧠 decompose decisions into hidden attributes

In our recent COLM paper, we built the first system to mirror how people really make decisions: 🎨PrefPalette✨

Why it matters👉🏻🧵
Reposted by Stella Li
maartensap.bsky.social
Day 1 (Tue Oct 7) 4:30-6:30pm, Poster Session 2

Poster #77: ALFA: Aligning LLMs to Ask Good Questions: A Case Study in Clinical Reasoning; led by
@stellali.bsky.social & @jiminmun.bsky.social
stellali.bsky.social
This project was done as part of the Meta FAIR AIM mentorship program. Special thanks to my amazing collaborators and awesome mentors @melaniesclar.bsky.social @jcqln_h @hunterjlang @AnsongNi @andrew_e_cohen @jacoby_xu @chan_young_park @tsvetshop.bsky.social ‪@asli-celikyilmaz.bsky.social‬ 🫶🏻💙
stellali.bsky.social
🌍Bonus: PrefPalette🎨 is a computational social science goldmine!

📊 Quantify community values at scale
📈 Track how norms evolve over time
🔍 Understand group psychology
📋 Move beyond surveys to revealed preferences
stellali.bsky.social
💡Potential real-world applications:

🛡️Smart content moderation—explains why content is flagged/decisions are made

🎯Interpretable LM alignment—revealing prominent attributes

⚙️Controllable personalization—giving user agency to personalize select attributes
stellali.bsky.social
🔍More importantly‼️we can see WHY preferences differ:

r/AskHistorians:📚values verbosity
r/RoastMe:💥values directness
r/confession:❤️values empathy

We visualize each group’s unique preference patterns—no more one-size-fits-all. Understand your audience at a glance🏷️
stellali.bsky.social
🏆Results across 45 Reddit communities:

📈Performance boost: +46.6% vs GPT-4o
💪Outperforms other training-based baselines w/ statistical significance
🕰️Robust to temporal shifts—trained pref models can be used out of the box!
stellali.bsky.social
⚙️How it works (pt.2)

1: 🎛️Train compact, efficient detectors for every attribute

2: 🎯Learn community-specific attribute weights during preference training

3: 🔧Add attribute embeddings to preference model for accurate & explainable predictions
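The three steps above can be sketched end to end. This is a hypothetical toy version (the attribute names, heuristics, and weights are illustrative, not from the paper; the real system uses 19 trained detectors and weights learned during preference training):

```python
# Toy sketch of attribute-mediated preference scoring, PrefPalette-style.
# All names and numbers below are illustrative placeholders.

ATTRIBUTES = ["verbosity", "directness", "empathy"]  # the paper uses 19

def detect_attributes(text: str) -> dict:
    """Stand-in for step 1's trained per-attribute detectors.
    Crude heuristics here, just to make the pipeline concrete."""
    return {
        "verbosity": min(len(text.split()) / 50.0, 1.0),
        "directness": 1.0 if text.strip().endswith("!") else 0.3,
        "empathy": 0.8 if "sorry" in text.lower() else 0.2,
    }

# Step 2: community-specific attribute weights (learned during
# preference training in the real system; hard-coded here).
COMMUNITY_WEIGHTS = {
    "r/AskHistorians": {"verbosity": 0.7, "directness": 0.1, "empathy": 0.2},
    "r/RoastMe":       {"verbosity": 0.1, "directness": 0.8, "empathy": 0.1},
}

def preference_score(text: str, community: str) -> float:
    """Step 3: combine detector outputs with community weights, so the
    final score is explainable attribute by attribute."""
    attrs = detect_attributes(text)
    weights = COMMUNITY_WEIGHTS[community]
    return sum(weights[a] * attrs[a] for a in ATTRIBUTES)
```

Because the score is a weighted sum over named attributes, you can read off *why* one comment beats another in a given community rather than getting a single opaque reward.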
stellali.bsky.social
⚙️How it works (prep stage)

📜Define 19 sociolinguistic & cultural attributes from the literature
🏭Novel preference-data generation pipeline to isolate attributes

Our pipeline generates pairwise data along *any* decomposed dimension, w/ applications beyond preference modeling
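The attribute-isolating idea can be sketched in a few lines. This is an illustrative skeleton, not the paper's pipeline: `rewrite` stands in for an LLM call and is injected so the structure stays testable.

```python
# Hypothetical sketch of an attribute-isolating pair generator: rewrite a
# base comment twice so the resulting pair differs on exactly ONE attribute.

def make_attribute_pair(base_text, attribute, rewrite):
    """Return (high, low): versions of base_text that differ only in how
    strongly they express `attribute`, holding everything else fixed."""
    high = rewrite(base_text, attribute, "high")
    low = rewrite(base_text, attribute, "low")
    return high, low

def build_pairs(base_texts, attributes, rewrite):
    """One isolated pair per (text, attribute) combination: pairwise data
    along any decomposed dimension."""
    return [(t, a, *make_attribute_pair(t, a, rewrite))
            for t in base_texts for a in attributes]
```

Because each pair varies a single dimension, a preference model trained on it learns that dimension's effect in isolation.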
stellali.bsky.social
Meet PrefPalette🎨! Our approach:

🔍⚖️models preferences w/ 19 attribute detectors and dynamic, context-aware weights

🕶️👍uses unobtrusive signals from Reddit to avoid response bias

🧠mirrors attribute-mediated human judgment—so you know not just what it predicts, but *why*🧐
stellali.bsky.social
🔬Cognitive science reveals how humans break choices into attributes, e.g.:

😂 Humor
❤️ Empathy
💬 Conformity
...then weight them based on context (e.g. comedy vs counseling).

These traits shape every decision, from product picks to conversation tone. Your mind is a colorful palette🎨
stellali.bsky.social
🚨Current preference models only output a reward/score:

❌No transparency in decision-making
❌One-size-fits-all scores: personalization breaks easily
❌Use explicit annotations (response bias)

They can’t adapt to individual tastes, can’t debug errors, and fail to build trust🙅
Reposted by Stella Li
esfrankel.bsky.social
Want to quickly sample high-quality images from diffusion models, but can’t afford the time or compute to distill them? Introducing S4S, or Solving for the Solver, which learns the coefficients and discretization steps for a DM solver to improve few-NFE generation.

Thread 👇 1/
stellali.bsky.social
This work was jointly done with the amazing @jiminmun.bsky.social !
And a huge shout-out to our awesome collaborators and mentors @faebrahman.bsky.social, Jonathan Ilgen, Yulia (@tsvetshop.bsky.social), and @maartensap.bsky.social 🩵🥰
stellali.bsky.social
Why this matters for AI safety & reliability: 🛡️

Better information gathering = Better decisions✅
Proactive questioning = Fewer blind spots🧐
Structured attributes = More controllable AI🤖
Interactive systems = More natural AI assistants🫶🏻
stellali.bsky.social
ALFA isn't just for medicine! The framework could be adapted to ANY field where proactive information gathering matters:

Legal consultation ⚖️
Financial advising 💰
Educational tutoring 📚
Investigative journalism 🕵️

Anywhere an AI needs to ask (not just answer), you should try ALFA out!🌟
stellali.bsky.social
🌟 Impressive Generalization!
ALFA-trained models maintain strong performance even on completely new interactive medical tasks (MediQ-MedQA), highlighting ALFA’s potential for broader applicability in real-world clinical scenarios‼️
stellali.bsky.social
🔬 Key Finding #2: Every Attribute Matters!

Removing any single attribute hurts performance‼️

Grouping general (clarify, focus, answerability) vs. clinical (medical accuracy, diagnostic relevance, avoiding DDX bias) attributes leads to drastically different outputs👩‍⚕️

Check out some cool examples!👇
stellali.bsky.social
🔬 Key Finding #1: Preference Learning > Supervised Learning

Is it just good synthetic data❓ No❗️

Simply showing good examples isn't enough! Models need to learn directional differences between good and bad questions.

(but SFT alone, without DPO, doesn't work either!)
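The intuition behind "SFT alone isn't enough" is visible in the standard DPO objective (Rafailov et al., 2023), which the thread indicates ALFA uses. A minimal per-pair sketch, where the arguments are summed token log-probs of a question under the policy and a frozen reference model:

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Per-pair DPO loss: -log sigmoid(beta * margin). The margin compares
    the policy's log-prob gain on the chosen vs. rejected question,
    relative to the reference model. Unlike SFT, the gradient depends on
    the *difference* between good and bad questions, not on the good
    example alone."""
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)
```

SFT's loss touches only the chosen side, so it can raise the probability of good questions without ever pushing them apart from near-miss bad ones; the margin term above is exactly the "directional difference" signal.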
stellali.bsky.social
Results show ALFA’s strengths🚀
ALFA-aligned models achieve:
⭐️56.6% reduction in diagnostic errors🦾
⭐️64.4% win rate in question quality✅
⭐️Strong generalization
compared with SoTA instruction-tuned baseline LLMs.
stellali.bsky.social
The secret sauce of ALFA? 🔍
6 key attributes from theory (cognitive science, medicine):
General:
- Clarity ✨
- Focus 🎯
- Answerability 💭
Clinical:
- Medical Accuracy 🏥
- Diagnostic Relevance 🔬
- Avoiding Bias ⚖️
Each attribute contributes to a different aspect of the complex goal of question asking!
stellali.bsky.social
📚 Exciting Dataset Release: MediQ-AskDocs!
17k real clinical interactions
80k attribute-specific question variations
302 expert-annotated scenarios
Perfect for research on interactive medical AI
First major dataset for training & evaluating medical question-asking! 🎯
huggingface.co/datasets/ste...
stellalisy/MediQ_AskDocs_preference · Datasets at Hugging Face
stellali.bsky.social
Introducing ALFA: ALignment via Fine-grained Attributes 🎓
A systematic, general question-asking framework that:
1️⃣ Decomposes the concept of good questioning into attributes📋
2️⃣ Generates targeted attribute-specific data📚
3️⃣ Teaches LLMs through preference learning🧑‍🏫