Lightnews — Scholar-powered news

Dang Nguyen @divingwithorcas.bsky.social · 11h

On the Effectiveness and Generalization of Race Representations for Debiasing High-Stakes Decisions

Understanding and mitigating biases is critical for the adoption of large language models (LLMs) in high-stakes decision-making. We introduce Admissions and Hiring, decision tasks with hypothetical ap...

arxiv.org

Dang Nguyen @divingwithorcas.bsky.social · 11h

📣 Announcing our poster session at COLM 2025:

On the Effectiveness and Generalization of Race Representations for Debiasing High-Stakes Decisions

I will talk about biases in LLMs and how to mitigate them. Come say hi!

Poster #43, 4:30 PM

Reposted by Dang Nguyen

Dallas Card @dallascard.bsky.social · 6d

This game from UChicago is incredible! It might be a bit painful to play, especially for those of us who already spend too much time on email, but the concept and execution are brilliant!

Dang Nguyen @divingwithorcas.bsky.social · 12d

HR Simulator™: a game where you gaslight, deflect, and “let’s circle back” your way to victory.
Every email a boss fight, every “per my last message” a critical hit… or maybe you just overplayed your hand 🫠
Can you earn Enlightened Bureaucrat status?

(link below!)

Dang Nguyen @divingwithorcas.bsky.social · 10d

and yes, you can play on your mobile browser: hrsimulator.communicationgames.ai

HR Simulator™: Be the Person You Hate

A game that will change how you write emails.

hrsimulator.communicationgames.ai

Dang Nguyen @divingwithorcas.bsky.social · 10d

Playing HR Simulator™: think I'm getting on Brittany's good side

This is what she says about my attempt to get Dave to return to in-person work.

Any big tech company wanna hire me for HR? 👀

#HRSimulator #RoastedByBrittany

Dang Nguyen @divingwithorcas.bsky.social · 11d

Please use a VPN. We're sorry for any inconvenience!

Reposted by Dang Nguyen

Chicago Human+AI Lab @chicagohai.bsky.social · 12d

Home-grown at CHAI and @uchicagoci.bsky.social!! The first ever AI-driven game from academia 🎮 Give it a go and let us know your rank on the leaderboard!

Dang Nguyen @divingwithorcas.bsky.social · 12d

Stay tuned for more on communication games! Big thanks to @ari-holtzman.bsky.social @Harvey Fu @chenhaotan.bsky.social @Peter West for making this project happen!

Dang Nguyen @divingwithorcas.bsky.social · 12d

hrsimulator.communicationgames.ai

We’re serious! Economic coordination happens via emails. How do humans fare against AIs in getting things done with words?

We see a genre co-emerging with LLMs: communication games, where communication is crucial and not just “cheap talk” like Mafia or Diplomacy.

Reposted by Dang Nguyen

chenhaotan.bsky.social @chenhaotan.bsky.social · Jul 9

Prompting is our most successful tool for exploring LLMs, but the term evokes eye-rolls and grimaces from scientists. Why? Because prompting as scientific inquiry has become conflated with prompt engineering.

This is holding us back. 🧵and new paper with @ari-holtzman.bsky.social .

Reposted by Dang Nguyen

chenhaotan.bsky.social @chenhaotan.bsky.social · Jul 2

When you walk into the ER, you could get a doc:
1. Fresh from a week of not working
2. Tired from working too many shifts

@oziadias.bsky.social has been both and thinks that they're different! But can you tell from their notes? Yes we can! Paper @natcomms.nature.com www.nature.com/articles/s41...

Reposted by Dang Nguyen

Chicago Human+AI Lab @chicagohai.bsky.social · Jun 24

@chachachen.bsky.social @haokunliu.bsky.social @divingwithorcas.bsky.social present posters on human-AI decision making, hypothesis generation, interpretability and fairness at MMLS 2025!

Reposted by Dang Nguyen

chenhaotan.bsky.social @chenhaotan.bsky.social · Jun 23

Since @elenal3ai.bsky.social cannot make it, I presented the poster on concept incongruence: arxiv.org/abs/2505.14905

Reposted by Dang Nguyen

Xiaoyan Bai @elenal3ai.bsky.social · May 27

🚨 New paper alert 🚨

Ever asked an LLM-as-Marilyn Monroe who the US president was in 2000? 🤔 Should the LLM answer at all? We call these clashes Concept Incongruence. Read on! ⬇️

1/n 🧵

Reposted by Dang Nguyen

Mingxuan (Aldous) Li @itea1001.bsky.social · May 12

1/n 🚀🚀🚀 Thrilled to share our latest work🔥: HypoEval - Hypothesis-Guided Evaluation for Natural Language Generation! 🧠💬📊
There’s a lot of excitement around using LLMs for automated evaluation, but many methods fall short on alignment or explainability — let’s dive in! 🌊

Reposted by Dang Nguyen

Mourad Heddaya @mheddaya.bsky.social · May 1

🧑‍⚖️How well can LLMs summarize complex legal documents? And can we use LLMs to evaluate?

Excited to be in Albuquerque presenting our paper this afternoon at @naaclmeeting 2025!

Reposted by Dang Nguyen

Haokun Liu @haokunliu.bsky.social · Apr 28

🚀🚀🚀Excited to share our latest work: HypoBench, a systematic benchmark for evaluating LLM-based hypothesis generation methods!

There is much excitement about leveraging LLMs for scientific hypothesis generation, but principled evaluations are missing - let’s dive into HypoBench together.

Reposted by Dang Nguyen

chenhaotan.bsky.social @chenhaotan.bsky.social · Apr 21

The Midwest Machine Learning Symposium will take place in Chicago on June 23-24 on the University of Chicago campus (midwest-ml.org/2025/). We have an amazing lineup of speakers: @profsanjeevarora.bsky.social from Princeton, Heng Ji from UIUC, Tuomas Sandholm from CMU, and @ravenben.bsky.social from UChicago.

Reposted by Dang Nguyen

chenhaotan.bsky.social @chenhaotan.bsky.social · Apr 21

Encourage your students to submit posters and register! Limited free housing is available for student participants only, by request, on a first-come, first-served basis.

We are also actively looking for sponsors. Reach out if you are interested!

Please repost! Help spread the word!

Dang Nguyen @divingwithorcas.bsky.social · Apr 14

12/n

Big thanks to @chenhaotan.bsky.social for advice on the project, as well as helpful feedback from the wonderful members of the @chicagohai.bsky.social lab! Check out our code at github.com/ChicagoHAI/l....

DM me with any questions!

GitHub - ChicagoHAI/llm-prediction-bias

github.com

Dang Nguyen @divingwithorcas.bsky.social · Apr 14

11/n

Strangely, changing the prompt can change how a model represents race. In some cases, then, the model’s representation may be sensitive to spurious prompt features, which poses a challenge to the generalizability of debiasing methods. Future work on debiasing should take this into account.

Dang Nguyen @divingwithorcas.bsky.social · Apr 14

10/n

We found the race subspace generalizes cross-family (from admissions to hiring) and, to a lesser extent, cross-explicitness (from implicit race via name to explicit race), but it fails to generalize cross-prompt (from one prompt template to another).

Dang Nguyen @divingwithorcas.bsky.social · Apr 14

9/n

So we were able to debias via interventions on the race subspaces, but do they generalize? Here, the story gets more complicated.
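
For readers wondering what an intervention on a subspace can look like, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the method from the paper or its code release: it assumes the subspace is already given as a handful of linearly independent direction vectors (how to find them is out of scope here), and the function names `orthonormal_basis`, `project_out`, and `replace_with_mean` are hypothetical. It shows two generic options: removing the subspace component of each hidden state, or replacing it with the average over the batch so every example carries the same value there.

```python
import numpy as np


def orthonormal_basis(directions: np.ndarray) -> np.ndarray:
    """Turn k linearly independent directions of shape (k, d) into an
    orthonormal basis of shape (d, k) spanning the same subspace."""
    q, _ = np.linalg.qr(directions.T)  # reduced QR: q has shape (d, k)
    return q


def project_out(hidden: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Remove the subspace component from each hidden state (rows of `hidden`)."""
    coords = hidden @ basis            # (n, k) coordinates inside the subspace
    return hidden - coords @ basis.T   # keep only the orthogonal complement


def replace_with_mean(hidden: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Overwrite each state's subspace coordinates with the batch average,
    leaving everything outside the subspace untouched."""
    coords = hidden @ basis
    mean_coords = coords.mean(axis=0, keepdims=True)
    return hidden - coords @ basis.T + mean_coords @ basis.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    hidden = rng.normal(size=(6, 8))       # toy stand-in for model activations
    directions = rng.normal(size=(2, 8))   # assumed, pre-computed subspace directions
    basis = orthonormal_basis(directions)
    print(np.abs(project_out(hidden, basis) @ basis).max())  # close to zero
```

Both functions leave the part of each vector outside the subspace unchanged, so a subspace that changes between settings would need to be re-estimated before an edit like this could carry over.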
