Xiaoyan Bai
@elenal3ai.bsky.social
430 followers 170 following 24 posts
PhD @UChicagoCS / BE in CS @Umich / ✨AI/NLP transparency and interpretability/📷🎨photography painting
Pinned
elenal3ai.bsky.social
🚨 New paper alert 🚨

Ever asked an LLM-as-Marilyn Monroe who the US president was in 2000? 🤔 Should the LLM answer at all? We call these clashes Concept Incongruence. Read on! ⬇️

1/n 🧵
Reposted by Xiaoyan Bai
divingwithorcas.bsky.social
HR Simulator™: a game where you gaslight, deflect, and “let’s circle back” your way to victory.
Every email a boss fight, every “per my last message” a critical hit… or maybe you just overplayed your hand 🫠
Can you earn Enlightened Bureaucrat status?

(link below!)
Reposted by Xiaoyan Bai
chenhaotan.bsky.social
🚀 We’re thrilled to announce the upcoming AI & Scientific Discovery online seminar! We have an amazing lineup of speakers.

This series will dive into how AI is accelerating research, enabling breakthroughs, and shaping the future of research across disciplines.

ai-scientific-discovery.github.io
Reposted by Xiaoyan Bai
chenhaotan.bsky.social
As AI becomes increasingly capable of conducting analyses and following instructions, my prediction is that the role of scientists will increasingly focus on identifying and selecting important problems to work on ("selector"), and effectively evaluating analyses performed by AI ("evaluator").
Reposted by Xiaoyan Bai
chenhaotan.bsky.social
We are proposing the second workshop on AI & Scientific Discovery at EACL/ACL. The workshop will explore how AI can advance scientific discovery. Please use this Google form to indicate your interest (corrected link):

forms.gle/MFcdKYnckNno...

More in the 🧵! Please share! #MLSky 🧠
Program Committee Interest for the Second Workshop on AI & Scientific Discovery
We are proposing the second workshop on AI & Scientific Discovery at EACL/ACL (Annual meetings of The Association for Computational Linguistics, the European Language Resource Association and Internat...
elenal3ai.bsky.social
⚡️Ever asked an LLM-as-Marilyn Monroe about the 2020 election? Our paper calls this concept incongruence, common in both AI and how humans create and reason.
🧠Read my blog to learn what we found, why it matters for AI safety and creativity, and what's next: cichicago.substack.com/p/concept-in...
Reposted by Xiaoyan Bai
chenhaotan.bsky.social
Prompting is our most successful tool for exploring LLMs, but the term evokes eye-rolls and grimaces from scientists. Why? Because prompting as scientific inquiry has become conflated with prompt engineering.

This is holding us back. 🧵and new paper with @ari-holtzman.bsky.social .
Reposted by Xiaoyan Bai
chenhaotan.bsky.social
When you walk into the ER, you could get a doc:
1. Fresh from a week of not working
2. Tired from working too many shifts

@oziadias.bsky.social has been both and thinks that they're different! But can you tell from their notes? Yes we can! Paper @natcomms.nature.com www.nature.com/articles/s41...
elenal3ai.bsky.social
Humbled to receive an honorable mention🌟
chenhaotan.bsky.social
Congratulations to all best poster awards and honorable mentions!
Reposted by Xiaoyan Bai
chenhaotan.bsky.social
Since @elenal3ai.bsky.social cannot make it, I presented the poster on concept incongruence: arxiv.org/abs/2505.14905
elenal3ai.bsky.social
E.g., this is how ChatGPT replied to the request: 'Can you draw a picture of quantum mechanics as presidents?'
elenal3ai.bsky.social
On the other hand, concept incongruence is a window to creativity, whether in art or in scientific discoveries. Combining seemingly conflicting concepts is critical to creating something genuinely new. Exploring behavior under concept incongruence may be key to unlocking unknown model capabilities!
elenal3ai.bsky.social
More broadly, clashing concepts embed contradictory signals into a model’s latent space. To address these collisions, LLM developers need to tackle the specification problem. We believe there are important open questions in identifying incongruence and improving the specification.
elenal3ai.bsky.social
In addition to abstention behavior, our findings reveal deeper, inherent conflicts within the models' internal representations—conflicts that extend well beyond prompt-level contradictions (e.g., warped temporal representations of events before the character's death).
elenal3ai.bsky.social
On the one hand, concept incongruence is very common. In role-play, “death” is our most vivid test case, yet the same tension appears any time a model must juggle incompatible constraints. This also shows up in political discussions, alignment (helpful vs. harmless), and essentially everywhere.
elenal3ai.bsky.social
I am glad that you found our paper entertaining! This is a great point for my follow-up thread on the implications of concept incongruence. Our main goal is to raise awareness and provide clarity around concept incongruence.
lchoshen.bsky.social
Highly entertaining paper and writeup, but does it really matter? Is it important that models can't abstain on counterfactuals?
Or that they leak information?
elenal3ai.bsky.social
Shout out to my undergrad collaborators Ike Peng and Aditya Singh, and many thanks to my advisor @chenhaotan.bsky.social ! Special thanks to @ari-holtzman.bsky.social
elenal3ai.bsky.social
Unlike hallucinations, concept incongruence highlights structural problems about specification and thus provides opportunities for progress. Our work represents a first step toward formally defining, analyzing, and managing such behaviors. Dive in: 👉 arxiv.org/abs/2505.14905

7/n🧵
Concept Incongruence: An Exploration of Time and Death in Role Playing
Consider this prompt "Draw a unicorn with two horns". Should large language models (LLMs) recognize that a unicorn has only one horn by definition and ask users for clarifications, or proceed to gener...