Anthony GX-Chen
@agx-chen.bsky.social
PhD student at NYU CILVR. Prev: Master's at McGill / Mila. || RL, ML, Neuroscience. https://im-ant.github.io/
agx-chen.bsky.social
This work has been accepted to #COLM2025. If you are in Montreal this week for COLM and would like to chat about this (or anything related to discovery / exploration / RL), drop me a note!

Poster session 2: Tuesday Oct 7, 4:30-6:30pm
Poster number 68
agx-chen.bsky.social
Language model (LM) agents are all the rage now—but they may exhibit cognitive biases when inferring causal relationships!

We evaluate LMs on a cognitive task to find:
- LMs struggle with certain simple causal relationships
- They show biases similar to human adults (but not children)

🧵⬇️
Example of the Blicket Test experiment. A subset of objects activates the machine according to an unobserved rule ("disjunctive" / "conjunctive"). The agent must interact with the environment by placing objects on or off the machine to figure out the rule.
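The rule described in the caption above can be sketched in a few lines of Python. This is an illustrative sketch of the setup, not the paper's code; the function and variable names are assumptions.

```python
# Hypothetical sketch of the Blicket Test machine (names are illustrative).
# `blickets` is the hidden set of causal objects; `rule` is the hidden rule.
def machine_activates(on_machine, blickets, rule):
    """Return True if the machine lights up for the objects placed on it."""
    present = blickets & on_machine
    if rule == "disjunctive":        # any one blicket suffices (OR)
        return len(present) >= 1
    return present == blickets       # "conjunctive": all blickets needed (AND)

# Example: objects A, B; true rule is conjunctive with blickets {A, B}.
blickets, rule = {"A", "B"}, "conjunctive"
print(machine_activates({"A"}, blickets, rule))       # A alone is not enough
print(machine_activates({"A", "B"}, blickets, rule))  # both blickets present
```

The agent only observes activations; it must choose which subsets to place on the machine to distinguish the two rules.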
agx-chen.bsky.social
Thank you for the shoutout Alison! We actually just arXiv-ed the paper. Attaching the thread below :)

bsky.app/profile/agx-...
agx-chen.bsky.social
How can we help LMs think more rigorously, like scientists?

We fix this “biased prior” by explicitly sampling a higher-entropy hypothesis distribution, then prompting the LM to maximize info gain under the new distribution. This significantly improves exploration and inference performance!
The agent samples (without replacement) from the LM prior at inference time to construct a new prior with higher entropy. It then iteratively prompts the LM to take actions that maximize information gain under the new distribution, and eliminates hypotheses inconsistent with new observations. Hypothesis sampling allows the agent to correct its "disjunctive bias" and perform equally well on both disjunctive and conjunctive environments when sufficiently many hypotheses are sampled.
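The sample-and-eliminate loop described above can be sketched as follows. All names here are illustrative assumptions, not the paper's implementation; for simplicity, the full hypothesis space stands in for sampling hypotheses without replacement from the LM's prior, and a greedy one-step information gain stands in for prompting the LM.

```python
import math
from itertools import combinations

# A hypothesis is a pair (set_of_blickets, rule). Enumerating the space here
# is a stand-in for sampling hypotheses from the LM prior (assumption).
objects = ["A", "B", "C"]

def predicts_activation(hypothesis, on_machine):
    """Does this hypothesis predict the machine lights up for `on_machine`?"""
    blickets, rule = hypothesis
    if rule == "disjunctive":            # any one blicket activates it
        return len(blickets & on_machine) >= 1
    return blickets <= on_machine        # conjunctive: all blickets required

hypotheses = [(frozenset(s), rule)
              for r in range(1, len(objects) + 1)
              for s in combinations(objects, r)
              for rule in ("disjunctive", "conjunctive")]

def info_gain(action, hyps):
    """Entropy (bits) of the action's binary outcome under a uniform prior
    over remaining hypotheses; higher means the action is more informative."""
    n_true = sum(predicts_activation(h, action) for h in hyps)
    ent = 0.0
    for k in (n_true, len(hyps) - n_true):
        if 0 < k < len(hyps):
            p = k / len(hyps)
            ent -= p * math.log2(p)
    return ent

# One greedy step: pick the most informative action, observe the true
# environment (conjunctive, blickets {A, B}), and discard hypotheses whose
# prediction disagrees with the observation.
actions = [frozenset(s) for r in range(len(objects) + 1)
           for s in combinations(objects, r)]
best = max(actions, key=lambda a: info_gain(a, hypotheses))
observed = predicts_activation((frozenset({"A", "B"}), "conjunctive"), best)
hypotheses = [h for h in hypotheses if predicts_activation(h, best) == observed]
print(len(hypotheses))  # hypotheses remaining after one informative action
```

In the paper's setting the LM itself proposes the informative actions; this sketch just makes the eliminate-inconsistent-hypotheses bookkeeping concrete.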
agx-chen.bsky.social
Why do LMs have this "cognitive bias"? We compare the LMs' behaviour to human data and find that most LMs behave like adults rather than like children, who are more receptive to alternative hypotheses. This may suggest that LMs trained on adult-generated data inherit the same human irrationalities.
(Left) When presented with conjunctive evidence, most LMs tend to prefer disjunctive inferences, similar to human adults. (Right) LM exploration behaviour appears more affected by the underlying causal rule, while children's behaviour does not appear to show significant differences.
agx-chen.bsky.social
We evaluate LMs on the classic "Blicket Test" pioneered by @alisongopnik.bsky.social. The goal: assess their ability to discover and infer causal relationships.

Across a range of models, LMs consistently struggle with the "conjunctive" (AND) rule but not the "disjunctive" (OR) rule.
Causal exploration efficiency of different LMs, measured by the number of hypotheses remaining after each step in the environment. A lower y value means the agent generated observations that eliminated more hypotheses. The goal is to eliminate all but one hypothesis (the true causal relationship).
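For scale, the "eliminate all but one hypothesis" goal above is over a small combinatorial space: assuming (for illustration) that each hypothesis is a nonempty set of blickets plus one of the two rules, n objects give 2·(2^n − 1) candidates.

```python
# Hypothesis-space size for the Blicket Test, assuming each hypothesis is a
# nonempty blicket set plus one of two rules (an illustrative assumption).
def num_hypotheses(n_objects):
    return 2 * (2 ** n_objects - 1)

print(num_hypotheses(3))  # 3 objects -> 14 candidate hypotheses
```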
Reposted by Anthony GX-Chen
alisongopnik.bsky.social
Fascinating preprint using our "blicket detector" paradigm from Chen et al. at NYU & Mila. LLMs make the same causal inference mistakes that adults make but 4-year-olds don't! Of course, models are trained on adult data; kids figure it out for themselves.
im-ant.github.io/publications...