Brenden Lake
@brendenlake.bsky.social
510 followers 150 following 12 posts
Incoming Associate Professor of Computer Science and Psychology @ Princeton. Posts are my views only. https://cims.nyu.edu/~brenden/
brendenlake.bsky.social
I am also trying something new: posting our current and future directions directly on the lab website. Interested in joining us or collaborating? Get in touch! (2/2) lake-lab.github.io/apply/
brendenlake.bsky.social
Our new lab for Human & Machine Intelligence is officially open at Princeton University!

Consider applying for a PhD or Postdoc position, either through Computer Science or Psychology. You can register interest on our new website lake-lab.github.io (1/2)
brendenlake.bsky.social
Getting the lab + alums together at CogSci!
brendenlake.bsky.social
Some exciting Princeton Initiatives:

Natural and Artificial Minds
nam.ai.princeton.edu

Princeton AI Lab
ai.princeton.edu/ai-lab

Princeton Language and Intelligence
pli.princeton.edu
brendenlake.bsky.social
It's hard to leave NYU. I'll miss my incredible colleagues and the community that's meant so much over the past 8 years. NYU has become the largest hub for computational cognitive science that I know — it's been a joy and a privilege to be part of that. Thankfully, Princeton isn't too far.
brendenlake.bsky.social
I'm joining Princeton University as an Associate Professor of Computer Science and Psychology this fall! Princeton is ambitiously investing in AI and Natural & Artificial Minds, and I'm excited for my lab to contribute. Recruiting postdocs and Ph.D. students in CS and Psychology — join us!
Nassau Hall. Photo credit to Debbie and John O'Boyle
Reposted by Brenden Lake
guydav.bsky.social
Fantastic new work by @johnchen6.bsky.social (with @brendenlake.bsky.social and me trying not to cause too much trouble).

We study systematic generalization in a safety setting and find LLMs struggle to consistently respond safely when we vary how we ask naive questions. More analyses in the paper!
johnchen6.bsky.social
Do LLMs show systematic generalization of safety facts to novel scenarios?

Introducing our work SAGE-Eval, a benchmark consisting of 100+ safety facts and 10k+ scenarios to test this!

- Claude-3.7-Sonnet passes only 57% of facts evaluated
- o1 and o3-mini passed <45%! 🧵
brendenlake.bsky.social
Failures of systematic generalization in LLMs can lead to real-world safety issues.

New paper by @johnchen6.bsky.social and @guydav.bsky.social, arxiv.org/abs/2505.21828
brendenlake.bsky.social
Before LLMs, neural nets were task-specific (while humans were task-general). Shockingly, LLMs changed that. How do LLMs represent a task, and do different prompts lead to the same task rep.? Love this by @guydav.bsky.social, and the function vectors of @ericwtodd.bsky.social @davidbau.bsky.social
guydav.bsky.social
New preprint alert! We often prompt ICL tasks using either demonstrations or instructions. How much does the form of the prompt matter to the task representation formed by a language model? Stick around to find out 1/N
Reposted by Brenden Lake
markkho.bsky.social
🤔 Interested in models of social interaction and computational psychiatry?

🤗 If so, @shawnrhoadsphd.bsky.social and I are seeking a highly motivated and talented postdoc to work on these topics!

Please share widely!

apply.interfolio.com/165809
Reposted by Brenden Lake
fredcallaway.bsky.social
Despite the world being on fire, I can't help but be thrilled to announce that I'll be starting as an Assistant Professor in the Cognitive Science Program at Dartmouth in Fall '26. I'll be recruiting grad students this upcoming cycle—get in touch if you're interested!
brendenlake.bsky.social
Amazing, congratulations Fred!!
brendenlake.bsky.social
I snuck a moment with my son Logan (2.5), ever the creative goal generator, into Fig. 1: "Papa, I made a Truck Carrier Truck!"
How do people compose existing concepts to create new goals? Can models generate and understand goals too?
nature.com/articles/s4225
brendenlake.bsky.social
@solimlegris.bsky.social and Wai Keen Vong estimated that average human performance on ARC is about 64% (public eval set). Thus, o3 is clearly better than the average crowd worker tested. Note that almost all tasks were solvable by at least one person who tried it on MTurk. arxiv.org/abs/2409.01374
brendenlake.bsky.social
I'm new here. I heard bluesky is like science Twitter back in the day, and there are fewer posts from Elon Musk. Did I come to the right place?