Nick Tomlin
@nickatomlin.bsky.social
Incoming assistant professor at TTIC, and current PhD student at Berkeley. Natural language processing. He/him. 🌐 eecs.berkeley.edu/~nicholas_tomlin/
Pinned
nickatomlin.bsky.social
Writing my first post here to announce that I've accepted an assistant professor job at TTIC! I'll be starting in Fall 2026, and recruiting students this upcoming cycle.

Until then, I'll be wrapping up the PhD at Berkeley, and this summer I'll join NYU as a CDS Faculty Fellow 🏙️
Reposted by Nick Tomlin
ari-holtzman.bsky.social
FYI that UChicago CS & Stats is hiring at all levels via the Data Science Institute:

Postdoc: uchicago.infoready4.com#freeformComp...
Assistant Professor: apply.interfolio.com/174766
Associate Professor: apply.interfolio.com/174768
nickatomlin.bsky.social
What does it take to build a human-like user simulator? //

Jessy Lin and I wrote another blogpost on user simulators as a reward function for training interactive models, this time focused on methods + open questions:
jessylin.com/2025/09/25/u...
Reposted by Nick Tomlin
eugenevinitsky.bsky.social
Was talking to a student who wasn't sure about why one would get a PhD. So I wrote up a list of reasons!
www.eugenevinitsky.com/posts/reason...
Reposted by Nick Tomlin
eugenevinitsky.bsky.social
An excellent blog post about a gap that is still huge: models of humans you can actually use to study human-AI interaction. jessylin.com/2025/07/10/u...
User simulators bridge RL with real-world interaction
jessylin.com
Reposted by Nick Tomlin
tticconnect.bsky.social
We’re proud to announce three new tenure-track assistant professors joining TTIC in Fall 2026: Yossi Gandelsman, Will Merrill, and Nick Tomlin (@nickatomlin.bsky.social). Meet them here: buff.ly/JH1DFtT
nickatomlin.bsky.social
🤠🤓🙂
rdhawkins.bsky.social
Happy to announce the first workshop on Pragmatic Reasoning in Language Models — PragLM @ COLM 2025! 🎉
How do LLMs engage in pragmatic reasoning, and what core pragmatic capacities remain beyond their reach?
🌐 sites.google.com/berkeley.edu/praglm/
📅 Submit by June 23rd
nickatomlin.bsky.social
Haha main reason for using Gym was that we wanted a way to automatically evaluate models against trained RL agents. Doing the full arena-style evaluation on reasoning models gets really expensive

It also helps that current LLMs are really good at generating functional Gym code
nickatomlin.bsky.social
I think in the short term that’s reasonable, e.g., current models can play chess but they definitely can’t understand chess variants

In the long term, I suspect there’s more risk of over-optimizing to those specific games, so the hope is that our approach is a bit more future-proof
nickatomlin.bsky.social
This is a difficult benchmark: the best non-reasoning LLMs score around 9%, while the best reasoning models score around 36%. In the future, as models get stronger, we anticipate that they'll also be able to generate harder games
Results table. The best model (o1) wins about 36% of games against the RL baselines.
nickatomlin.bsky.social
We use o1 to generate natural language rulebooks for 1000 two-player games and then implement these games as Gym environments. For each game, we train baseline agents in self-play with RL and then evaluate whether LLMs can beat the RL baselines
Main paper figure showing a three-step pipeline of game description generation, implementation generation, and self-play training of RL agents
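To make the evaluation step concrete, here is a minimal sketch of scoring a candidate policy against a fixed baseline on a two-player game with a Gym-style `reset`/`step` interface. Everything in it is an assumption for illustration, not the paper's code: the toy takeaway game `TakeawayEnv` stands in for a generated game, `optimal_policy` stands in for a self-play-trained RL baseline, and `win_rate` is a hypothetical helper. It avoids importing Gym so it runs with the standard library alone.

```python
import random


class TakeawayEnv:
    """Toy two-player game (hypothetical stand-in for a generated game).

    Players alternate removing 1..max_take stones; whoever takes the
    last stone wins. The step() return value mimics the Gymnasium
    5-tuple: (observation, reward, terminated, truncated, info).
    """

    def __init__(self, stones=7, max_take=3):
        self.start, self.max_take = stones, max_take
        self.stones, self.to_move = stones, 0

    def reset(self, first_player=0):
        self.stones, self.to_move = self.start, first_player
        return self.stones

    def step(self, action):
        # Clamp illegal actions into the legal range instead of raising.
        take = max(1, min(int(action), self.max_take, self.stones))
        self.stones -= take
        terminated = self.stones == 0
        info = {"winner": self.to_move if terminated else None}
        reward = 1.0 if terminated else 0.0  # reward goes to the mover
        if not terminated:
            self.to_move = 1 - self.to_move
        return self.stones, reward, terminated, False, info


def optimal_policy(stones, max_take):
    """Baseline stand-in: leave the opponent a multiple of max_take + 1."""
    remainder = stones % (max_take + 1)
    return remainder if remainder else 1


def make_random_policy(seed):
    """Candidate stand-in: a uniformly random legal move."""
    rng = random.Random(seed)
    return lambda stones, max_take: rng.randint(1, min(max_take, stones))


def win_rate(candidate, baseline, n_games=100):
    """Fraction of games the candidate (player 0) wins, alternating who moves first."""
    env = TakeawayEnv()
    policies = (candidate, baseline)
    wins = 0
    for game in range(n_games):
        stones = env.reset(first_player=game % 2)
        done = False
        while not done:
            action = policies[env.to_move](stones, env.max_take)
            stones, _, done, _, info = env.step(action)
        wins += info["winner"] == 0
    return wins / n_games


if __name__ == "__main__":
    print(win_rate(make_random_policy(seed=0), optimal_policy))
```

In a setup like the one described, the candidate policy would wrap an LLM that reads the rulebook and the current observation, and the reported score would be this win rate aggregated over many games.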
nickatomlin.bsky.social
I'm particularly fond of this new benchmark paper we wrote, which aims to scalably evaluate whether language models can generalize to arbitrary new tasks. The core idea is to use LLMs to generate new games, and then evaluate whether LLMs can play those games

📄: arxiv.org/abs/2505.07215
Title and abstract of the paper, "Measuring General Intelligence with Generated Games"
Reposted by Nick Tomlin
kmahowald.bsky.social
I might be able to hire a postdoc for this fall in computational linguistics at UT Austin. Topics in the general LLM + cognitive space (particularly reasoning, chain of thought, LLMs + code) and LLM + linguistic space. If this could be of interest, feel free to get in touch!