Jess Hamrick
@jhamrick.bsky.social
5.8K followers 1.6K following 210 posts
Researching planning, reasoning, and RL in LLMs @ Reflection AI. Previously: Google DeepMind, UC Berkeley, MIT. I post about: AI 🤖, flowers 🌷, parenting 👶, public transit 🚆. She/her. http://www.jesshamrick.com
jhamrick.bsky.social
Also, some people don't have mental imagery at all (aphantasia)! My conclusion based on the evidence is that we do some form of latent and/or piecemeal simulation, but it's definitely not pixel-perfect.
Reposted by Jess Hamrick
kjha02.bsky.social
Forget modeling every belief and goal! What if we represented people as following simple scripts instead (e.g., "cross the crosswalk")?

Our new paper shows AI which models others’ minds as Python code 💻 can quickly and accurately predict human behavior!

shorturl.at/siUYI...
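A minimal sketch of the "minds as scripts" idea (all names here are illustrative, not the paper's actual code): rather than inferring a full belief/goal model, a pedestrian is represented as a short program that deterministically maps situations to actions.

```python
from dataclasses import dataclass

# Hypothetical sketch of "minds as scripts": a pedestrian is modeled as a
# short routine instead of a full belief/goal model. Names are illustrative,
# not from the paper.

@dataclass
class World:
    agent_at_crosswalk: bool
    light_is_walk: bool

def crosswalk_script(world: World) -> str:
    """Follow a fixed routine to predict the pedestrian's next action."""
    if not world.agent_at_crosswalk:
        return "walk_to_crosswalk"
    if world.light_is_walk:
        return "cross"
    return "wait"

print(crosswalk_script(World(agent_at_crosswalk=True, light_is_walk=False)))  # wait
```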
jhamrick.bsky.social
This is so so cool! I tried to build an AI system to do a variation of the Finke task waaay back in... 2016 or 2017? It didn't work very well, hah. (It was a combination of Bayesian inference over the structured representation with a CNN recognition model). Amazing that LLMs are able to do this.
Reposted by Jess Hamrick
jorge-morales.bsky.social
Imagine an apple 🍎. Is your mental image more like a picture or more like a thought? In a new preprint led by Morgan McCarty—our lab's wonderful RA—we develop a new approach to this old cognitive science question and find that LLMs excel at tasks thought to be solvable only via visual imagery. 🧵
Artificial Phantasia: Evidence for Propositional Reasoning-Based Mental Imagery in Large Language Models
This study offers a novel approach for benchmarking complex cognitive behavior in artificial systems. Almost universally, Large Language Models (LLMs) perform best on tasks which may be included in th...
arxiv.org
jhamrick.bsky.social
I think your links got messed up; the paper is here: github.com/NVlabs/RLP/b...
Reposted by Jess Hamrick
sungkim.bsky.social
Nvidia's RLP (Reinforcement Learning Pretraining): an information-driven, verifier-free objective that teaches models to think before they predict

🔥+19% vs BASE on Qwen3-1.7B
🚀+35% vs BASE on Nemotron-Nano-12B
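A hedged sketch of what "information-driven, verifier-free" suggests (my reading, not NVIDIA's implementation): the model samples a short thought before predicting the next token, and is rewarded by how much the thought improves next-token log-likelihood over a no-think baseline.

```python
import math

# Hedged sketch of an information-gain reward in the spirit of RLP's
# verifier-free objective. The function and the choice of baseline are my
# assumptions, not NVIDIA's code.

def info_gain_reward(p_next_with_thought: float, p_next_without: float) -> float:
    """Reward = log p(x_t | ctx, thought) - log p(x_t | ctx)."""
    return math.log(p_next_with_thought) - math.log(p_next_without)

# A thought that lifts the next-token probability from 0.10 to 0.25 earns
# ~0.92 nats; a distracting thought earns a negative reward.
print(info_gain_reward(0.25, 0.10))
```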
Reposted by Jess Hamrick
tedunderwood.com
Resharing this, because it's proving valuable enough that I spent 10 minutes looking it up. TLDR: It's true that some famous recent papers in AI were produced in the private sector. But they *cite* lots of papers with academic authors and federal funding.
markriedl.bsky.social
Continuing to fiddle. I've discovered that data visualization is extremely susceptible to "procrastiwork". I should just stop and release my notebook.
Reposted by Jess Hamrick
natolambert.bsky.social
Nice to see another fully open, multimodal LM released! Good license, training code, pretraining data, all here.
LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training

Slowly, the community is growing.
arxiv.org/abs/2509.236...
Reposted by Jess Hamrick
ozlandclone.bsky.social
It's aster time! I have never seen so many monarchs, bumble bees and other pollinators in my yard. I had 6 monarchs on my New England aster at one time! Maybe they are on their migration. Anyway, shows how important late season natives are. 🌱 #nativeplants, #pollinators
Reposted by Jess Hamrick
dorialexander.bsky.social
I know there is a lot of competition today, but this might be the most consequential release for people training models: an in-depth exploration of full fine-tuning, LoRA, and RL efficiency by John Schulman (Thinking Machines). thinkingmachines.ai/blog/lora/
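For context, here is a minimal sketch of the standard LoRA parameterization the post is about (the common formulation, not Thinking Machines' code): freeze the base weight W and learn a low-rank update, h = W x + (alpha / r) * B A x.

```python
import torch
import torch.nn as nn

# Minimal LoRA sketch: freeze the base linear layer and train only the
# low-rank factors A and B. Standard formulation; hyperparameters are
# illustrative defaults.

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # base weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at step 0
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(512, 512), r=8)
print(layer(torch.randn(2, 512)).shape)  # torch.Size([2, 512])
```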
Reposted by Jess Hamrick
greenparty.org.uk
🎉 80,000 Green Party members!

📈 But we're not stopping there.

💚 We have no time to waste. Join the Green Party today ⤵️
Reposted by Jess Hamrick
gershbrain.bsky.social
I was part of an interesting panel discussion yesterday at an ARC event. Maybe everybody knows this already, but I was quite surprised by how "general" intelligence was conceptualized in relation to human intelligence and the ARC benchmarks.
Reposted by Jess Hamrick
alonsosilva.bsky.social
Want to visualize the response format constraints on the LLM when working in a Jupyter notebook?
Then you might be interested in my new project `litelines`.
Litelines lets you visualize the path selected by the LLM.
It supports a Pydantic schema as the response format, as well as regular expressions.
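For anyone unfamiliar with the idea, "a Pydantic schema as a response format" generally looks like the following (plain Pydantic v2; this is not litelines' actual API, which I haven't checked):

```python
from pydantic import BaseModel

# A response format declared as a Pydantic model. Constrained-decoding tools
# typically compile the model's JSON schema into token-level constraints on
# the LLM's output.

class Movie(BaseModel):
    title: str
    year: int

print(Movie.model_json_schema())

# A constrained response then validates directly back into the model:
movie = Movie.model_validate_json('{"title": "Arrival", "year": 2016}')
print(movie.year)  # 2016
```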
Reposted by Jess Hamrick
timkellogg.me
sheesh! AI bluesky has arrived

not just good content, there’s more and more original work, people from labs, and people with genuinely interesting perspectives

when i joined, it was so painful trying to find even traces
Reposted by Jess Hamrick
narphorium.com
Teaching LLMs to Plan: Logical Chain-of-Thought Instruction Tuning for Symbolic Planning
arxiv.org/abs/2509.13351
jhamrick.bsky.social
I was wondering about that too...
Reposted by Jess Hamrick
informor.bsky.social
For instructors out there: a very cool set of AI Course Policy Icons by Cornell's GenAI taskforce, to be used in combination for your syllabi or assignments.

Inspired by @creativecommons.bsky.social and available w/ a CC license. Sample icons attached here.

teaching.cornell.edu/generative-a...
[Sample icons: ANY-AI (Any Tool, Any Use, Any Time), AT (Approved Tools Only), UA (Use with Attribution), AS (Assignment-Specific), each for stating a course's GenAI use policy]
jhamrick.bsky.social
For others like me who might be unsure what this is about: www.eu-inc.org has some further details
Why the EU–INC?

Europe has the talent, ambition, and ecosystems to create innovative companies, but fragmentation between European nations is holding us back.

"A startup from California can expand and raise money all across the United States. But our companies still face way too many national barriers that make it hard to work Europa-wide, and way too much regulatory burden."

– Ursula von der Leyen, Oct 2024
Reposted by Jess Hamrick
thomwolf.bsky.social
EU–INC is the single best thing Europe could do to catch up in the AI race

A simple unified pan-European startup structure, with modern employee ownership and simple access to capital, able to tap into Europe’s full talent pool.

‼️ but it’s at high risk of not seeing the light of day. You can help👇
Reposted by Jess Hamrick
dorialexander.bsky.social
And new paper out: Pleias 1.0: the First Family of Language Models Trained on Fully Open Data

How we trained an open-everything model in a new pretraining environment, with releasable data (Common Corpus) and an open-source framework (Nanotron from Hugging Face).

www.sciencedirect.com/science/arti...
Reposted by Jess Hamrick
smcgrath.phd
This is a really good thread about forecasting too far into the future for medical AI, and what the “we should stop training doctors/lawyers” crowd is missing.

Tagging for #MedSky #MLSky
deenamousa.com
In 2016 Geoffrey Hinton said “we should stop training radiologists now” since AI would soon be better at their jobs.

He was right: models have outperformed radiologists on benchmarks for ~a decade.

Yet radiology jobs are at record highs, with an average salary of $520k.

Why?