Tom Silver
@tomssilver.bsky.social
Assistant Professor @Princeton. Developing robots that plan and learn to help people. Prev: @Cornell, @MIT, @Harvard. https://tomsilver.github.io/
tomssilver.bsky.social
This week's #PaperILike is "Predictive Representations of State" (Littman et al., 2001).

A lesser-known classic that is overdue for a revival. Fans of POMDPs will enjoy it.

PDF: web.eecs.umich.edu/~baveja/Pape...
tomssilver.bsky.social
This week's #PaperILike is "Influence-Augmented Local Simulators: A Scalable Solution for Fast Deep RL in Large Networked Systems" (Suau et al., ICML 2022).

Nice work on using fast local simulators to plan & learn in large partially observed worlds.

PDF: arxiv.org/abs/2202.01534
Influence-Augmented Local Simulators: A Scalable Solution for Fast Deep RL in Large Networked Systems
Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL). The main limitation being the amount of data needed and the pace at which t...
arxiv.org
tomssilver.bsky.social
I’m doing this in my course right now. So far so good! One finding: if students try to install and use uv while a conda env is still active, bad things happen. Make sure to deactivate conda first.
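A minimal sketch of a pre-install check for the conda issue above. It assumes an active conda environment can be detected through the `CONDA_PREFIX` environment variable, which `conda activate` sets. The function name `conda_env_active` is made up for illustration and is not part of conda or uv.

```python
import os


def conda_env_active(environ=None):
    """Return True if a conda environment appears to be active.

    Assumption: an active conda env is signalled by a non-empty CONDA_PREFIX.
    """
    env = os.environ if environ is None else environ
    return bool(env.get("CONDA_PREFIX"))


if __name__ == "__main__":
    if conda_env_active():
        print("A conda env is active; run `conda deactivate` before installing uv.")
    else:
        print("No active conda env detected.")
```

Students could run this before setting up uv; it only reads the environment and changes nothing.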
tomssilver.bsky.social
This week's #PaperILike is "Optimal Interactive Learning on the Job via Facility Location Planning" (Vats et al., RSS 2025).

I always enjoy a surprising connection between one problem (COIL) and another (UFL). And I always like work by Shivam Vats!

PDF: arxiv.org/abs/2505.00490
Optimal Interactive Learning on the Job via Facility Location Planning
Collaborative robots must continually adapt to novel tasks and user preferences without overburdening the user. While prior interactive robot learning methods aim to reduce human effort, they are typi...
arxiv.org
tomssilver.bsky.social
This week's #PaperILike is "Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models" (Shi et al., ICML 2025).

I'm often asked: how might we combine ideas from hierarchical planning and VLAs? This is a good start!

PDF: arxiv.org/abs/2502.19417
Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models
Generalist robots that can perform a range of different tasks in open-world settings must be able to not only reason about the steps needed to accomplish their goals, but also process complex instruct...
arxiv.org
tomssilver.bsky.social
This week's #PaperILike is "Labeled RTDP: Improving the Convergence of Real-Time Dynamic Programming" (Bonet & Geffner, 2003).

A very clear introduction to and improvement of RTDP, an online MDP planner that we should all have in our toolkits.

PDF: ftp.cs.ucla.edu/pub/stat_ser...
tomssilver.bsky.social
This week's #PaperILike is "Learning and Executing Generalized Robot Plans" (Fikes et al., 1972).

Classic early work on learning & planning from the team behind STRIPS, A* search, and Shakey the robot (www.youtube.com/watch?v=GmU7...).

PDF: stacks.stanford.edu/file/druid:c...
Shakey: Experiments in Robot Planning and Learning (1972)
YouTube video by Stanford University Libraries
www.youtube.com
tomssilver.bsky.social
This week's #PaperILike is "Full-Order Sampling-Based MPC for Torque-Level Locomotion Control via Diffusion-Style Annealing" (Xue et al., 2024).

My favorite part is the clear running example in 2D (Fig 2 & 4). I want examples like this in my papers!

PDF: arxiv.org/abs/2409.15610
Full-Order Sampling-Based MPC for Torque-Level Locomotion Control via Diffusion-Style Annealing
Due to high dimensionality and non-convexity, real-time optimal control using full-order dynamics models for legged robots is challenging. Therefore, Nonlinear Model Predictive Control (NMPC) approach...
arxiv.org
tomssilver.bsky.social
This week's #PaperILike is "Abstraction Refinement-guided Program Synthesis for Robot Learning from Demonstrations" (Cui et al., 2025).

See also the other recent papers by the same group: exciting progress in programmatic RL with applications to robotics.

PDF: herowanzhu.github.io/roboscribe.pdf
tomssilver.bsky.social
This week's #PaperILike is "Learning Value Functions with Relational State Representations for Guiding Task-and-Motion Planning" (Kim & Shimanuki, CoRL 2019).

I especially like the focus on *representations* for supporting learning and planning.

PDF: proceedings.mlr.press/v100/kim20a/...
tomssilver.bsky.social
This week's #PaperILike is "The Utility of Temporal Abstraction in Reinforcement Learning" (Jong et al., AAMAS 2008).

My favorite underrated paper in hierarchical RL. Unpacks how options can help *or hurt* learning performance. Fun writing.

PDF: www.ifaamas.org/Proceedings/...
tomssilver.bsky.social
This week's #PaperILike is "Width and Serialization of Classical Planning Problems" (Lipovetzky & Geffner, ECAI 2012).

If you only read a few classical planning papers, this should be one! Illuminating and practically useful.

PDF: www-i6.informatik.rwth-aachen.de/~hector.geff...
tomssilver.bsky.social
This week's #PaperILike is "Stop! Planner Time: Metareasoning for Probabilistic Planning Using Learned Performance Profiles" (Budd et al., AAAI 2024).

Metareasoning is increasingly important as we continue to make progress on "reasoning."

PDF: ojs.aaai.org/index.php/AA...
Stop! Planner Time: Metareasoning for Probabilistic Planning Using Learned Performance Profiles | Proceedings of the AAAI Conference on Artificial Intelligence
ojs.aaai.org
tomssilver.bsky.social
This week's #PaperILike is "PushWorld: A benchmark for manipulation planning with tools and movable obstacles" (Kansky et al., 2023).

Fans of benchmarks like ARC will enjoy the simple mechanics and the difficult reasoning required.

PDF: arxiv.org/abs/2301.10289
PushWorld: A benchmark for manipulation planning with tools and movable obstacles
While recent advances in artificial intelligence have achieved human-level performance in environments like Starcraft and Go, many physical reasoning tasks remain challenging for modern algorithms. To...
arxiv.org
tomssilver.bsky.social
This week's #PaperILike is "Effort Level Search in Infinite Completion Trees with Application to Task-and-Motion Planning" (Toussaint et al., ICRA 2024).

Addresses the meta-reasoning challenge that is core to TAMP. Toussaint is always worth a read.

PDF: www.user.tu-berlin.de/mtoussai/24-...
tomssilver.bsky.social
This week's #PaperILike is "The Power of Resets in Online Reinforcement Learning" (Mhammedi et al., 2024).

If you're doing RL in sim, why not use the sim to its full potential? Reset to any state! (gym.Env.reset() is not all we need.)

PDF: arxiv.org/abs/2404.15417
The Power of Resets in Online Reinforcement Learning
Simulators are a pervasive tool in reinforcement learning, but most existing algorithms cannot efficiently exploit simulator access -- particularly in high-dimensional domains that require general fun...
arxiv.org
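To make "reset to any state" concrete, here is a toy sketch. It is not the algorithm from the paper. `ChainSim`, `reset_to`, and `rollout_from` are hypothetical names invented for this example, and the simulator is a made-up 1-D chain with no dependency on gym.

```python
import random


class ChainSim:
    """Toy 1-D chain simulator: states 0..n-1, reward 1 on reaching the last state."""

    def __init__(self, n=10, seed=0):
        self.n = n
        self.rng = random.Random(seed)
        self.state = 0

    def reset(self):
        """Standard reset: always back to the fixed start state."""
        self.state = 0
        return self.state

    def reset_to(self, state):
        """Hypothetical hook: put the simulator in any valid state."""
        if not 0 <= state < self.n:
            raise ValueError(f"state {state} out of range")
        self.state = state
        return self.state

    def step(self, action):
        """Apply action (-1 or +1); return (next_state, reward, done)."""
        self.state = min(max(self.state + action, 0), self.n - 1)
        done = self.state == self.n - 1
        return self.state, float(done), done


def rollout_from(sim, start, horizon):
    """Random rollout of at most `horizon` steps from an arbitrary start state."""
    sim.reset_to(start)
    total = 0.0
    for _ in range(horizon):
        _, reward, done = sim.step(sim.rng.choice([-1, 1]))
        total += reward
        if done:
            break
    return total
```

With only `reset()`, every rollout has to walk from state 0. With `reset_to`, a learner can start rollouts right next to the goal, or at any state it wants more data about.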
tomssilver.bsky.social
This week's #PaperILike is "Learning over Subgoals for Efficient Navigation of Structured, Unknown Environments" (Stein et al., CoRL 2018).

A highly original combination of learning + planning that is still underrated (despite winning a CoRL award!)

PDF: proceedings.mlr.press/v87/stein18a...
tomssilver.bsky.social
This week's #PaperILike is "Long-Horizon Multi-Robot Rearrangement Planning for Construction Assembly" (Hartmann et al., TRO 2022).

Take two minutes to watch this video: www.youtube.com/watch?v=Gqho...

I don't use a lot of emojis, but 🤯

PDF: arxiv.org/abs/2106.02489
Long-Horizon Multi-Robot Rearrangement Planning for Construction Assembly
YouTube video by Valentin Hartmann
www.youtube.com
tomssilver.bsky.social
This week's #PaperILike is "Guaranteed Discovery of Control-Endogenous Latent States with Multi-Step Inverse Models" (Lamb et al., 2022).

Part of an exciting line of work: sites.google.com/view/agent-i...

This one has an awesome related work section.

PDF: arxiv.org/abs/2207.08229
tomssilver.bsky.social
This week's #PaperILike is "Grounding Language Plans in Demonstrations Through Counterfactual Perturbations" (Wang et al., ICLR 2024).

A very ideas-rich paper that combines mode-based planning, LLMs, abstraction, few-shot learning, and real robots!

PDF: arxiv.org/abs/2403.17124
Grounding Language Plans in Demonstrations Through Counterfactual Perturbations
Grounding the common-sense reasoning of Large Language Models (LLMs) in physical domains remains a pivotal yet unsolved problem for embodied AI. Whereas prior works have focused on leveraging LLMs dir...
arxiv.org
Reposted by Tom Silver
programsynthesis.bsky.social
🧵1/ New paper! 📄 InnateCoder: Learning Programmatic Options with Foundation Models

This is the final chapter of Rubens Moraes' PhD thesis at Universidade Federal de Viçosa, Brazil, in collaboration with Quazi Sadmine and Hendrik Baier.

arXiv: arxiv.org/abs/2505.12508
tomssilver.bsky.social
Did something today that I never expected to do: made a donation to Harvard!
tomssilver.bsky.social
This week's #PaperILike is "Epistemic Exploration for Generalizable Planning and Learning in Non-Stationary Settings" (Karia et al., ICAPS 2024).

A sophisticated approach to a hard & realistic problem. See also their other nice works on RMDPs.

PDF: arxiv.org/abs/2402.08145
Epistemic Exploration for Generalizable Planning and Learning in Non-Stationary Settings
This paper introduces a new approach for continual planning and model learning in relational, non-stationary stochastic environments. Such capabilities are essential for the deployment of sequential d...
arxiv.org
tomssilver.bsky.social
This week's #PaperILike is "Meta-Optimization and Program Search using Language Models for Task and Motion Planning" (Shcherba et al., 2025).

I don't often post such new papers, but I'm very excited to see more TAMP + LLM-based program synthesis.

PDF: arxiv.org/abs/2505.03725
Meta-Optimization and Program Search using Language Models for Task and Motion Planning
Intelligent interaction with the real world requires robotic agents to jointly reason over high-level plans and low-level controls. Task and motion planning (TAMP) addresses this by combining symbolic...
arxiv.org