Lightnews — Scholar-powered news

Tom Silver @tomssilver.bsky.social · 3d

This week's #PaperILike is "Predictive Representations of State" (Littman et al., 2001).

A lesser known classic that is overdue for a revival. Fans of POMDPs will enjoy.

PDF: web.eecs.umich.edu/~baveja/Pape...

Tom Silver @tomssilver.bsky.social · 10d

This week's #PaperILike is "Influence-Augmented Local Simulators: A Scalable Solution for Fast Deep RL in Large Networked Systems" (Suau et al., ICML 2022).

Nice work on using fast local simulators to plan & learn in large partially observed worlds.

PDF: arxiv.org/abs/2202.01534

Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL). The main limitation being the amount of data needed and the pace at which t...

Tom Silver @tomssilver.bsky.social · 17d

I’m doing this in my course right now. So far so good! One finding: if students try to install and use uv while a conda env is still active, bad things happen. Make sure to deactivate conda first.
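
A minimal sketch of a pre-flight check for the tip above. It assumes conda's usual behavior of setting the `CONDA_PREFIX` and `CONDA_DEFAULT_ENV` environment variables while an environment is active. The function name `conda_env_active` is hypothetical and is not part of uv or conda.

```python
import os


def conda_env_active(environ=None):
    """Return True if a conda environment appears to be active.

    Assumption: conda exports CONDA_PREFIX and CONDA_DEFAULT_ENV on activation.
    """
    env = os.environ if environ is None else environ
    return bool(env.get("CONDA_PREFIX") or env.get("CONDA_DEFAULT_ENV"))


if __name__ == "__main__":
    if conda_env_active():
        print("A conda env looks active; run `conda deactivate` before installing uv.")
    else:
        print("No active conda env detected.")
```

Students could run this before setting up uv. If the base environment is active, `conda deactivate` may need to be run once more.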

Tom Silver @tomssilver.bsky.social · 17d

This week's #PaperILike is "Optimal Interactive Learning on the Job via Facility Location Planning" (Vats et al., RSS 2025).

I always enjoy a surprising connection between one problem (COIL) and another (UFL). And I always like work by Shivam Vats!

PDF: arxiv.org/abs/2505.00490

Collaborative robots must continually adapt to novel tasks and user preferences without overburdening the user. While prior interactive robot learning methods aim to reduce human effort, they are typi...

Tom Silver @tomssilver.bsky.social · 24d

This week's #PaperILike is "Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models" (Shi et al., ICML 2025).

I'm often asked: how might we combine ideas from hierarchical planning and VLAs? This is a good start!

PDF: arxiv.org/abs/2502.19417

Generalist robots that can perform a range of different tasks in open-world settings must be able to not only reason about the steps needed to accomplish their goals, but also process complex instruct...

Tom Silver @tomssilver.bsky.social · Sep 7

This week's #PaperILike is "Labeled RTDP: Improving the Convergence of Real-Time Dynamic Programming" (Bonet & Geffner, 2003).

A very clear introduction to and improvement of RTDP, an online MDP planner that we should all have in our toolkits.

PDF: ftp.cs.ucla.edu/pub/stat_ser...

Tom Silver @tomssilver.bsky.social · Aug 31

This week's #PaperILike is "Learning and Executing Generalized Robot Plans" (Fikes et al., 1972).

Classic early work on learning & planning from the team behind STRIPS, A* search, and Shakey the robot (www.youtube.com/watch?v=GmU7...).

PDF: stacks.stanford.edu/file/druid:c...

Shakey: Experiments in Robot Planning and Learning (1972)

YouTube video by Stanford University Libraries

Tom Silver @tomssilver.bsky.social · Aug 24

This week's #PaperILike is "Full-Order Sampling-Based MPC for Torque-Level Locomotion Control via Diffusion-Style Annealing" (Xue et al., 2024).

My favorite part is the clear running example in 2D (Fig 2 & 4). I want examples like this in my papers!

PDF: arxiv.org/abs/2409.15610

Due to high dimensionality and non-convexity, real-time optimal control using full-order dynamics models for legged robots is challenging. Therefore, Nonlinear Model Predictive Control (NMPC) approach...

Tom Silver @tomssilver.bsky.social · Aug 17

This week's #PaperILike is "Abstraction Refinement-guided Program Synthesis for Robot Learning from Demonstrations" (Cui et al., 2025).

See also other recent papers by the same group: exciting progress in programmatic RL with applications to robotics.

PDF: herowanzhu.github.io/roboscribe.pdf

Tom Silver @tomssilver.bsky.social · Aug 10

This week's #PaperILike is "Learning Value Functions with Relational State Representations for Guiding Task-and-Motion Planning" (Kim & Shimanuki, CoRL 2019).

I especially like the focus on *representations* for supporting learning and planning.

PDF: proceedings.mlr.press/v100/kim20a/...

Tom Silver @tomssilver.bsky.social · Aug 3

This week's #PaperILike is "The Utility of Temporal Abstraction in Reinforcement Learning" (Jong et al., AAMAS 2008).

My favorite underrated paper in hierarchical RL. Unpacks how options can help *or hurt* learning performance. Fun writing.

PDF: www.ifaamas.org/Proceedings/...

Tom Silver @tomssilver.bsky.social · Jul 27

This week's #PaperILike is "Width and Serialization of Classical Planning Problems" (Lipovetzky & Geffner, ECAI 2012).

If you only read a few classical planning papers, this should be one! Illuminating and practically useful.

PDF: www-i6.informatik.rwth-aachen.de/~hector.geff...

Tom Silver @tomssilver.bsky.social · Jul 20

This week's #PaperILike is "Stop! Planner Time: Metareasoning for Probabilistic Planning Using Learned Performance Profiles" (Budd et al., AAAI 2024).

Metareasoning is increasingly important as we continue to make progress on "reasoning."

PDF: ojs.aaai.org/index.php/AA...

Tom Silver @tomssilver.bsky.social · Jul 13

This week's #PaperILike is "PushWorld: A benchmark for manipulation planning with tools and movable obstacles" (Kansky et al., 2023).

Fans of benchmarks like ARC will enjoy the simple mechanics and the difficult reasoning required.

PDF: arxiv.org/abs/2301.10289

While recent advances in artificial intelligence have achieved human-level performance in environments like Starcraft and Go, many physical reasoning tasks remain challenging for modern algorithms. To...

Tom Silver @tomssilver.bsky.social · Jul 6

This week's #PaperILike is "Effort Level Search in Infinite Completion Trees with Application to Task-and-Motion Planning" (Toussaint et al., ICRA 2024).

Addresses the meta-reasoning challenge that is core to TAMP. Toussaint is always worth a read.

PDF: www.user.tu-berlin.de/mtoussai/24-...

Tom Silver @tomssilver.bsky.social · Jun 29

This week's #PaperILike is "The Power of Resets in Online Reinforcement Learning" (Mhammedi et al., 2024).

If you're doing RL in sim, why not use the sim to its full potential? Reset to any state! (gym.Env.reset() is not all we need.)

PDF: arxiv.org/abs/2404.15417

Simulators are a pervasive tool in reinforcement learning, but most existing algorithms cannot efficiently exploit simulator access -- particularly in high-dimensional domains that require general fun...
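
To make the "reset to any state" point from the post concrete, here is a minimal sketch of a toy simulator that exposes both a standard reset and a reset to a chosen state. The class `ChainSim` and its method `reset_to` are hypothetical names for illustration. They are not part of the gym API and this is not the algorithm from the paper.

```python
class ChainSim:
    """Toy 1-D chain: states 0..length-1, reward only at the last state."""

    def __init__(self, length=10):
        self.length = length
        self.state = 0

    def reset(self):
        # Standard reset: always return to the fixed initial state.
        self.state = 0
        return self.state

    def reset_to(self, state):
        # Hypothetical extension: jump the simulator to any valid state.
        if not 0 <= state < self.length:
            raise ValueError(f"state must be in [0, {self.length - 1}]")
        self.state = state
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the state is clipped to the chain.
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


if __name__ == "__main__":
    sim = ChainSim()
    sim.reset_to(8)          # start near the goal instead of at state 0
    print(sim.step(+1))      # one step to the right reaches the rewarding state
```

With only `reset()`, an agent would need nine correct steps to see any reward. Resetting near the goal lets it gather informative experience from states that are otherwise rarely reached.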

Tom Silver @tomssilver.bsky.social · Jun 22

This week's #PaperILike is "Learning over Subgoals for Efficient Navigation of Structured, Unknown Environments" (Stein et al., CoRL 2018).

A highly original combination of learning + planning that is still underrated (despite winning a CoRL award!)

PDF: proceedings.mlr.press/v87/stein18a...

Tom Silver @tomssilver.bsky.social · Jun 15

This week's #PaperILike is "Long-Horizon Multi-Robot Rearrangement Planning for Construction Assembly" (Hartmann et al., TRO 2022).

Take two minutes to watch this video: www.youtube.com/watch?v=Gqho...

I don't use a lot of emojis, but 🤯

PDF: arxiv.org/abs/2106.02489

Long-Horizon Multi-Robot Rearrangement Planning for Construction Assembly

YouTube video by Valentin Hartmann

Tom Silver @tomssilver.bsky.social · Jun 8

This week's #PaperILike is "From Real World to Logic and Back: Learning Generalizable Relational Concepts For Long Horizon Robot Planning" (Shah et al., 2024).

A fresh & clever approach with very impressive few-shot generalization results.

PDF: arxiv.org/abs/2402.11871

Humans efficiently generalize from limited demonstrations, but robots still struggle to transfer learned knowledge to complex, unseen tasks with longer horizons and increased complexity. We propose th...

Tom Silver @tomssilver.bsky.social · Jun 1

This week's #PaperILike is "Guaranteed Discovery of Control-Endogenous Latent States with Multi-Step Inverse Models" (Lamb et al., 2022).

Part of an exciting line of work: sites.google.com/view/agent-i...

This one has an awesome related work section.

PDF: arxiv.org/abs/2207.08229

Tom Silver @tomssilver.bsky.social · May 25

This week's #PaperILike is "Grounding Language Plans in Demonstrations Through Counterfactual Perturbations" (Wang et al., ICLR 2024).

A very ideas-rich paper that combines mode-based planning, LLMs, abstraction, few-shot learning, and real robots!

PDF: arxiv.org/abs/2403.17124

Grounding the common-sense reasoning of Large Language Models (LLMs) in physical domains remains a pivotal yet unsolved problem for embodied AI. Whereas prior works have focused on leveraging LLMs dir...

Reposted by Tom Silver

Levi Lelis @programsynthesis.bsky.social · May 23

🧵1/ New paper! 📄 InnateCoder: Learning Programmatic Options with Foundation Models

This is the final chapter of Rubens Moraes' PhD thesis at Universidade Federal de Viçosa, Brazil, written in collaboration with Quazi Sadmine and Hendrik Baier.

arXiv: arxiv.org/abs/2505.12508

Tom Silver @tomssilver.bsky.social · May 23

Did something today that I never expected to do: made a donation to Harvard!

Tom Silver @tomssilver.bsky.social · May 18

This week's #PaperILike is "Epistemic Exploration for Generalizable Planning and Learning in Non-Stationary Settings" (Karia et al., ICAPS 2024).

A sophisticated approach to a hard & realistic problem. See also their other nice works on RMDPs.

PDF: arxiv.org/abs/2402.08145

This paper introduces a new approach for continual planning and model learning in relational, non-stationary stochastic environments. Such capabilities are essential for the deployment of sequential d...

Tom Silver @tomssilver.bsky.social · May 11

This week's #PaperILike is "Meta-Optimization and Program Search using Language Models for Task and Motion Planning" (Shcherba et al., 2025).

I don't often post such new papers, but I'm very excited to see more TAMP + LLM-based program synthesis.

PDF: arxiv.org/abs/2505.03725

Intelligent interaction with the real world requires robotic agents to jointly reason over high-level plans and low-level controls. Task and motion planning (TAMP) addresses this by combining symbolic...
