Nico Bohlinger
@nicobohlinger.bsky.social
35 followers 36 following 21 posts
26 | Morphology-aware Robotics, RL Research | PhD student at @ias-tudarmstadt.bsky.social
nicobohlinger.bsky.social
If you are interested in massive multi-embodiment learning, come and chat with me at:
- Today: WS Sim-to-Real Transfer for Humanoid Robots at Humanoids2025
- Oct 20th: WS Foundation Models for Robotic Design at IROS2025
- Oct 24th: WS Reconfigurable Modular Robots at IROS2025
nicobohlinger.bsky.social
⚡️ Can one unified policy control 10 million different robots and zero-shot transfer to completely unseen robots, even humanoids?

🔗 Yes! Check out our paper: arxiv.org/abs/2509.02815
nicobohlinger.bsky.social
👏 Huge thanks to everyone involved:
Bo Ai, Liu Dai, Dichen Li, Tongzhou Mu, Zhanxin Wu, K. Fay, Henrik I. Christensen, @jan-peters.bsky.social and Hao Su
nicobohlinger.bsky.social
🇰🇷 Conferences are about finally meeting your collaborators from all around the world!

Check out our work on Embodiment Scaling Laws @CoRL2025
We investigate cross-embodiment learning as the next axis of scaling for truly generalist policies 📈

🔗 All details: embodiment-scaling-laws.github.io
Reposted by Nico Bohlinger
nicobohlinger.bsky.social
Or come to my talk @ International Symposium on Adaptive Motion of Animals and Machines and LokoAssist Symposium (AMAM) on Friday at TU Darmstadt

Thanks to @ias-tudarmstadt.bsky.social, @jan-peters.bsky.social
nicobohlinger.bsky.social
If you want to know how to create a neural network architecture to train one policy to control any robot embodiment, check out: nico-bohlinger.github.io/one_policy_t...
One Policy to Run Them All
nico-bohlinger.github.io
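The core idea of such an architecture can be sketched in a few lines. This is a hedged illustration, not the paper's actual network: a shared per-joint encoder, permutation-invariant pooling into a global context, and a shared per-joint decoder, so the same weights produce one action per joint for any number of joints. All sizes and the random initialization are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only: each joint contributes a D_OBS-dim observation.
D_OBS, D_HID = 6, 32
W_enc = rng.normal(0.0, 0.1, (D_OBS, D_HID))       # shared per-joint encoder
W_dec = rng.normal(0.0, 0.1, (2 * D_HID, 1))       # shared per-joint decoder

def policy(joint_obs):
    """joint_obs: (num_joints, D_OBS) -- works for any num_joints."""
    h = np.tanh(joint_obs @ W_enc)            # encode every joint with the same weights
    g = h.mean(axis=0, keepdims=True)         # permutation-invariant global pooling
    g = np.repeat(g, h.shape[0], axis=0)      # broadcast global context back to joints
    return np.tanh(np.hstack([h, g]) @ W_dec)[:, 0]   # one action per joint

a_quad = policy(rng.normal(size=(12, D_OBS)))   # e.g. a 12-joint quadruped
a_huma = policy(rng.normal(size=(23, D_OBS)))   # e.g. a 23-joint humanoid
```

The same parameter set handles both robots; only the number of per-joint rows changes.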
nicobohlinger.bsky.social
Robot Randomization is fun!
nicobohlinger.bsky.social
⚙️ Architecture matters
We also explored architectures like Universal Neural Functionals (UNF) and action-based representations ("Probing").
And yes, our scaled EPVFs are competitive with PPO and SAC in their final performance.
nicobohlinger.bsky.social
⚡ But it's not just about size
Key ingredients for stability and performance are weight clipping and using uniform noise scaled to the parameter magnitudes.
Our ablation studies show just how critical these components are. Without them, performance collapses.
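The two ingredients above can be sketched as follows. This is an illustrative toy, with made-up constants rather than the paper's values: a hard clip keeps every policy weight in a bounded box, and exploration noise is uniform with a width proportional to each parameter's magnitude.

```python
import numpy as np

rng = np.random.default_rng(0)

CLIP = 0.5   # illustrative hard bound on every policy weight

def clip_weights(theta, clip=CLIP):
    # weight clipping: keep parameters in a bounded box, so the value
    # function is only queried in a region it has seen data for
    return np.clip(theta, -clip, clip)

def perturb(theta, scale=0.05):
    # uniform noise whose width follows each parameter's magnitude,
    # so large and small weights are explored proportionally
    width = scale * (np.abs(theta) + 1e-2)   # small floor for zero weights
    return theta + rng.uniform(-width, width)

theta = np.array([0.4, -2.0, 0.0])
theta = clip_weights(perturb(theta))
```

Without the clip, the out-of-range weight (-2.0 here) would drag the value-function queries far outside the training distribution.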
nicobohlinger.bsky.social
📈 Massive Scaling Pays Off
We see strong scaling effects when using MJX to roll out up to 4000 differently perturbed policies in parallel.
This explores the policy space effectively, and the large batches drastically reduce the variance of the resulting gradients.
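The variance-reduction effect can be demonstrated with a small numerical toy (this is not the EPVF estimator itself, just a simplified baseline-subtracted parameter-space gradient estimate on a synthetic return): the spread of the estimate shrinks roughly with the square root of the number of parallel perturbations.

```python
import numpy as np

rng = np.random.default_rng(0)

def returns_batch(thetas):
    # stand-in for N simulated rollouts (e.g. MJX environments in parallel)
    return -np.sum(thetas ** 2, axis=1)

def grad_estimate(theta, n, scale=0.1):
    # evaluate n uniformly perturbed copies of the policy parameters and
    # form a baseline-subtracted gradient estimate in parameter space
    noise = rng.uniform(-scale, scale, size=(n, theta.size))
    r = returns_batch(theta + noise)
    return (noise * (r - r.mean())[:, None]).sum(axis=0) / (n * scale**2 / 3)

theta = np.ones(8)
# empirical std of the first gradient component over repeated estimates
small = np.std([grad_estimate(theta, 40)[0] for _ in range(200)])
large = np.std([grad_estimate(theta, 4000)[0] for _ in range(200)])
```

With 100x more parallel perturbations, the estimate's standard deviation drops by roughly 10x.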
nicobohlinger.bsky.social
🧠 Simple & Powerful RL
This unlocks fully off-policy learning and exploration directly in policy parameter space using any policy data, and leads to probably the simplest DRL algorithm one can imagine:
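The loop really is short. A hedged sketch, assuming a toy scalar return in place of real rollouts and a linear fit in place of the neural V(θ) — all names and constants are illustrative: perturb the parameters, evaluate returns, fit V on the (parameters, return) pairs, then ascend the fitted V.

```python
import numpy as np

rng = np.random.default_rng(0)

def episode_return(theta):
    # stand-in for an environment rollout: peak return at theta = 2
    return -np.sum((theta - 2.0) ** 2)

theta = np.zeros(4)
alpha = 0.1
for _ in range(100):
    # 1) explore directly in policy-parameter space
    candidates = theta + rng.uniform(-0.1, 0.1, size=(64, theta.size))
    returns = np.array([episode_return(c) for c in candidates])
    # 2) fit a (here: linear) value function V(theta) ~ w . theta + b
    X = np.hstack([candidates, np.ones((64, 1))])
    w = np.linalg.lstsq(X, returns, rcond=None)[0][:-1]
    # 3) improve the policy by ascending the fitted V
    theta = theta + alpha * w
```

Since the (θ, return) pairs can come from any policy, nothing in this loop is on-policy.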
nicobohlinger.bsky.social
🔍 What are EPVFs?
Imagine a value function that understands the policy's parameters directly: V(θ).
This allows for direct, gradient-based policy updates:
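In symbols, the update is plain gradient ascent on the value function: θ ← θ + α ∇θ V(θ). A minimal sketch, using a toy quadratic V with a known optimum in place of a learned network (all names here are illustrative):

```python
import numpy as np

# Toy stand-in for a learned V(theta): highest value at theta_star.
theta_star = np.array([1.0, -2.0, 0.5])

def V(theta):
    return -np.sum((theta - theta_star) ** 2)

def grad_V(theta):
    # analytic gradient of the toy V; a learned V would supply this via autodiff
    return -2.0 * (theta - theta_star)

theta = np.zeros(3)
alpha = 0.1   # learning rate
for _ in range(200):
    theta = theta + alpha * grad_V(theta)   # theta <- theta + alpha * dV/dtheta
```

The policy parameters themselves are the optimization variable — no action-space exploration or policy-gradient estimator is needed.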
nicobohlinger.bsky.social
🚀 Check out our new work at @rldmdublin2025.bsky.social today at poster #16!
We're showing how to make Explicit Policy-conditioned Value Functions V(θ) (originating from Faccio & Schmidhuber) work for more complex control tasks. The secret? Massive scaling!
Reposted by Nico Bohlinger
ias-tudarmstadt.bsky.social
IAS is at RLDM 2025! We have many exciting works to share (see 👇), so come to our posters and talk to us!
nicobohlinger.bsky.social
Many thanks to my colleagues and collaborators: Daniel Palenicek, Łukasz Antczak, @jan-peters.bsky.social and most importantly Jonathan Kinzel (@ibims1jfk.bsky.social), who interned at MAB Robotics and did the experiments.
Also thanks to MAB Robotics for providing the hardware and constant support!
nicobohlinger.bsky.social
We build on the efficient CrossQ DRL algorithm and combine it with two control architectures — Joint Target Prediction for agile maneuvers and Central Pattern Generators for stable, natural gaits — to train locomotion policies directly on the HoneyBadger quadruped robot from MAB Robotics.
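To make the CPG side concrete, here is a hedged minimal sketch — one phase oscillator driving a trot gait, emitting joint-position targets that a learned policy could modulate. The class, frequencies, amplitudes, and leg ordering are all illustrative, not the HoneyBadger's actual controller.

```python
import numpy as np

class TrotCPG:
    """Minimal central pattern generator for a quadruped trot (illustrative)."""

    def __init__(self, freq_hz=2.0, amp_rad=0.4):
        self.freq = freq_hz
        self.amp = amp_rad
        # legs ordered FL, FR, RL, RR; diagonal pairs share phase in a trot
        self.offsets = np.array([0.0, np.pi, np.pi, 0.0])
        self.phase = 0.0

    def step(self, dt):
        # advance the shared phase and emit one hip target per leg;
        # a learned policy would typically add residuals on top
        self.phase = (self.phase + 2 * np.pi * self.freq * dt) % (2 * np.pi)
        return self.amp * np.sin(self.phase + self.offsets)

cpg = TrotCPG()
targets = [cpg.step(dt=0.01) for _ in range(50)]   # 0.5 s of joint targets
```

The oscillator guarantees a periodic, bounded gait prior, which is what makes the resulting motion stable and natural-looking even early in training.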
nicobohlinger.bsky.social
⚡️ Do you think training robot locomotion needs large scale simulation? Think again!

We train an omnidirectional locomotion policy directly on a real quadruped in just a few minutes 🚀
Top speeds of 0.85 m/s, two different control approaches, indoor and outdoor experiments, and more! 🤖🏃‍♂️
nicobohlinger.bsky.social
Great investigation! I hope we can finally bridge the gap and combine the algorithmic advances from off-policy RL research with the large-scale on-policy RL that dominates applied robot learning