Sarvesh Patil
@nagababa.bsky.social
240 followers 290 following 12 posts
Your friendly neighborhood roboticist! PhD student @cmurobotics.bsky.social Interested in Dexterous Manipulation, Democratization of Robots and Sensors, Sample Efficient RL, Soft Robotics, Causality, Multi-Agent Systems. servo97.github.io
nagababa.bsky.social
If you are not familiar with the robot-learning/RL papers, and if you have the bandwidth to go through them, I'm more than happy to share my source papers! Please let me know either way!
nagababa.bsky.social
And that some intervention is still needed for robot learning pipelines to demonstrate respectable ICL?

The graph in the 8th post of the thread makes me very curious, and I'd love to hear your thoughts on this phenomenon!
nagababa.bsky.social
i.e. converting robot joint actions into discrete cosine transform (DCT) coefficients seems to significantly improve sample efficiency and generalizability.

Why do we see this happen? Is it merely an ephemeral local minimum that we're stuck in?
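
For readers unfamiliar with the idea, here is a minimal sketch of what DCT-based action tokenization could look like. It is not the method of any specific paper: the function names (`dct_matrix`, `encode`, `decode`), the chunk length, the number of kept coefficients, and the quantization step are all assumptions chosen for illustration. A chunk of joint actions is projected onto an orthonormal DCT-II basis, only the lowest frequencies are kept, and the coefficients are rounded to integers that could serve as tokens.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis as an (n, n) matrix (rows are frequencies)."""
    k = np.arange(n)[:, None]  # frequency index
    t = np.arange(n)[None, :]  # time index
    basis = np.cos(np.pi * (t + 0.5) * k / n) * np.sqrt(2.0 / n)
    basis[0] /= np.sqrt(2.0)   # rescale the DC row so the matrix is orthonormal
    return basis


def encode(traj: np.ndarray, keep: int, step: float) -> np.ndarray:
    """Map a (T, J) joint-action chunk to (keep, J) integer tokens.

    keep and step are hypothetical hyperparameters: how many low-frequency
    coefficients to retain and how coarsely to quantize them.
    """
    coeffs = dct_matrix(traj.shape[0]) @ traj
    return np.round(coeffs[:keep] / step).astype(np.int64)


def decode(tokens: np.ndarray, horizon: int, step: float) -> np.ndarray:
    """Invert encode(): zero-pad the dropped frequencies and apply the inverse DCT."""
    coeffs = np.zeros((horizon, tokens.shape[1]))
    coeffs[: tokens.shape[0]] = tokens * step
    # The basis is orthonormal, so its inverse is its transpose.
    return dct_matrix(horizon).T @ coeffs


if __name__ == "__main__":
    # Synthetic smooth 7-joint chunk of 32 timesteps (made-up data).
    t = np.linspace(0.0, 1.0, 32)[:, None]
    traj = np.sin(np.pi * t + np.arange(7)[None, :])
    tokens = encode(traj, keep=8, step=0.01)
    recon = decode(tokens, horizon=32, step=0.01)
    print("tokens per chunk:", tokens.size, "of", traj.size, "raw values")
    print("max reconstruction error:", float(np.max(np.abs(recon - traj))))
```

In this sketch, smooth motions concentrate their energy in a few low-frequency coefficients, so the policy has fewer and less correlated targets to predict per chunk. Whether that is the actual reason for the reported gains is exactly the open question above.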
nagababa.bsky.social
However, when SOTA papers use transformers to learn policies and Q-functions in RL, the observation seems to be that "distributional RL works better for offline RL tasks", or that frequency-space action tokenization helps.
nagababa.bsky.social
Hi Daniel, great post!
I'm curious what you think about continuous-control use cases. From my very quick read of the post (not the paper itself), it seems that ICL emerges as a property of the model being able to handle a diversity of continuous-variable regression.

1/n
Reposted by Sarvesh Patil
gokul.dev
It was a dream come true to teach the course I wish existed at the start of my PhD. We built up the algorithmic foundations of modern-day RL, imitation learning, and RLHF, going deeper than the usual "grab bag of tricks". All 25 lectures + 150 pages of notes are now public!
Reposted by Sarvesh Patil
cjdamato.bsky.social
I have a draft of my introduction to cooperative multi-agent reinforcement learning on arxiv. Check it out and let me know any feedback you have. The plan is to polish and extend the material into a more comprehensive text with Frans Oliehoek.

arxiv.org/abs/2405.06161
A First Introduction to Cooperative Multi-Agent Reinforcement Learning
Multi-agent reinforcement learning (MARL) has exploded in popularity in recent years. While numerous approaches have been developed, they can be broadly categorized into three main types: centralized ...
nagababa.bsky.social
Store it on an old school magnetic HDD!
You only gotta power it on every decade or so to maintain the hardware.
nagababa.bsky.social
RL as a refinement tool has been used in dexterous manipulation for some time!
It used to be quite hard to do tabula rasa learning for dexterous manipulation. And still is, for the most part!
Reposted by Sarvesh Patil
pcastr.bsky.social
There are lots of people who've influenced AI but haven't won Nobel prizes.
I discuss a tiny sliver of them in this parody of @billyjoelofficial.bsky.social's "We didn't start the fire"...
Enjoy!

youtube.com/shorts/qDSYA...
We Didn't Win A Nobel (Billy Joel Parody)
YouTube video by MUSICODE
nagababa.bsky.social
Hard agree. Although, how do you reconcile that with writing a paper?

Like what do you think is a reasonable process to pick "baselines"?

I've seen some students get jaded cos reviewers ask them to incl "baselines" with shitty git implementations :(
Reposted by Sarvesh Patil
skiandsolve.bsky.social
If you are into ML theory (RL or not) with a proven track record, and you are interested in an industry research position, PM me. Feel free to spread the word.
nagababa.bsky.social
Hi! Can you please add me to the list?
Thanks for making it!
Reposted by Sarvesh Patil
ruiqigao.bsky.social
A common question nowadays: Which is better, diffusion or flow matching? 🤔

Our answer: They’re two sides of the same coin. We wrote a blog post to show how diffusion models and Gaussian flow matching are equivalent. That’s great: It means you can use them interchangeably.
Reposted by Sarvesh Patil
mathurinmassias.bsky.social
Anne Gagneux, Ségolène Martin, @quentinbertrand.bsky.social, Remi Emonet and I wrote a tutorial blog post on flow matching: dl.heeere.com/conditional-... with lots of illustrations and intuition!

We got this idea after their cool work on improving Plug and Play with FM: arxiv.org/abs/2410.02423
nagababa.bsky.social
Intro Post
Hello World!
I'm a 2nd-year Robotics PhD student at CMU, working on distributed dexterous manipulation, accessible soft robots and sensors, sample-efficient robot learning, and causal inference.

Here are my cute robots:
PS: Videos are old and sped up. They move slower in the real world :3