Lightnews — Scholar-powered news

Sarvesh Patil

@nagababa.bsky.social

240 followers 290 following 12 posts

Your friendly neighborhood roboticist! PhD student @cmurobotics.bsky.social Interested in Dexterous Manipulation, Democratization of Robots and Sensors, Sample Efficient RL, Soft Robotics, Causality, Multi-Agent Systems. servo97.github.io

Sarvesh Patil @nagababa.bsky.social · Jun 28

If you are not familiar with the robot-learning/RL papers, and if you have the bandwidth to go through them, I'm more than happy to share my source papers! Please let me know either way!

Sarvesh Patil @nagababa.bsky.social · Jun 28

And that some intervention is still needed for robot learning pipelines to demonstrate respectable ICL?

The 8th thread graph makes me very curious, and I'd love to hear your thoughts on this phenomenon!

Sarvesh Patil @nagababa.bsky.social · Jun 28

i.e. converting robot joints into discrete cosine transforms seems to significantly improve sample efficiency and generalizability.

Why do we see this happen? Is that merely an ephemeral local minimum that we're stuck in?
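
For readers who have not seen the idea before, here is a minimal sketch of what "converting robot joints into discrete cosine transforms" could look like: a single joint-angle trajectory is mapped to DCT coefficients, only the lowest-frequency ones are kept, and the trajectory is reconstructed from them. This is not taken from any of the papers discussed in the thread. The trajectory, the function names (`dct2`, `idct2`, `compress`), and the choice of keeping 8 coefficients are all assumptions made for illustration.

```python
import math

def dct2(x):
    """Orthonormal DCT-II of a 1-D sequence (plain Python, O(n^2))."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for t in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (t + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out

def compress(traj, keep):
    """Keep the `keep` lowest-frequency coefficients and zero the rest."""
    coeffs = dct2(traj)
    return coeffs[:keep] + [0.0] * (len(traj) - keep)

# Hypothetical smooth joint-angle trajectory in radians, 32 timesteps.
traj = [0.5 * math.sin(2 * math.pi * t / 32) + 0.1 * t / 32 for t in range(32)]

# Reconstruct from only 8 of the 32 coefficients and measure the worst error.
approx = idct2(compress(traj, keep=8))
max_err = max(abs(a - b) for a, b in zip(traj, approx))
print(f"max reconstruction error with 8/32 coefficients: {max_err:.4f}")
```

Smooth motions concentrate most of their energy in the low-frequency coefficients, so a policy could in principle predict a handful of coefficients instead of every timestep. Whether that alone explains the reported gains is exactly the open question raised above.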

Sarvesh Patil @nagababa.bsky.social · Jun 28

However, when SOTA papers use transformers for learning policies and Q functions in RL, the observation seems to be that "Distributional RL works better for offline RL tasks", or that frequency-space action tokenization works better.

Sarvesh Patil @nagababa.bsky.social · Jun 28

Hi Daniel, great post!
I'm curious what you think about continuous-control use cases. From my very quick read of the post (not the paper itself), it seems that ICL emerges as a property of the model being able to handle a diversity of continuous-variable regression tasks.

1/n

Reposted by Sarvesh Patil

Gokul Swamy @gokul.dev · Jun 20

It was a dream come true to teach the course I wish existed at the start of my PhD. We built up the algorithmic foundations of modern-day RL, imitation learning, and RLHF, going deeper than the usual "grab bag of tricks". All 25 lectures + 150 pages of notes are now public!

Sarvesh Patil @nagababa.bsky.social · Apr 28

I like scholar-inbox
www.scholar-inbox.com

Scholar Inbox is a personal paper recommender which enables researchers to stay up-to-date with the most relevant progress in their field based on their personal research interests. Scholar Inbox is f...

Reposted by Sarvesh Patil

Chris Amato @cjdamato.bsky.social · Jan 7

I have a draft of my introduction to cooperative multi-agent reinforcement learning on arxiv. Check it out and let me know any feedback you have. The plan is to polish and extend the material into a more comprehensive text with Frans Oliehoek.

arxiv.org/abs/2405.06161

A First Introduction to Cooperative Multi-Agent Reinforcement Learning

Multi-agent reinforcement learning (MARL) has exploded in popularity in recent years. While numerous approaches have been developed, they can be broadly categorized into three main types: centralized ...

Sarvesh Patil @nagababa.bsky.social · Jan 5

Store it on an old school magnetic HDD!
You only gotta power it on every decade or so to maintain the hardware.

Sarvesh Patil @nagababa.bsky.social · Dec 28

RL as a refinement tool has been used in dexterous manipulation for some time!
It used to be quite hard to do tabula rasa learning for dexterous manipulation. And still is, for the most part!

Reposted by Sarvesh Patil

Pablo Samuel Castro @pcastr.bsky.social · Dec 23

There are lots of people who've influenced AI but haven't won Nobel prizes.
I discuss a tiny sliver of them in this parody of @billyjoelofficial.bsky.social 's "We didn't start the fire"...
Enjoy!

youtube.com/shorts/qDSYA...

We Didn't Win A Nobel (Billy Joel Parody)

YouTube video by MUSICODE

Sarvesh Patil @nagababa.bsky.social · Dec 20

Hard agree. Although how do you reconcile that with writing a paper?

Like what do you think is a reasonable process to pick "baselines"?

I've seen some students get jaded cos reviewers ask them to incl "baselines" with shitty git implementations :(

Reposted by Sarvesh Patil

Csaba Szepesvari @skiandsolve.bsky.social · Dec 19

If you are into ML theory (RL or not) with a proven track record, and you are interested in an industry research position, PM me. Feel free to spread the word.

Sarvesh Patil @nagababa.bsky.social · Dec 16

Hi! Can you please add me to the list?
Thanks for making it!

Reposted by Sarvesh Patil

ruiqigao.bsky.social @ruiqigao.bsky.social · Dec 2

A common question nowadays: Which is better, diffusion or flow matching? 🤔

Our answer: They’re two sides of the same coin. We wrote a blog post to show how diffusion models and Gaussian flow matching are equivalent. That’s great: It means you can use them interchangeably.

Sarvesh Patil @nagababa.bsky.social · Nov 30

🙋‍♂️👋

Reposted by Sarvesh Patil

Mathurin Massias @mathurinmassias.bsky.social · Nov 27

Anne Gagneux, Ségolène Martin, @quentinbertrand.bsky.social, Remi Emonet and I wrote a tutorial blog post on flow matching: dl.heeere.com/conditional-... with lots of illustrations and intuition!

We got this idea after their cool work on improving Plug and Play with FM: arxiv.org/abs/2410.02423

Sarvesh Patil @nagababa.bsky.social · Nov 23

Intro Post
Hello World!
I'm a 2nd year Robotics PhD student at CMU, working on distributed dexterous manipulation, accessible soft robots and sensors, sample efficient robot learning, and causal inference.

Here are my cute robots:
PS: Videos are old and sped up. They move slower in the real world :3