Lightnews — Scholar-powered news

Abhishek Sharma

@abhishekshar.bsky.social

43 followers 200 following 7 posts

CS PhD @Harvard w/ Finale Doshi-Velez | Research in {Reinforcement Learning | Healthcare | Representation Learning} 🌐 https://abhishekshar.com/



Abhishek Sharma @abhishekshar.bsky.social · Jan 23

Our paper: Decision-Point Guided Safe Policy Improvement
We show that a simple approach to learning safe RL policies can outperform most offline RL methods. (+theoretical guarantees!)

How? Just allow the state-actions that have been seen enough times! 🤯

arxiv.org/abs/2410.09361

Decision-Point Guided Safe Policy Improvement

Within batch reinforcement learning, safe policy improvement (SPI) seeks to ensure that the learnt policy performs at least as well as the behavior policy that generated the dataset. The core challeng...
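The idea stated in the post, "allow the state-actions that have been seen enough times", can be illustrated with a small tabular sketch. This is not the paper's algorithm: the function name `count_thresholded_policy`, the threshold `n_min`, the dictionary-based inputs, and the toy data are all assumptions made for illustration. The sketch only picks among actions observed at least `n_min` times in a state and otherwise defers to the behavior policy.

```python
from collections import Counter


def count_thresholded_policy(dataset, q_values, behavior_action, n_min):
    """Toy count-thresholded policy (hypothetical helper, not from the paper).

    dataset:         iterable of (state, action) pairs from the batch
    q_values:        dict mapping (state, action) -> estimated return
    behavior_action: dict mapping state -> action the behavior policy takes
    n_min:           assumed minimum visit count for an action to be allowed
    """
    counts = Counter(dataset)
    policy = {}
    for state, fallback in behavior_action.items():
        # Keep only actions seen often enough in this state.
        allowed = [
            a for (s, a), n in counts.items()
            if s == state and n >= n_min and (s, a) in q_values
        ]
        if allowed:
            # Pick the best-estimated action among the well-observed ones.
            policy[state] = max(allowed, key=lambda a: q_values[(state, a)])
        else:
            # Not enough data in this state: defer to the behavior policy.
            policy[state] = fallback
    return policy


# Made-up toy batch: action "c" in s0 and action "a" in s1 look great
# on paper but are rarely observed, so they are never selected.
dataset = [("s0", "a")] * 5 + [("s0", "b")] * 4 + [("s0", "c")] + [("s1", "a")] * 2
q_values = {("s0", "a"): 1.0, ("s0", "b"): 2.0, ("s0", "c"): 9.0, ("s1", "a"): 9.0}
behavior_action = {"s0": "a", "s1": "b"}

policy = count_thresholded_policy(dataset, q_values, behavior_action, n_min=3)
print(policy)
```

In this toy batch the policy switches to "b" in s0 (seen 4 times, higher estimate than "a") and keeps the behavior action in s1, where no action clears the threshold.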


Abhishek Sharma @abhishekshar.bsky.social · Jan 23

Going to my first ever AISTATS in Thailand this year! 🎉🌴


Abhishek Sharma @abhishekshar.bsky.social · Dec 9

Wow this is amazing! Thanks for sharing!

Reposted by Abhishek Sharma

Reinforcement Learning Conference @rl-conference.bsky.social · Dec 2

The call for papers for RLC is now up! Abstract deadline of 2/14, submission deadline of 2/21!
Please help us spread the word.
rl-conference.cc/callforpaper...

RLJ | RLC Call for Papers


Reposted by Abhishek Sharma

Kempner Institute at Harvard University @kempnerinstitute.bsky.social · Dec 3

NEW: we have an exciting opportunity for a tenure-track professor at the #KempnerInstitute and the John A. Paulson School of Engineering and Applied Sciences (SEAS). Read the full description & apply today: academicpositions.harvard.edu/postings/14362
#ML #AI


Abhishek Sharma @abhishekshar.bsky.social · Nov 27

I was today years old when I realized that when people use log-sum-exp instead of the softmax function in the soft-Bellman operator, they are probably going for the mellowmax function!
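For readers who want to see the relationship between the three operators, here is a minimal sketch using the standard definitions, assuming a positive temperature parameter. Log-sum-exp and mellowmax (Asadi & Littman, 2017) differ only by a constant log(n)/omega term, where n is the number of actions. The function names and example values are illustrative assumptions, not taken from the post.

```python
import math


def log_sum_exp(values, omega=1.0):
    """(1/omega) * log(sum_i exp(omega * x_i)), shifted by the max for stability."""
    m = max(values)
    return m + math.log(sum(math.exp(omega * (v - m)) for v in values)) / omega


def mellowmax(values, omega=1.0):
    """(1/omega) * log(mean_i exp(omega * x_i)): log-sum-exp minus log(n)/omega."""
    return log_sum_exp(values, omega) - math.log(len(values)) / omega


def boltzmann_softmax(values, beta=1.0):
    """Average of the values weighted by softmax(beta * values)."""
    m = max(values)
    weights = [math.exp(beta * (v - m)) for v in values]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


q = [1.0, 2.0, 3.0]  # made-up action values for one state
print(log_sum_exp(q), mellowmax(q), boltzmann_softmax(q))
```

Log-sum-exp always sits at or above the maximum value, while mellowmax and the Boltzmann softmax stay between the minimum and the maximum of the inputs.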

Abhishek Sharma @abhishekshar.bsky.social · Nov 27

How are public comments a thing?? @iclr-conf.bsky.social

I can see why they can help get feedback from the crowds, but then why even have a double-blind process?

Abhishek Sharma @abhishekshar.bsky.social · Nov 22

The notes are great! Thank you!

Abhishek Sharma @abhishekshar.bsky.social · Nov 22

Would be cool to be included! (I work on RL in healthcare.)