Anastasiia Pedan
@pedanana.bsky.social
pedanana.bsky.social
my main takeaway from a talk on reward design in rl: ai only beat humans when they were asked not to collaborate 👀👀
pedanana.bsky.social
thank you, Claas, you're the best mentor I could've asked for!!!!
pedanana.bsky.social
This was an amazing collaboration with a cracked team consisting of @cvoelcker.bsky.social, me, Arash Ahmadian, Romina Abachi, @igilitschenski.bsky.social, and @sologen.bsky.social

#ReinforcementLearning #ModelBasedRL #RLTheory #ICML2025
pedanana.bsky.social
We can correct the MuZero loss and other losses from the same family by pushing the value estimates computed from different sampled model rollouts to have the correct variance and mean. We prove the soundness of this change and show that it is beneficial for agent performance 📈📈📈!
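A toy sketch of why the variance matters here. This is not the paper's loss or MuZero's implementation, just a generic squared error between a value sampled from the model and a value sampled from the true environment (all names and numbers below are made up for illustration). For independent samples the expected loss is Var(model) + Var(target) + (mean gap)^2, so a model that collapses to the mean value scores better than one that matches the true value distribution:

```python
def expected_sq_loss(model_dist, target_dist):
    """Exact E[(v_model - v_target)^2] for independent discrete distributions.

    Each distribution is a list of (value, probability) pairs.
    Hypothetical helper for illustration only.
    """
    return sum(
        pm * pt * (vm - vt) ** 2
        for vm, pm in model_dist
        for vt, pt in target_dist
    )

# Assumed toy environment: the next-state value is 0 or 2 with equal
# probability (mean 1, variance 1).
target = [(0.0, 0.5), (2.0, 0.5)]

# A model that reproduces the true value distribution exactly.
calibrated_model = [(0.0, 0.5), (2.0, 0.5)]

# A model that always predicts the mean value (zero variance).
collapsed_model = [(1.0, 1.0)]

calibrated_loss = expected_sq_loss(calibrated_model, target)
collapsed_loss = expected_sq_loss(collapsed_model, target)

print(f"calibrated model loss: {calibrated_loss}")
print(f"collapsed model loss:  {collapsed_loss}")
```

In this toy setup the collapsed model gets the lower loss even though it has the wrong variance, which is the kind of failure a variance-and-mean correction is meant to rule out.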
pedanana.bsky.social
Getting a correct value estimate is instrumental in model-based RL, so if your algorithm fails to provide correct targets for model learning, your agent is in trouble because these errors will accumulate fast 📉📉📉!
pedanana.bsky.social
Would you be surprised to learn that many empirical implementations of value-aware model learning (VAML) algos, including MuZero, lead to incorrect model & value functions when training stochastic models 🤕? In our new @icmlconf.bsky.social 2025 paper, we show why this happens and how to fix it 🦾!