Lucas Alegre
@lnalegre.bsky.social
2K followers 170 following 30 posts
Professor at INF - @ufrgs.br | Ph.D. in Computer Science. I am interested in multi-policy reinforcement learning (RL) algorithms. Personal page: https://lucasalegre.github.io
Pinned
lnalegre.bsky.social
It is really cool to see our work on multi-step GPI being cited in this amazing survey! :)

proceedings.neurips.cc/paper_files/...
lnalegre.bsky.social
On average my scores are good, but it has happened to me before that 3 out of 4 reviewers accepted the paper and the single negative reviewer convinced the AC to reject it.
lnalegre.bsky.social
And now I got the classic rebuttal response:

"I have no concerns with the paper, all the theory is great, but since you did not run experiments in expensive domains with image-based environments, I will not increase my score".

The goal of experiments is to validate the claims! Not to beat Atari!
lnalegre.bsky.social
I got the classic NeurIPS review: "Why did you not compare with [completely unrelated method whose comparison would not help support any of the paper's claims]?"

Now I'm debating whether I should spend my weekend running this useless experiment or argue with the reviewer.
lnalegre.bsky.social
Finally, reporting only IQM may compromise scientific transparency and fairness, as it can mask poor or unstable performance. Agarwal et al. (2021), who introduced IQM in this context, recommend using it in conjunction with other statistics rather than as a standalone measure.
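
To make that concrete, here is a minimal sketch (my own illustration using numpy/scipy rather than the paper's rliable library; the per-run returns are made up) of how IQM alone can hide both a failed seed and an outlier that other statistics reveal:

import numpy as np
from scipy import stats

# hypothetical returns over 8 seeds: one failed run (0.05), one outlier (3.00)
returns = np.array([0.05, 0.78, 0.80, 0.82, 0.85, 0.88, 0.90, 3.00])

# IQM = mean of the middle 50% of runs (trim 25% from each tail)
iqm = stats.trim_mean(returns, proportiontocut=0.25)

print(f"IQM     = {iqm:.2f}")  # ~0.84: looks clean and stable
print(f"mean    = {returns.mean():.2f}, std = {returns.std():.2f}")
print(f"min/max = {returns.min():.2f} / {returns.max():.2f}")  # the runs IQM never sees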
lnalegre.bsky.social
Yes, Interquartile Mean (IQM) is a robust statistic that reduces the influence of outliers. But it does not by itself provide a clear and fair analysis of performance. In particular, IQM does not capture the full distribution of returns and may hide important information about variability and risk.
lnalegre.bsky.social
While I really like the paper "Deep Reinforcement Learning at the Edge of the Statistical Precipice" (openreview.net/forum?id=uqv...), I have seen papers that evaluate performance using only the IQM metric and claim, citing this paper, that it is a fairer metric than the mean. That claim is simply wrong.
Deep Reinforcement Learning at the Edge of the Statistical Precipice
Our findings call for a change in how we report performance on benchmarks when using only a few runs, for which we present more reliable protocols accompanied with an open-source library.
lnalegre.bsky.social
This work was done during my time as an intern at Disney Research Zürich. It was amazing and really fun to develop this idea with the Robotics Team!
lnalegre.bsky.social
Check out AMOR now on arXiv:

Paper: arxiv.org/abs/2505.23708
Full Video: youtube.com/watch?v=gQid...

#SIGGRAPH2025 #RL #robotics
lnalegre.bsky.social
A base policy with uniform weights might fail on challenging motions, but with a few weight tweaks, it nails them. Like this double spin. 🌀😵‍💫

Curious how tuning weights mid-motion can help close the sim-to-real gap and unlock dynamic, expressive behaviors?
lnalegre.bsky.social
AMOR trains a single policy conditioned on reward weights and motion context, letting you fine-tune the reward after training.
Want smoother motions? Better accuracy? Just adjust the weights — no retraining needed!
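
For intuition, here is a hypothetical sketch of the conditioning idea (my own illustration, not the actual AMOR architecture or training code): the policy simply receives the reward-weight vector as an extra input, so changing the trade-off at test time is just changing that input.

import torch
import torch.nn as nn

class WeightConditionedPolicy(nn.Module):
    # Sketch only: a policy network conditioned on the reward-weight vector,
    # so the reward trade-off can be changed at inference without retraining.
    def __init__(self, obs_dim, weight_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + weight_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, obs, reward_weights):
        return self.net(torch.cat([obs, reward_weights], dim=-1))

# e.g., shift weight from tracking accuracy toward smoothness at test time:
# action = policy(obs, reward_weights=torch.tensor([0.3, 0.7]))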
lnalegre.bsky.social
We are excited to share our #SIGGRAPH2025 paper,

“AMOR: Adaptive Character Control through Multi-Objective Reinforcement Learning”!
Lucas Alegre*, Agon Serifi*, Ruben Grandia, David Müller, Espen Knoop, Moritz Baecher
lnalegre.bsky.social
Thank you, Peter! :)
lnalegre.bsky.social
I'm really glad to have been selected as one of the ICML 2025 Top Reviewers!

Too bad I won't be able to go since my last submission was not accepted, even with scores Accept, Accept, Weak Accept, and Weak Reject 🫠
lnalegre.bsky.social
Last week, I was at @khipu-ai.bsky.social in Santiago, Chile. It was really amazing to see so many great speakers and researchers from Latin America together!
lnalegre.bsky.social
Finally, I would like to thank my advisors, Prof. Ana Bazzan and Prof. Bruno C. da Silva; Prof. Ann Nowé, who hosted me at VUB during my PhD stay; and Disney Research Zürich, where I interned.

I am very grateful to everyone with whom I had the chance to collaborate on all these amazing projects! 💙
lnalegre.bsky.social
I believe all these contributions open up many interesting directions for multi-policy RL methods, especially in transfer learning (SFs&GPI) and multi-objective RL settings! 🚀
lnalegre.bsky.social
Besides the theoretical and algorithmic contributions, we also introduced an open-source toolkit for MORL research!

NeurIPS D&B 2023 Paper - openreview.net/pdf?id=jfwRL...
lnalegre.bsky.social
Next, we further explored how to leverage approximate models of the environment to improve zero-shot policy transfer. Our method, ℎ-GPI, interpolates between model-free GPI and fully model-based planning as a function of the planning horizon ℎ.

NeurIPS 2023 Paper - openreview.net/pdf?id=KFj0Q...
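
In pseudocode, the interpolation looks roughly like this (my own sketch with hypothetical interfaces model_step and q_functions, assuming a deterministic model and discrete actions; not the paper's implementation):

def h_gpi_action(state, actions, model_step, q_functions, h, gamma=0.99):
    # model_step(s, a) -> (reward, next_state): the approximate learned model.
    # q_functions: action-value functions of the previously learned policies.
    def q_value(s, a, depth):
        if depth == 0:
            # GPI bootstrap at the leaf: best known value over the policy set
            return max(q(s, a) for q in q_functions)
        # one more step of model-based lookahead
        r, s2 = model_step(s, a)
        return r + gamma * max(q_value(s2, a2, depth - 1) for a2 in actions)
    # h = 0 reduces to standard (model-free) GPI action selection;
    # increasing h plans further with the model before bootstrapping.
    return max(actions, key=lambda a: q_value(state, a, h))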