Maëva L'Hôtellier
@maevalhotellier.bsky.social
180 followers 160 following 14 posts
Studying learning and decision-making in humans | HRL team - ENS Ulm |
Reposted by Maëva L'Hôtellier
biorxiv-neursci.bsky.social
Interoception vs. Exteroception: Cardiac interoception competes with tactile perception, yet also facilitates self-relevance encoding https://www.biorxiv.org/content/10.1101/2025.06.25.660685v1
Reposted by Maëva L'Hôtellier
stepalminteri.bsky.social
Lucky for you, lazy people at #RLDM2025, two of the best posters have apparently been put side-by-side: go check @maevalhotellier.bsky.social and @constancedestais.bsky.social posters!
Reposted by Maëva L'Hôtellier
romanececchi.bsky.social
🧵 New preprint out!
📄 "Elucidating attentional mechanisms underlying value normalization in human reinforcement learning"
👁️ We show that visual attention during learning causally shapes how values are encoded
w/ @sgluth.bsky.social & @stepalminteri.bsky.social
🔗 doi.org/10.31234/osf...
Reposted by Maëva L'Hôtellier
fabiencerrotti.bsky.social
🚨 New preprint on bioRxiv!

We investigated how the brain supports forward planning & structure learning during multi-step decision-making using fMRI 🧠

With A. Salvador, S. Hamroun, @mael-lebreton.bsky.social & @stepalminteri.bsky.social

📄 Preprint: submit.biorxiv.org/submission/p...
Reposted by Maëva L'Hôtellier
alishiravand.bsky.social
🎉 I'm excited to share that 2 of our papers got accepted to #RLDM2025!

📄 NORMARL: A multi-agent RL framework for adaptive social norms & sustainability.
📄 Selective Attention: When attention helps vs. hinders learning under uncertainty.

Grateful to my amazing co-authors! *-*
Reposted by Maëva L'Hôtellier
thecharleywu.bsky.social
🚨 Finally out! My new @annualreviews.bsky.social in Psychology paper:
www.annualreviews.org/content/jour...
We unpack why psych theories of generalization keep cycling between rigid rule-based models and flexible similarity-based ones, culminating in Bayesian hybrids. Let's break it down 👉 🧵
Reposted by Maëva L'Hôtellier
stepalminteri.bsky.social
Epistemic biases in human reinforcement learning: behavioral evidence, computational characterization, normative status and possible applications.

A quite self-centered review, but with a broad introduction and conclusions and very cool figures.

A few main takes will follow

osf.io/preprints/ps...
maevalhotellier.bsky.social
Questions or thoughts? Let’s discuss!
Reach out — we’d love to hear from you! 🙌
maevalhotellier.bsky.social
Why does it matter? 🤔
Our work aims to bridge cognitive science and machine learning, showing how human-inspired principles like reward normalization can improve reinforcement learning AI systems!
maevalhotellier.bsky.social
What about Deep Decision Trees? 🌳
We further extend the RA model by integrating a temporal difference component into the dynamic range updates. With this extension, we demonstrate that the magnitude-invariance capabilities of the RA model persist in multi-step tasks.
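One way to picture this extension (a hedged sketch only: the exact update rule is defined in the preprint; the per-state range variables, learning rates, and bootstrapping scheme below are illustrative assumptions, not the authors' code):

```python
import numpy as np

# Illustrative sketch of range-normalized TD learning in a multi-step task.
# Q stores normalized values in [0, 1]; R_min / R_max track a per-state reward range.
n_states, n_actions = 5, 2
alpha, alpha_R, gamma = 0.3, 0.1, 0.9
Q = np.zeros((n_states, n_actions))
R_max = np.full(n_states, 1e-6)
R_min = np.zeros(n_states)

def td_range_update(s, a, r, s_next, done):
    # TD target in raw reward units, bootstrapping from the next state's (denormalized) value
    v_next = 0.0 if done else R_min[s_next] + Q[s_next].max() * (R_max[s_next] - R_min[s_next])
    target = r + gamma * v_next
    # dynamic range updates: stretch the bounds toward surprising targets
    if target > R_max[s]:
        R_max[s] += alpha_R * (target - R_max[s])
    if target < R_min[s]:
        R_min[s] += alpha_R * (target - R_min[s])
    # delta rule on the range-normalized target, so Q stays on the same scale across tasks
    norm_target = (target - R_min[s]) / (R_max[s] - R_min[s] + 1e-8)
    Q[s, a] += alpha * (norm_target - Q[s, a])
```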
maevalhotellier.bsky.social
With this enhanced model, we generalize the main findings to other bandit settings: The dynamic RA model outperforms the ABS model in several bandit tasks with noisy outcomes, non-stationary rewards, and even multiple options.
maevalhotellier.bsky.social
Once these basic properties are demonstrated in a simplified set-up, we enhance the RA model to cope successfully with stochastic and volatile environments by dynamically adjusting its internal range variables (Rmax / Rmin).
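A minimal sketch of what "dynamically adjusting Rmax / Rmin" can look like (the function name, learning rate, and specific delta rule are illustrative assumptions, not the model's exact definition):

```python
def update_range(r, r_min, r_max, alpha_R=0.1):
    """Delta-rule tracking of the internal range variables (illustrative sketch)."""
    if r > r_max:
        r_max += alpha_R * (r - r_max)   # stretch the upper bound toward unexpectedly high rewards
    if r < r_min:
        r_min += alpha_R * (r - r_min)   # stretch the lower bound toward unexpectedly low rewards
    return r_min, r_max
```

Because the bounds move gradually rather than jumping to the sample extremes, a single noisy outcome does not blow up the range; handling volatility presumably also requires letting the bounds relax back toward recent rewards, which this minimal sketch omits.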
maevalhotellier.bsky.social
In contrast, the RA model, by constraining all rewards to a similar scale, efficiently balances exploration and exploitation without the need for task-specific adjustment!
maevalhotellier.bsky.social
Crucially, modifying the softmax temperature (𝛽) does not solve the standard model's problem: it simply shifts the peak performance along the magnitude axis.
Thus, to achieve high performance, the ABS model requires tuning the 𝛽 value to the magnitudes at stake.
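The reason retuning 𝛽 cannot fix scale-dependence is visible directly in the softmax choice rule P(a) ∝ exp(𝛽·Q(a)): multiplying all rewards (and hence all learned Q-values) by k is indistinguishable from multiplying 𝛽 by k, so any fixed 𝛽 is only calibrated for one magnitude regime. A quick illustration (the option values and 𝛽 are made up for the example):

```python
import numpy as np

def softmax(q, beta):
    z = beta * np.asarray(q, dtype=float)
    z -= z.max()                 # numerical stability
    p = np.exp(z)
    return p / p.sum()

# same two options, same beta, three reward scales (illustrative values)
for scale in (0.01, 1.0, 100.0):
    print(scale, softmax(np.array([0.5, 1.0]) * scale, beta=3.0))
# tiny rewards  -> choice probabilities near 0.5 / 0.5 (over-exploration)
# large rewards -> probabilities near 0 / 1 from the first trials (over-exploitation)
```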
maevalhotellier.bsky.social
Agent-Level Insights: ABS performance drops to chance because of over-exploration with small rewards and over-exploitation with large rewards.
In contrast, the RA model maintains a consistent, scale-invariant performance.
maevalhotellier.bsky.social
First, we simulate ABS and RA behavior in bandit tasks with various magnitude and discriminability levels.

As expected, the standard model's performance is highly dependent on the task levels, while the RA model achieves high accuracy over the whole range of values tested!
maevalhotellier.bsky.social
To avoid magnitude-dependence, we propose the Range-Adapted (RA) model: RA normalizes rewards, enabling consistent representation of subjective values within a constrained space, independent of reward magnitude.
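In code, the core normalization step is tiny (a minimal sketch; the names and the eps guard are mine, and the full model with its range-update rules is in the preprint):

```python
def range_normalize(r, r_min, r_max, eps=1e-8):
    """Map a raw reward onto [0, 1] using the currently tracked reward range."""
    return (r - r_min) / (r_max - r_min + eps)

# The delta rule then operates on normalized rewards, so learned values
# live on the same [0, 1] scale whatever the task's reward magnitude:
# Q[a] += alpha * (range_normalize(r, r_min, r_max) - Q[a])
```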
maevalhotellier.bsky.social
Standard reinforcement learning algorithms encode rewards in an unbiased, absolute manner (ABS), which makes their performance magnitude-dependent.
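For comparison, a minimal sketch of the absolute (ABS) update, i.e. the standard delta rule (variable names are illustrative):

```python
def abs_update(q, r, alpha=0.3):
    """Standard (ABS) delta rule: values are learned in raw reward units."""
    return q + alpha * (r - q)

# Learned values end up near the raw reward magnitude (0.01 vs. 100.0),
# so a softmax policy with a fixed temperature behaves very differently
# across tasks that differ only in reward scale.
```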
maevalhotellier.bsky.social
This work was done in collaboration with Jérémy Pérez, under the supervision of @stepalminteri.bsky.social 👥

Let's now dive into the study!
maevalhotellier.bsky.social
New preprint! 🚨

The performance of standard reinforcement learning (RL) algorithms depends on the scale of the rewards they aim to maximize.
Inspired by human cognitive processes, we leverage a cognitive bias to develop scale-invariant RL algorithms: reward range normalization.
Curious? Have a read!👇
Reposted by Maëva L'Hôtellier
stepalminteri.bsky.social
🚨New preprint alert!🚨

Achieving Scale-Invariant Reinforcement Learning Performance with Reward Range Normalization.

Where we show that things we discover in psychology can be useful for machine learning.

By the amazing
@maevalhotellier.bsky.social and Jeremy Perez.
doi.org/10.31234/osf...