Maëva L'Hôtellier
@maevalhotellier.bsky.social
180 followers 160 following 14 posts
Studying learning and decision-making in humans | HRL team - ENS Ulm |
Reposted by Maëva L'Hôtellier
biorxiv-neursci.bsky.social
Interoception vs. Exteroception: Cardiac interoception competes with tactile perception, yet also facilitates self-relevance encoding https://www.biorxiv.org/content/10.1101/2025.06.25.660685v1
Reposted by Maëva L'Hôtellier
stepalminteri.bsky.social
Lucky for you, lazy people at #RLDM2025, two of the best posters have apparently been put side-by-side: go check @maevalhotellier.bsky.social and @constancedestais.bsky.social posters!
Reposted by Maëva L'Hôtellier
romanececchi.bsky.social
🧵 New preprint out!
📄 "Elucidating attentional mechanisms underlying value normalization in human reinforcement learning"
👁️ We show that visual attention during learning causally shapes how values are encoded
w/ @sgluth.bsky.social & @stepalminteri.bsky.social
🔗 doi.org/10.31234/osf...
Reposted by Maëva L'Hôtellier
fabiencerrotti.bsky.social
🚨 New preprint on bioRxiv!

We investigated how the brain supports forward planning & structure learning during multi-step decision-making using fMRI 🧠

With A. Salvador, S. Hamroun, @mael-lebreton.bsky.social & @stepalminteri.bsky.social

📄 Preprint: submit.biorxiv.org/submission/p...
Reposted by Maëva L'Hôtellier
alishiravand.bsky.social
🎉 I'm excited to share that 2 of our papers got accepted to #RLDM2025!

📄 NORMARL: A multi-agent RL framework for adaptive social norms & sustainability.
📄 Selective Attention: When attention helps vs. hinders learning under uncertainty.

Grateful to my amazing co-authors! *-*
Reposted by Maëva L'Hôtellier
thecharleywu.bsky.social
🚨 Finally out! My new @annualreviews.bsky.social in Psychology paper:
www.annualreviews.org/content/jour...
We unpack why psych theories of generalization keep cycling between rigid rule-based models and flexible similarity-based ones, culminating in Bayesian hybrids. Let's break it down 👉 🧵
Reposted by Maëva L'Hôtellier
stepalminteri.bsky.social
Epistemic biases in human reinforcement learning: behavioral evidence, computational characterization, normative status and possible applications.

A quite self-centered review, but with a broad introduction and conclusions and very cool figures.

A few main takes will follow

osf.io/preprints/ps...
maevalhotellier.bsky.social
Questions or thoughts? Let’s discuss!
Reach out — we’d love to hear from you! 🙌
maevalhotellier.bsky.social
Why does it matter? 🤔
Our work aims to bridge cognitive science and machine learning, showing how human-inspired principles like reward normalization can improve reinforcement learning AI systems!
maevalhotellier.bsky.social
What about Deep Decision Trees? 🌳
We further extend the RA model by integrating a temporal difference component into the dynamic range updates. With this extension, we demonstrate that the magnitude-invariance capabilities of the RA model persist in multi-step tasks.
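One way to picture this extension (a hedged sketch only: the exact update rule is defined in the preprint; the per-state range variables, learning rates, and bootstrapping scheme below are illustrative assumptions, not the authors' code):

```python
import numpy as np

# Illustrative sketch of range-normalized TD learning in a multi-step task.
# Q stores normalized values in [0, 1]; R_min / R_max track a per-state reward range.
n_states, n_actions = 5, 2
alpha, alpha_R, gamma = 0.3, 0.1, 0.9
Q = np.zeros((n_states, n_actions))
R_max = np.full(n_states, 1e-6)
R_min = np.zeros(n_states)

def td_range_update(s, a, r, s_next, done):
    # TD target in raw reward units, bootstrapping from the next state's (denormalized) value
    v_next = 0.0 if done else R_min[s_next] + Q[s_next].max() * (R_max[s_next] - R_min[s_next])
    target = r + gamma * v_next
    # dynamic range updates: stretch the bounds toward surprising targets
    if target > R_max[s]:
        R_max[s] += alpha_R * (target - R_max[s])
    if target < R_min[s]:
        R_min[s] += alpha_R * (target - R_min[s])
    # delta rule on the range-normalized target, so Q stays on the same scale across tasks
    norm_target = (target - R_min[s]) / (R_max[s] - R_min[s] + 1e-8)
    Q[s, a] += alpha * (norm_target - Q[s, a])
```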
maevalhotellier.bsky.social
With this enhanced model, we generalize the main findings to other bandit settings: The dynamic RA model outperforms the ABS model in several bandit tasks with noisy outcomes, non-stationary rewards, and even multiple options.
maevalhotellier.bsky.social
Once these basic properties are demonstrated in a simplified set-up, we enhance the RA model to cope successfully with stochastic and volatile environments by dynamically adjusting its internal range variables (Rmax / Rmin).
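A minimal sketch of what "dynamically adjusting Rmax / Rmin" can look like (the function name, learning rate, and specific delta rule are illustrative assumptions, not the model's exact definition):

```python
def update_range(r, r_min, r_max, alpha_R=0.1):
    """Delta-rule tracking of the internal range variables (illustrative sketch)."""
    if r > r_max:
        r_max += alpha_R * (r - r_max)   # stretch the upper bound toward unexpectedly high rewards
    if r < r_min:
        r_min += alpha_R * (r - r_min)   # stretch the lower bound toward unexpectedly low rewards
    return r_min, r_max
```

Because the bounds move gradually rather than jumping to the sample extremes, a single noisy outcome does not blow up the range; handling volatility presumably also requires letting the bounds relax back toward recent rewards, which this minimal sketch omits.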
maevalhotellier.bsky.social
In contrast, the RA model, by constraining all rewards to a similar scale, efficiently balances exploration and exploitation without the need for task-specific adjustment!
maevalhotellier.bsky.social
Crucially, modifying the softmax temperature (𝛽) does not solve the standard model's problem: it simply shifts the peak performance along the magnitude axis.
Thus, to achieve high performance, the ABS model requires tuning the 𝛽 value to the magnitudes at stake.
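The reason retuning 𝛽 cannot fix scale-dependence is visible directly in the softmax choice rule P(a) ∝ exp(𝛽·Q(a)): multiplying all rewards (and hence all learned Q-values) by k is indistinguishable from multiplying 𝛽 by k, so any fixed 𝛽 is only calibrated for one magnitude regime. A quick illustration (the option values and 𝛽 are made up for the example):

```python
import numpy as np

def softmax(q, beta):
    z = beta * np.asarray(q, dtype=float)
    z -= z.max()                 # numerical stability
    p = np.exp(z)
    return p / p.sum()

# same two options, same beta, three reward scales (illustrative values)
for scale in (0.01, 1.0, 100.0):
    print(scale, softmax(np.array([0.5, 1.0]) * scale, beta=3.0))
# tiny rewards  -> choice probabilities near 0.5 / 0.5 (over-exploration)
# large rewards -> probabilities near 0 / 1 from the first trials (over-exploitation)
```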
maevalhotellier.bsky.social
Agent-Level Insights: ABS performance drops to chance because of over-exploration with small rewards and over-exploitation with large rewards.
In contrast, the RA model maintains a consistent, scale-invariant performance.
maevalhotellier.bsky.social
First, we simulate ABS and RA behavior in bandit tasks with various magnitude and discriminability levels.

As expected, the standard model's performance is highly dependent on the task levels, while the RA model achieves high accuracy over the whole range of values tested!
maevalhotellier.bsky.social
To avoid magnitude-dependence, we propose the Range-Adapted (RA) model: RA normalizes rewards, enabling consistent representation of subjective values within a constrained space, independent of reward magnitude.
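In code, the core normalization step is tiny (a minimal sketch; the names and the eps guard are mine, and the full model with its range-update rules is in the preprint):

```python
def range_normalize(r, r_min, r_max, eps=1e-8):
    """Map a raw reward onto [0, 1] using the currently tracked reward range."""
    return (r - r_min) / (r_max - r_min + eps)

# The delta rule then operates on normalized rewards, so learned values
# live on the same [0, 1] scale whatever the task's reward magnitude:
# Q[a] += alpha * (range_normalize(r, r_min, r_max) - Q[a])
```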
maevalhotellier.bsky.social
Standard reinforcement learning algorithms encode rewards in an unbiased, absolute manner (ABS), which makes their performance magnitude-dependent.
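For comparison, a minimal sketch of the absolute (ABS) update, i.e. the standard delta rule (variable names are illustrative):

```python
def abs_update(q, r, alpha=0.3):
    """Standard (ABS) delta rule: values are learned in raw reward units."""
    return q + alpha * (r - q)

# Learned values end up near the raw reward magnitude (0.01 vs. 100.0),
# so a softmax policy with a fixed temperature behaves very differently
# across tasks that differ only in reward scale.
```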
maevalhotellier.bsky.social
This work was done in collaboration with Jérémy Pérez, under the supervision of @stepalminteri.bsky.social 👥

Let's now dive into the study!
maevalhotellier.bsky.social
New preprint! 🚨

The performance of standard reinforcement learning (RL) algorithms depends on the scale of the rewards they aim to maximize.
Inspired by human cognitive processes, we leverage a cognitive bias to develop scale-invariant RL algorithms: reward range normalization.
Curious? Have a read!👇
Reposted by Maëva L'Hôtellier
stepalminteri.bsky.social
🚨New preprint alert!🚨

Achieving Scale-Invariant Reinforcement Learning Performance with Reward Range Normalization.

Where we show that things we discover in psychology can be useful for machine learning.

By the amazing
@maevalhotellier.bsky.social and Jeremy Perez.
doi.org/10.31234/osf...