Lightnews — Scholar-powered news

Dane Carnegie Malenfant

@dvnxmvlhdf5.bsky.social

550 followers 460 following 79 posts

MSc. @mila-quebec.bsky.social and @mcgill.ca in the LiNC lab Fixating on multi-agent RL, Neuro-AI and decisions Ēka ē-akimiht https://danemalenfant.com/

Pinned

Dane Carnegie Malenfant @dvnxmvlhdf5.bsky.social · Aug 3

I am presenting this work at the @cocomarl-workshop.bsky.social, part of @rl-conference.bsky.social, on Tuesday (: I additionally have a generalized correction term for an arbitrary number n of agents (it is like walking a tree for the order of gradients), on which I am looking for thoughts, validation, or critiques.

Dane Carnegie Malenfant @dvnxmvlhdf5.bsky.social · Jun 5

Preprint Alert 🚀

Multi-agent reinforcement learning (MARL) often assumes that agents know when other agents cooperate with them. But for humans, this isn’t always the case. For example, Plains Indigenous groups used to leave resources for others to use at effigies called Manitokan.
1/8

Manitokan are images set up where one can bring a gift or receive a gift. 1930s Rocky Boy Reservation, Montana, Montana State University photograph. Colourized with AI

2 2 8

Dane Carnegie Malenfant @dvnxmvlhdf5.bsky.social · 1d

Here is my plan to make Bluesky more fun and active:

Reposted by Dane Carnegie Malenfant

Kempner Institute at Harvard University @kempnerinstitute.bsky.social · 1d

NEW on our #DeeperLearning blog

People balance being kind vs. being honest — and #LLMs should too.

New research shows training choices often favor informativeness over kindness, but prompting can induce sycophancy.

Read more: bit.ly/3Wqrtxl

Using Cognitive Models to Reveal Value Trade-offs in Language Models - Kempner Institute

People’s actions and words are the result of a balance of different goals. The authors use a leading cognitive model of this value trade-off in polite speech to systematically examine […]

1 3 9

Dane Carnegie Malenfant @dvnxmvlhdf5.bsky.social · 1d

8/8
These recaps give further context on the AI Ecologies Lab: hacnumedia.org/creer-avec-l..., raav.org/actuality/qu... and lienmultimedia.com/spip.php?art... . The festival is mutek.org

Dane Carnegie Malenfant @dvnxmvlhdf5.bsky.social · 1d

7/8 The takeaway for the public: consider that training choices like entropy regularization can make systems more robust, which means fewer restarts and less costly retraining when the world shifts. Your learning systems become more durable and efficient.

1 1

Dane Carnegie Malenfant @dvnxmvlhdf5.bsky.social · 1d

6/8 To make it more visually fun, I teamed up with the Société des arts technologiques (sat.qc.ca) to create an experience using open-source Ossia Score's particle clouds, audio, and 3D transforms in real time while the agents learned. ossia.io

1 1

Dane Carnegie Malenfant @dvnxmvlhdf5.bsky.social · 1d

5/8
Both agents must unlearn and relocate the reward peak. The entropy-max agent stays a bit uncertain and keeps exploring, so it detects the shift faster and adapts sooner.

1 1

Dane Carnegie Malenfant @dvnxmvlhdf5.bsky.social · 1d

4/8
To communicate this to a general audience and the #art community, I built a minimal task: two Gaussian bandits. One agent optimizes with entropy; the other doesn’t. Mid-training, the reward distribution jumps.

1 1
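The bandit setup described in this post can be sketched roughly as follows. This is a minimal illustration, not the code used in the demo: it assumes a continuous-action bandit with a Gaussian-shaped reward peak, a Gaussian policy trained with plain REINFORCE, and an entropy bonus weighted by a hypothetical coefficient `alpha`. All function names, hyperparameters, and peak locations are invented for the example.

```python
import math
import random


def reward(action, peak, width=1.0):
    """Gaussian-shaped reward bump centred on `peak` (assumed form)."""
    return math.exp(-((action - peak) ** 2) / (2.0 * width ** 2))


def gaussian_entropy(sigma):
    """Differential entropy of a 1-D Gaussian with standard deviation sigma."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)


def train(alpha, steps=4000, shift_at=2000, lr=0.02, seed=0):
    """REINFORCE on a Gaussian policy; alpha weights the entropy bonus.

    alpha = 0.0 gives the agent without entropy regularization.
    Returns a list of (mu, sigma, reward) tuples, one per step.
    """
    rng = random.Random(seed)
    mu, log_sigma, baseline = 0.0, 0.0, 0.0
    history = []
    for t in range(steps):
        # The reward peak jumps mid-training (locations are assumptions).
        peak = 1.0 if t < shift_at else -3.0
        sigma = math.exp(log_sigma)
        action = rng.gauss(mu, sigma)
        r = reward(action, peak)
        advantage = r - baseline
        baseline += 0.05 * (r - baseline)  # running-average baseline

        # Score-function gradients of log N(action; mu, sigma).
        z = (action - mu) / sigma
        grad_mu = z / sigma
        grad_log_sigma = z * z - 1.0

        # The Gaussian entropy is log(sigma) + const, so its gradient
        # with respect to log_sigma is exactly 1.
        mu += lr * advantage * grad_mu
        log_sigma += lr * (advantage * grad_log_sigma + alpha)
        # Clip for numerical stability (a choice made for this sketch).
        log_sigma = min(max(log_sigma, -2.0), 1.5)

        history.append((mu, math.exp(log_sigma), r))
    return history
```

To compare the two agents, one could run `train(alpha=0.0)` and `train(alpha=0.05)` and plot `mu` and `sigma` around step `shift_at`. How quickly each agent relocates the peak will depend on the seed and the hyperparameters chosen here.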

Dane Carnegie Malenfant @dvnxmvlhdf5.bsky.social · 1d

3/8
By training systems this way, agents should handle non-stationary changes better. Yet outside research circles, “AI” ≈ only LLMs or generative models. RL, on the other hand, is a learning paradigm largely unknown to the public.

1 1

Dane Carnegie Malenfant @dvnxmvlhdf5.bsky.social · 1d

2/8
I proposed a reinforcement-learning (RL) demo: add a maximum-entropy term to increase the longevity of systems in a non-stationary environment. This is well known to the RL research community: openreview.net/forum?id=PtS...
(photo by Félix Bonne-Vie)

1 1
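For readers unfamiliar with the maximum-entropy term mentioned in this post, it is commonly written as an entropy bonus added to the expected reward. The form below is the generic bandit version, not necessarily the exact objective used in the demo; the temperature \alpha is an assumed symbol.

```latex
J(\pi) = \mathbb{E}_{a \sim \pi}\left[ r(a) \right] + \alpha \, \mathcal{H}(\pi),
\qquad
\mathcal{H}(\pi) = -\mathbb{E}_{a \sim \pi}\left[ \log \pi(a) \right]
```

With \alpha = 0 this reduces to ordinary expected-reward maximization. A larger \alpha rewards the policy for staying spread out over actions.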

Dane Carnegie Malenfant @dvnxmvlhdf5.bsky.social · 1d

1/8
A month ago I wrapped a 4-month project with MuTek Forum’s AI Ecologies Lab (led by Sarah Mackenzie), the research arm of Montréal’s 25-year-old electronic music festival. Topic: why entropy can make AI more resilient. Event: ra.co/events/2206981

1 1

Dane Carnegie Malenfant @dvnxmvlhdf5.bsky.social · 6d

My eye colour apparently changed after 6 years

Reposted by Dane Carnegie Malenfant

Eugene Vinitsky 🍒 @eugenevinitsky.bsky.social · 6d

We're finally out of stealth: percepta.ai
We're a research / engineering team working together in industries like health and logistics to ship ML tools that drastically improve productivity. If you're interested in ML and RL work that matters, come join us 😀

Percepta | A General Catalyst Transformation Company

Transforming critical institutions using applied AI. Let's harness the frontier.

7 18 120

Dane Carnegie Malenfant @dvnxmvlhdf5.bsky.social · 7d

I am on one transformer paper from 3 years ago and ICLR flooded my bids with RLVR & RLHF :S

Reposted by Dane Carnegie Malenfant

Pete Shaw @ptshaw.bsky.social · 7d

w/ James Cohan, @jacobeisenstein.bsky.social, and Kristina Toutanova

Paper link: arxiv.org/abs/2509.22445

Bridging Kolmogorov Complexity and Deep Learning: Asymptotically Optimal Description Length Objectives for Transformers

The Minimum Description Length (MDL) principle offers a formal framework for applying Occam's razor in machine learning. However, its application to neural networks such as Transformers is challenging...

1 1

Reposted by Dane Carnegie Malenfant

Charlotte Volk @charlottevolk.bsky.social · 8d

A huge thank you to my collaborators @shahabbakht.bsky.social and Christopher Pack for their guidance on this project. We’d love to hear your thoughts and comments!

The preprint: www.biorxiv.org/content/10.1...

The curriculum effect in visual learning: the role of readout dimensionality

Generalization of visual perceptual learning (VPL) to unseen conditions varies across tasks. Previous work suggests that training curriculum may be integral to generalization, yet a theoretical explan...

1 6

Reposted by Dane Carnegie Malenfant

Charlotte Volk @charlottevolk.bsky.social · 8d

9. We hypothesized that the efficacy of the learning curricula depends on how many distinct, useful visual features the brain recruits to solve the task - curricula which lead learners to rely on fewer, more essential visual features will result in better generalization.

1 1 2

Reposted by Dane Carnegie Malenfant

Charlotte Volk @charlottevolk.bsky.social · 8d

5. In this study, we leveraged ANNs to develop a mechanistic predictive theory of learning generalization in humans. Specifically, we wanted to understand the role of **learning curriculum**, and develop a theory of how curriculum affects generalization.

1 1 2

Reposted by Dane Carnegie Malenfant

Charlotte Volk @charlottevolk.bsky.social · 8d

🚨 New preprint alert!

🧠🤖
We propose a theory of how learning curriculum affects generalization through neural population dimensionality. Learning curriculum is a determining factor of neural dimensionality - where you start from determines where you end up.
🧠📈

A 🧵:

tinyurl.com/yr8tawj3

The curriculum effect in visual learning: the role of readout dimensionality

1 22 69

Reposted by Dane Carnegie Malenfant

Michelle Cyca @michellecyca.com · 9d

so fucking tiresome to get emails like this whenever I write about residential school history, truly. people who believe that graves don't exist if they can't see the bodies with their own two eyes possess the critical thinking skills of a baby playing peekaboo.

I hope you are doing well.

I read your article on the Walrus on the reality of the current state on implementation of recommendations from that TRC. It truly is unfortunate that implementing these recommendations isn't proceeding with alacrity.

One are that is confusing for me is the truth around the Kamloops mass grave site. In your article, you state, "discovery of unmarked graves on the grounds of the former Kamloops Indian Residential School". However, follow up work hasn't found any mass graves. I have tried to find primary sources on discovery of actual mass graves without success.

Can you please share primary sources on this? I have spoken to others who state that though ground penetrating radar found some suggestions of graves, follow up digging did not find any actual graves.

Appreciate any help you can provide. Thank you.

6 29 140

Dane Carnegie Malenfant @dvnxmvlhdf5.bsky.social · 10d

Hanover’s Oktoberfest honouring hip hop’s best

Reposted by Dane Carnegie Malenfant

paulseesequasis @paulseesequasis.bsky.social · 15d

'A nice day to wash clothes' | Allen Sapp (Cree) 1928-2015

Private Collection

An acrylic painting by self-taught Cree artist Allen Sapp (1928-2015) of a woman, in blue and red, kneeling by a pond, washing clothes, with birch trees around.

11 49

Reposted by Dane Carnegie Malenfant

Nanda H Krishna @nandahkrishna.bsky.social · 18d

Excited to share that POSSM has been accepted to #NeurIPS2025! See you in San Diego 🏖️

Avery HW Ryoo @averyryoo.bsky.social · Jun 6

New preprint! 🧠🤖

How do we build neural decoders that are:
⚡️ fast enough for real-time use
🎯 accurate across diverse tasks
🌍 generalizable to new sessions, subjects, and even species?

We present POSSM, a hybrid SSM architecture that optimizes for all three of these axes!

🧵1/7

1 3 10

Dane Carnegie Malenfant @dvnxmvlhdf5.bsky.social · 18d

Particularly, I started presenting a validation experiment of the self-correction term.

Rather than "if x then y" this tested "if not x then not y".

This inhibits learning the sub-policy for maximizing collective reward. Agents compete even with a larger reward signal not to

Appendix Figure 15 with two panels. Panel a plots percent cumulative reward and collective success across 11,000 episodes for PG agents trained with a negated self-correction term: rewards trend upward but success rate decreases, with pronounced variance spikes. Panel b shows nine independent simulations (each averaged over 32 parallel environments) where reward curves dip sharply and recover, indicating agents compete for the single key and avoid dropping it, reducing collective success

Dane Carnegie Malenfant @dvnxmvlhdf5.bsky.social · 18d

I had a lot of fun attending and presenting the Challenge of Hidden Gifts at @ewrl18.bsky.social!
openreview.net/forum?id=gbs...

Tübingen is a great city and it was nice to get out of the North American machine learning bubble.

1 5

Reposted by Dane Carnegie Malenfant

Jonathan Tsay @tsay.bsky.social · 23d

The cerebellum isn’t just about coordinating movement. It’s implicated in nearly every domain of cognition—from language to social behavior.

But how exactly does the cerebellum contribute to action and cognition? 🧵

Check out our new paper w/ Rich Ivry.
arxiv.org/abs/2509.09818

Cerebellar Contributions to Action and Cognition: Prediction, Timescale, and Continuity

The cerebellum is implicated in nearly every domain of human cognition, yet our understanding of how this subcortical structure contributes to cognition remains elusive. Efforts on this front have ten...

6 20 67