Gautier Hamon
@hamongautier.bsky.social
54 followers 80 following 11 posts
PhD student at INRIA Flowers team. MVA master reytuag.github.io/gautier-hamon/
Pinned
hamongautier.bsky.social
1/ ⚡️ Looking for a fast and simple Transformer baseline for your RL environment in JAX?
Sharing my implementation of transformerXL-PPO: github.com/Reytuag/tran...
The implementation is the first to reach the 3rd floor and obtain advanced achievements in the challenging Craftax environment.
Reposted by Gautier Hamon
mcvjetko.bsky.social
Complex cell-like structures in Flow Lenia
Reposted by Gautier Hamon
cartathomas.bsky.social
🚀 Introducing 🧭MAGELLAN—our new metacognitive framework for LLM agents! It predicts its own learning progress (LP) in vast natural language goal spaces, enabling efficient exploration of complex domains.🌍✨Learn more: 🔗 arxiv.org/abs/2502.07709 #OpenEndedLearning #LLM #RL
MAGELLAN: Metacognitive predictions of learning progress guide...
Open-ended learning agents must efficiently prioritize goals in vast possibility spaces, focusing on those that maximize learning progress (LP). When such autotelic exploration is achieved by LLM...
arxiv.org
Reposted by Gautier Hamon
ccolas.bsky.social
we are recruiting interns for a few projects with @pyoudeyer in bordeaux
> studying llm-mediated cultural evolution with @nisioti_eleni @Jeremy__Perez
> balancing exploration and exploitation with autotelic rl with @ClementRomac
details and links in 🧵
please share!
hamongautier.bsky.social
8/ For the curious, here are the achievement success rates on Craftax over the course of training, for 1e9 steps (left) and for 4e9 steps (right).
[Figures: achievement success rates on Craftax with transformerXL PPO, 1e9 steps (left) and 4e9 steps (right)]
hamongautier.bsky.social
7/ The JAX ecosystem in RL is currently blooming with wonderful open-source projects from others, which I have linked at the bottom of the repository: github.com/Reytuag/tran...
This work was done at @FlowersINRIA.
Also, feel free to reach out to me if you have questions or suggestions!
GitHub - Reytuag/transformerXL_PPO_JAX
hamongautier.bsky.social
6/ A potential next step would be to test it on Xland-Minigrid, to see how it does on an open-ended meta-RL environment: github.com/dunnolab/xla...
I'm also curious to implement Muesli (arxiv.org/abs/2104.06159) with transformerXL, as in arxiv.org/abs/2301.07608
hamongautier.bsky.social
5/ Here is the training curve from training for 1e9 steps, alongside the PPO and PPO-RNN scores provided in the Craftax repo.
Note that PPO-RNN was already beating other baselines that use Unsupervised Environment Design and intrinsic motivation: arxiv.org/pdf/2402.16801
hamongautier.bsky.social
4/ Tested on the challenging Craftax environment from github.com/MichaelTMatt... (with little hyperparameter tuning), it obtained higher returns than PPO-RNN in 1e9 steps.
Training it for longer let it reach the 3rd floor in Craftax, making it the first to get advanced achievements.
GitHub - MichaelTMatthews/Craftax: (Crafter + NetHack) in JAX. ICML 2024 Spotlight.
hamongautier.bsky.social
3/
Training a 3M-parameter Transformer for 1e6 steps in MemoryChain-bsuite (from gymnax) takes 10s on an A100 (with 512 envs).
Training a 5M-parameter Transformer for 1e9 steps in Craftax takes ~6h on a single A100 (with 1024 envs).
We also support multi-GPU training.
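For a rough sense of scale, the timings quoted above can be turned into throughput figures. This is only a back-of-the-envelope sketch: it assumes the step counts are total environment steps summed over all parallel envs, it treats "~6h" as exactly 6 hours, and the helper name `steps_per_second` is made up for illustration (it is not part of the repository).

```python
# Back-of-the-envelope throughput from the timings quoted in the post.
# Assumption: step counts are total environment steps across all parallel envs.

def steps_per_second(total_steps: float, wall_clock_seconds: float) -> float:
    """Average number of environment steps processed per second."""
    return total_steps / wall_clock_seconds

# MemoryChain-bsuite: 1e6 steps in 10 s with 512 envs.
memorychain_sps = steps_per_second(1e6, 10)

# Craftax: 1e9 steps in ~6 h with 1024 envs (6 h taken as exact here).
craftax_sps = steps_per_second(1e9, 6 * 3600)
craftax_sps_per_env = craftax_sps / 1024

print(f"MemoryChain: {memorychain_sps:,.0f} steps/s")
print(f"Craftax: {craftax_sps:,.0f} steps/s")
print(f"Craftax per env: {craftax_sps_per_env:.1f} steps/s")
```

Under those assumptions this gives on the order of 1e5 steps per second for MemoryChain and a bit under 5e4 steps per second for Craftax.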
hamongautier.bsky.social
The video encoding might not do it full justice.
Paper: direct.mit.edu/isal/proceed...
hamongautier.bsky.social
Putting some Flow Lenia here too
Reposted by Gautier Hamon
handle.invalid
Now that @jeffclune.bsky.social and @joelbot3000.bsky.social are here, time for an Open-Endedness starter pack.

go.bsky.app/MdVxrtD
Reposted by Gautier Hamon
nicolasyax.bsky.social
🚨New preprint🚨
When testing LLMs with questions, how can we know they did not see the answer in their training? In this new paper we propose a simple, fast, out-of-the-box method to spot contamination on short texts with @stepalminteri.bsky.social and Pierre-Yves Oudeyer!