Lightnews — Scholar-powered news

Reposted by Nathan Godey

Rachel Bawden

@rachelbawden.bsky.social

Read Nathan's thread and (bsky.app/profile/nthn...) to get more details and the paper to get an even better picture: arxiv.org/abs/2510.25771.

Nathan Godey @nthngdy.bsky.social · 17d

Thrilled to release Gaperon, an open LLM suite for French, English and Coding 🧀

We trained 3 models - 1.5B, 8B, 24B - from scratch on 2-4T tokens of custom data

(TLDR: we cheat and get good scores)

@wissamantoun.bsky.social @rachelbawden.bsky.social @bensagot.bsky.social @zehavoc.bsky.social

November 12, 2025 at 11:18 PM

Reposted by Nathan Godey

Rachel Bawden

@rachelbawden.bsky.social

Congratulations to @nthngdy.bsky.social, @wissamantoun.bsky.social and Rian Touchent (who worked under the supervision of @zehavoc.bsky.social, @bensagot.bsky.social, Éric de La Clergerie and me) on the training of these generative models for French, English and code.

November 12, 2025 at 11:18 PM

Reposted by Nathan Godey

Benoît Sagot

@bensagot.bsky.social

I'm proud to share that at @inriaparisnlp.bsky.social we have released Gaperon — a suite of generative language models trained on French, English and code data, the largest of which has 24 billion parameters. Both the models and the code are being published under open licences. Short thread🧵

Inria Paris NLP (ALMAnaCH team) @inriaparisnlp.bsky.social · 13d

We are proud to announce that we trained 1.5B, 8B, and 24B generative language models from scratch on 2 to 4 tera-tokens of carefully curated, high-quality data covering French, English and code. We release our models and code under open-source licences. Thread👇

Summary of the GAPERON-8B training run. Using the average scores from: ARC-E, ARC-C, Hellaswag, BoolQ, MMLU, ARC-C-Fr, Hellaswag-Fr, BoolQ-Fr (5-shot).

November 12, 2025 at 5:26 PM

Reposted by Nathan Godey

Inria Paris NLP (ALMAnaCH team)

@inriaparisnlp.bsky.social

We are proud to announce that we trained 1.5B, 8B, and 24B generative language models from scratch on 2 to 4 tera-tokens of carefully curated, high-quality data covering French, English and code. We release our models and code under open-source licences. Thread👇

November 12, 2025 at 5:05 PM

Nathan Godey

@nthngdy.bsky.social

Thrilled to release Gaperon, an open LLM suite for French, English and Coding 🧀

We trained 3 models - 1.5B, 8B, 24B - from scratch on 2-4T tokens of custom data

(TLDR: we cheat and get good scores)

@wissamantoun.bsky.social @rachelbawden.bsky.social @bensagot.bsky.social @zehavoc.bsky.social

November 7, 2025 at 9:11 PM

Reposted by Nathan Godey

Inria Paris NLP (ALMAnaCH team)

@inriaparisnlp.bsky.social

🏆🤩 We are excited to share the news that @nthngdy.bsky.social, supervised by @bensagot.bsky.social and Éric de la Clergerie, has received the 2025 ATALA Best PhD Dissertation Prize!

You can read his PhD online here: hal.science/tel-04994414/

Nathan Godey receiving the 2025 ATALA best thesis prize at CORIA-TALN 2025.

July 17, 2025 at 9:40 AM

Nathan Godey

@nthngdy.bsky.social

🚀 New Paper Alert! 🚀

We introduce Q-Filters, a training-free method for efficient KV Cache compression!

It is compatible with FlashAttention and can compress along generation which is particularly useful for reasoning models ⚡

TLDR: we make Streaming-LLM smarter using the geometry of attention

March 6, 2025 at 4:02 PM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news