Jean Barré
@jbarre.bsky.social
410 followers 430 following 65 posts
PhD student @ École Normale Supérieure in Paris. Working in the Computational Literary Studies field on literary evolution of novel subgenres & canonization process + fr-BookNLP implementation w/ @labolattice.bsky.social https://crazyjeannot.github.io/
Reposted by Jean Barré
dorialexander.bsky.social
And new paper out: Pleias 1.0: the First Family of Language Models Trained on Fully Open Data

How we trained an open-everything model in a new pretraining environment, with releasable data (Common Corpus) and an open-source framework (Nanotron from HuggingFace).

www.sciencedirect.com/science/arti...
Reposted by Jean Barré
jbcamps.bsky.social
We're officially launching the new PSL CultureLab in 10 days!
If you're interested in the research of a collective bridging Computational Humanities, Social Sciences and Cultural Evolution, you can check our programme (and come to our event, if you're in Paris on 22 September):
psl.eu/agenda/collo...
Inaugural conference of the CultureLab major research programme | PSL
Research: CultureLab launches its work on 22 September 2025 at Campus Condorcet with a day devoted to computational humanities and social sciences and to cultural evolution. Le Gra...
psl.eu
Reposted by Jean Barré
artjomshl.bsky.social
✍️ Our paper is finally out!

All poetic forms come from somewhere, but figuring out their relationships is hard.

We use sequence alignment on scansion (010.10) to measure metrical similarity between poems. This allows us to detect related forms across languages and times 1/
tinyurl.com/metronome25
Metronome: tracing variation in poetic meters via local sequence alignment | Computational Humanities Research | Cambridge Core
Metronome: tracing variation in poetic meters via local sequence alignment - Volume 1
www.cambridge.org
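For readers unfamiliar with local sequence alignment, here is a minimal sketch of the general idea applied to scansion strings such as "010.10". It is a plain Smith-Waterman scorer, not the Metronome implementation: the scoring values, the function names, and the normalisation by the shorter string are all assumptions made for illustration, and every character (including ".") is treated as an ordinary symbol.

```python
# Illustrative scoring values, NOT taken from the paper.
MATCH, MISMATCH, GAP = 2, -1, -1


def local_alignment_score(a: str, b: str) -> int:
    """Return the best Smith-Waterman local alignment score of two strings."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            # A cell never drops below 0: this is what makes the alignment local.
            curr[j] = max(0, prev[j - 1] + step, prev[j] + GAP, curr[j - 1] + GAP)
            best = max(best, curr[j])
        prev = curr
    return best


def metrical_similarity(a: str, b: str) -> float:
    """Scale the score to [0, 1] by the best score the shorter string allows.

    This normalisation is a hypothetical choice made for the sketch.
    """
    if not a or not b:
        return 0.0
    return local_alignment_score(a, b) / (MATCH * min(len(a), len(b)))


if __name__ == "__main__":
    print(metrical_similarity("010.10", "010.10"))
    print(metrical_similarity("0101", "1010"))
```

Because only the best-matching local region counts, a short metrical pattern embedded in a longer line can still score highly, which is the property that makes local alignment attractive for comparing related forms.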
Reposted by Jean Barré
tedunderwood.com
New this morning, a Comment I contributed to Nature Computational Science on the interaction between large language models and the humanities. 🧪 🤖 #MLSky

rdcu.be/etk07

The link above will be open-access for a month — plus, I'll reply to this post with a link to a permanently open preprint. +
The impact of language models on the humanities and vice versa
Nature Computational Science - Many humanists are skeptical of language models and concerned about their effects on universities. However, researchers with a background in the humanities are also...
rdcu.be
jbarre.bsky.social
I had fun presenting some of my PhD obsessions about the French detective novel in Würzburg.
Thank you @fotisjannidis.bsky.social for the invitation! The whole team is impressive: brand new building and talented people. The future of DH is actually here 🤩
Reposted by Jean Barré
lucy3.bsky.social
"Tell, Don't Show" was accepted to #ACL2025 Findings!

Our conceptually intuitive, lightweight approach for literary topic modeling combines the new (language models) with the old (classic LDA) to yield better topics. ✨📚 arxiv.org/abs/2505.23166
A screenshot of the title, authors, abstract, and figure 1 of the paper. Text in the abstract begins with: "Conventional bag-of-words approaches for topic modeling, like latent Dirichlet allocation (LDA), struggle with literary text. Literature challenges lexical methods because narrative language focuses on immersive sensory details instead of abstractive description or exposition: writers are advised to show, don’t tell. We propose Retell, a simple, accessible topic modeling approach for literature. Here, we prompt resource-efficient, generative language models (LMs) to tell what passages show, thereby translating narratives’ surface forms into higher-level concepts and themes."
jbarre.bsky.social
That’s all for now! This is just preliminary research on the formal evolution of French literature—stay tuned for more!
jbarre.bsky.social
What drives this tense revolution? Our OLS models reveal it’s genre above all, with time period and canonicity as secondary factors. 🔍
Regression coefficients for subgenres, publication periods & “canonical” status—bars show effect sizes relative to the baseline. Genres explain the lion’s share of the shift!
jbarre.bsky.social
This article evaluates different drivers of change: genre conventions, canonicity, or simple temporal drift?🤔
First, we mapped the passé simple’s co-movement over 200 years 🔄, revealing a clear split between past tenses (imparfait, plus-que-parfait) and present tenses (présent, futur, passé composé)
Smoothed correlations of passé simple with past tenses (imparfait, plus-que-parfait) in warm colors vs. present tenses (présent, futur, passé composé) in cool colors. Notice their uncoupling around WWII—hinting at a shift from narrated to "discussed" modes of storytelling.
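A rough sketch of how a correlation between two tense-frequency series can be tracked over time: a Pearson correlation computed in a sliding window. The window-based approach, the function names, and the toy inputs are assumptions made for illustration; the posts do not say how the correlations in the figure were actually computed or smoothed.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (NaN if one is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def rolling_correlation(xs, ys, window):
    """Correlation of xs and ys inside each consecutive window of the given size."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    if window < 2 or window > len(xs):
        raise ValueError("window must be between 2 and the series length")
    return [
        pearson(xs[i:i + window], ys[i:i + window])
        for i in range(len(xs) - window + 1)
    ]


if __name__ == "__main__":
    # Made-up yearly frequencies, for demonstration only.
    tense_a = [0.30, 0.29, 0.27, 0.24, 0.20, 0.15]
    tense_b = [0.20, 0.19, 0.18, 0.19, 0.21, 0.24]
    print(rolling_correlation(tense_a, tense_b, window=3))
```

A sign change in the resulting series is the kind of "uncoupling" the figure description refers to, although real data would need a much wider window than this toy example.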
jbarre.bsky.social
New little paper “The times are a-changin’: présent vs passé simple in French novels (1811–2024)”👉 hal.science/hal-04984105
With Simon Gabay and @floriancafiero.bsky.social
#dhbenelux2025
In French fiction, use of past tenses (especially the passé simple) has collapsed over the last 150 years. So why?
The times are a-changin': présent vs passé simple in French novels (1811-2024)
The use of présent and passé simple in French has undergone profound changes in recent centuries. By means of a large corpus of novels, we observe major trends that we attempt to describe and explain....
hal.science
Reposted by Jean Barré
rnv.bsky.social
🚨New pre-print 🚨

News articles often convey different things in text vs. image. Recent work in computational framing analysis has analysed the article text but the corresponding images in those articles have been overlooked.
We propose multi-modal framing analysis of news: arxiv.org/abs/2503.20960
jbarre.bsky.social
The two squares / junctions at the Pont de l'Alma, especially on the north side: either traffic is completely jammed, or motor vehicles are going faster than 50. The avenues leading off them are no better (av. George V, av. Rapp, plus av. Bosquet, which is closed to cyclists, wtf).
Reposted by Jean Barré
comphumresearch.bsky.social
🚨 Our Call for Papers is out! 🚨

We continue our tradition of providing a dedicated platform for presenting computational work that bridges formal methods and traditional inquiry in the arts and humanities.

Check out the website for all details: 2025.computational-humanities-research.org/cfp/
Reposted by Jean Barré
paulecohen.bsky.social
Le Monde reporting that a French scientist traveling to Houston to attend a conference was denied entry to US after a search of his phone & computer revealed messages critical of Trump's science cuts, "which [says CBP] conveyed hatred of Trump & could be qualified as terrorism". Computer confiscated
Reposted by Jean Barré
mellymeldubs.bsky.social
Excited to share our preprint "Provocations from the Humanities for Generative AI Research"

We're open to feedback—read & share thoughts!

@laurenfklein.bsky.social @mmvty.bsky.social @docdre.distributedblackness.net @mariaa.bsky.social @jmjafrx.bsky.social @nolauren.bsky.social @dmimno.bsky.social
Screenshot of the first page of preprint, "Provocations from the Humanities for Generative AI Research," by Lauren Klein, Meredith Martin, Andre Brock, Maria Antoniak, Melanie Walsh, Jessica Marie Johnson, Lauren Tilton, and David Mimno
Reposted by Jean Barré
acerbialberto.com
New cultural evolution modelling paper with @bdecourson.bsky.social on @pnas.org!
"Weak individual preferences stabilize culture"
A quick 🧵
www.pnas.org/doi/10.1073/...
Reposted by Jean Barré
sobchuk.bsky.social
Change over time is often depicted as a trendline. But what shapes a trendline? Which forces? Our new paper presents a method for “decomposing” trendlines into their constituent forces. Also, we tackle an old puzzle: Does culture change “one funeral at a time”? 🧵(1/8) doi.org/10.1098/rspb...
a schematic depiction of a trend line and several causal forces that give it its shape
Reposted by Jean Barré
wjbmattingly.bsky.social
My video on spaCy layout is now out! This is probably my favorite update from @explosion-ai.bsky.social (and that's saying something!) This package makes it simple to do region detection, table detection, and OCR with just 1 line of Python.

Video: youtu.be/quJtzVxoMtE

#MachineLearning
Best Way to OCR a PDF in Python - spaCy Layout
YouTube video by Python Tutorials for Digital Humanities
youtu.be
jbarre.bsky.social
Little baseline for French - 14 years of mean absolute error
Good old sklearn linear regression, 3000 novels
A rushed but successful experiment in front of students 🪄
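To make the "mean absolute error in years" figure concrete, here is a small self-contained sketch of fitting a line and scoring it with MAE. It deliberately swaps sklearn's linear regression for a hand-written one-feature least-squares fit so that it runs without dependencies. The single feature, the toy numbers, and the function names are hypothetical; the post does not say which features or target were used in the classroom experiment.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature is constant, slope is undefined")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Made-up data: one hypothetical stylistic feature per novel and its year.
    train_feature = [0.0, 1.0, 2.0, 3.0]
    train_year = [1800, 1850, 1900, 1950]
    slope, intercept = fit_line(train_feature, train_year)

    test_feature = [0.5, 2.5]
    test_year = [1830, 1915]
    predicted = [slope * x + intercept for x in test_feature]
    print(mean_absolute_error(test_year, predicted))
```

An MAE of 14 would mean that, on average, the predicted year lands 14 years away from the true one, which is why the metric is easy to explain in front of students.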
Reposted by Jean Barré
howard.fm
I'll get straight to the point.

We trained 2 new models. Like BERT, but modern. ModernBERT.

Not some hypey GenAI thing, but a proper workhorse model, for retrieval, classification, etc. Real practical stuff.

It's much faster, more accurate, longer context, and more useful. 🧵
jbarre.bsky.social
Pinging @oseminck.bsky.social's new Bluesky account. Go follow her!
Reposted by Jean Barré
dorialexander.bsky.social
“They said it could not be done”. We’re releasing Pleias 1.0, the first suite of models trained on open data (either permissively licensed or uncopyrighted): Pleias-3b, Pleias-1b and Pleias-350m, all based on the two trillion tokens set from Common Corpus.
jbarre.bsky.social
Thank you, #chr2024! What a week 🤩 - the problem now is that I have 15 new article ideas, and I might write 4 new PhD proposals 😭