Lightnews — Scholar-powered news

Marine Carpuat @marinecarpuat.bsky.social · Jul 30

I'm so happy to see this and so sad I missed the talk! The field would not be the same without Kathy in so many ways, from her research contributions to her generous mentorship of many of us beyond her advisees.

Marine Carpuat @marinecarpuat.bsky.social · Jun 18

[email protected], Mary Nurminen, Doug Oard, @amelija166mp.bsky.social, Michel Simard, @yvofr.bsky.social

Marine Carpuat @marinecarpuat.bsky.social · Jun 18

This is a big team effort with

Omri Asscher, Kalika Bali, @luisabentivogli.bsky.social, Frédéric Blain, @bowkerl.bsky.social, Monojit Choudhury, @haldaume3.bsky.social, Kevin Duh, Ge Gao, Alvin Grissom II, @markar.bsky.social, Elaine C. Khoong, @wildlewis.bsky.social...

Marine Carpuat @marinecarpuat.bsky.social · Jun 18

We started this conversation at an NII Shonan seminar last year and wrote a survey highlighting key directions that emerged. Let us know what you think!

arxiv.org/abs/2506.13468

An Interdisciplinary Approach to Human-Centered Machine Translation

Machine Translation (MT) tools are widely used today, often in contexts where professional translators are not present. Despite progress in MT technology, a gap persists between system development and...

arxiv.org

Marine Carpuat @marinecarpuat.bsky.social · Jun 18

We argue for a human-centered approach: MT systems shouldn’t just produce correct outputs; they should support diverse users, goals, and contexts.

But let's not start from scratch: Translation Studies and HCI offer a wealth of theoretical and empirical work to rethink MT as a socio-technical problem.

Marine Carpuat @marinecarpuat.bsky.social · Jun 18

What should Machine Translation research look like in the age of multilingual LLMs?

Here’s one answer from researchers across NLP/MT, Translation Studies, and HCI.
"An Interdisciplinary Approach to Human-Centered Machine Translation"
arxiv.org/abs/2506.13468

Marine Carpuat @marinecarpuat.bsky.social · Jun 17

Welcome @sarahwiegreffe.bsky.social !!!

Marine Carpuat @marinecarpuat.bsky.social · Jun 16

Tell me you're in France without telling me: live news coverage of high school philosophy exams!
2025 Bac de philo questions:
- Does our future depend on technology?
- Is truth always convincing?
- Or discuss an excerpt from Rawls’ Theory of Justice.
www.lemonde.fr/campus/live/...

Live: 2025 philosophy bac. Answers to your questions about the topics: "The written philosophy exam validates students' ability to step back a little from questions whose answer ...

"Does our future depend on technology?", "Is truth always convincing?", or "Do we need art?" for the essay, John Rawls and Adam Smith for the text commentary...

www.lemonde.fr

Marine Carpuat @marinecarpuat.bsky.social · Jun 13

Disagreement between LLMs can be a strength! @dayeonki.bsky.social shows that having multiple LLMs debate improves their answers to culturally variable social norm questions. #ACL2025

Dayeon (Zoey) Ki @dayeonki.bsky.social · Jun 12

1/ Are two #LLMs better than one for equitable cultural alignment? 🌍

We introduce a Multi-Agent Debate framework — where two LLM agents debate the cultural adaptability of a given scenario.

#ACL2025 🧵👇

Reposted by Marine Carpuat

Dayeon (Zoey) Ki @dayeonki.bsky.social · May 21

1/ How can a monolingual English speaker 🇺🇸 decide if an automatic French translation 🇫🇷 is good enough to be shared?

Introducing ❓AskQE❓, an #LLM-based Question Generation + Answering framework that detects critical MT errors and provides actionable feedback 🗣️

#ACL2025

Marine Carpuat @marinecarpuat.bsky.social · May 21

Life around submission deadlines is much more sane for me when we have a group paper clinic early (2 weeks before the deadline). Sometimes we even have an "extended abstract" clinic 1 month earlier.

That said, I did not follow my own advice this cycle, and I am now in recovery mode too!

Reposted by Marine Carpuat

Dayeon (Zoey) Ki @dayeonki.bsky.social · Apr 17

🚨 New Paper 🚨

1/ We often assume that well-written text is easier to translate ✏️

But can #LLMs automatically rewrite inputs to improve machine translation? 🌍

Here’s what we found 🧵

Reposted by Marine Carpuat

Wissam Antoun @wissamantoun.bsky.social · Apr 14

ModernBERT or DeBERTaV3?

What's driving performance: architecture or data?

To find out we pretrained ModernBERT on the same dataset as CamemBERTaV2 (a DeBERTaV3 model) to isolate architecture effects.

Here are our findings:

Marine Carpuat @marinecarpuat.bsky.social · Mar 20

Congratulations @arijriabi.bsky.social! 🎉

Reposted by Marine Carpuat

Wissam Antoun @wissamantoun.bsky.social · Nov 15

CamemBERT 2.0: A Smarter French 🇫🇷 Language Model Aged to Perfection 👌

We release a much-needed update for the previous SOTA French encoder LM.

We introduce two new models CamemBERTa-v2 and CamemBERT-v2, based on the DeBERTaV3 and RoBERTa recipe.

So what's new?

[1/8]

Reposted by Marine Carpuat

Inria Paris NLP (ALMAnaCH team) @inriaparisnlp.bsky.social · Mar 5

We are happy to announce our next seminar, given by Florian Cafiero @floriancafiero.bsky.social (PSL @ecoledeschartes.bsky.social) entitled "A Riddle in a Haystack: Using Large Language Models for the Detection of Rare Phenomena" on Friday 7th March at 11am CET. Details here: t.co/pPbWfkALM4!

Florian Cafiero - "A Riddle in a Haystack: Using Large Language Models for the Detection of Rare Phenomena" - ALMAnaCH seminar 7th March 2025 at 11am CET

Reposted by Marine Carpuat

Rock Pang @rockpang.bsky.social · Jan 31

🤔 Interested in how #HCI thinks about using #LLMs, or looking to understand best practices for human-LLM interaction?

🚨🚨New paper: Understanding the LLM-ification of CHI: Unpacking the Impact of LLMs at CHI through a Systematic Literature Review 🧵

Reposted by Marine Carpuat

IWSLT @iwslt.bsky.social · Jan 28

First up, a new task for 2025:
*Instruction-following for speech processing!*

Explore instruction-following for speech ⇨
Integrate speech foundation models with LLMs across tasks such as speech translation, recognition, summarization, and QA.

🔗: iwslt.org/2025/instruc...

Instruction-following Speech Processing track

Home of the IWSLT conference and SIGSLT.

iwslt.org

Reposted by Marine Carpuat

IWSLT @iwslt.bsky.social · Jan 28

We are pleased to announce that our 2025 shared tasks have launched! Find details and data on our website, with evaluation data to be released April 1!
iwslt.org/2025/#shared...

We will be highlighting one task per day here and the other site. Join us for an exciting year of speech translation!!

IWSLT 2025

Home of the IWSLT conference and SIGSLT.

iwslt.org

Marine Carpuat @marinecarpuat.bsky.social · Jan 23

I'll be in Germany next week to visit TUM Heilbronn and LMU Munich. Looking forward to learning from NLP researchers there and sharing recent work on human-centered machine translation! (And to discovering how much German I can actually understand after 2 weeks on Duolingo 😅)

Reposted by Marine Carpuat

Gabriele Sarti @gsarti.com · Dec 10

Our piece is finally out in the Imminent blog! 🎉 It presents preliminary findings of our recent study evaluating the usefulness of word-level quality estimation in real-world post-editing settings (paper forthcoming)! 🧵1/

imminent.translated.com/can-word-lev...

Can Word-level Quality Estimation Inform and Improve Machine Translation Post-editing? - Imminent - Translated's Research Center

Imminent is Translated’s Research Center which supports companies in localization, funds language data resear...

imminent.translated.com

Marine Carpuat @marinecarpuat.bsky.social · Dec 11

ICYMI: the UMD LSC is looking for a postdoctoral fellow with an interdisciplinary research agenda in language sciences.
languagescience.umd.edu/news/job-opp...

If your interests connect to #NLP research that helps people communicate across languages, please reach out!

Marine Carpuat @marinecarpuat.bsky.social · Dec 10

Interesting to see how Le Monde uses AI: MT (English articles via DeepL + postediting!), TTS, video captioning and translation, proofreading, and experimenting with rewriting content from news agency to their style specs www.lemonde.fr/le-monde-et-...

How does "Le Monde" use AI?

In keeping with its commitments, "Le Monde" publishes an exhaustive list of its newsroom's uses of editorial assistance tools based on generative artificial intelligence.

www.lemonde.fr

Reposted by Marine Carpuat

Hellina Hailu Nigatu @hellinanigatu.bsky.social · Dec 2

I hope I am not late to the party (was away post-quals chilling) but here are some thoughts on why this is bad IMO:

First, a disclaimer that I am writing this as an African who is a speaker of multiple African languages, NLP researcher of African languages, and HCI researcher focusing broadly on..

Dr Abeba Birhane @abeba.bsky.social · Nov 26

this is a green flag for openai & meta to formally be arbitrators of our languages & mass exploit the population (& researchers that've poured their souls into low resource languages), all to throw unreliable AI that has so far proven to result in more harm than benefit
www.reuters.com/technology/a...

Orange enlists Meta and OpenAI to develop AI language models in Africa

Orange will enlist OpenAI and Meta to fine-tune AI large language models (LLMs) to translate regional African languages for the French telecoms operator, it said on Tuesday.

www.reuters.com

Marine Carpuat @marinecarpuat.bsky.social · Nov 27

Generating new English terms would also be interesting! The paper looks at translating existing English terms. A fundamental challenge for LLMs is that some of the new terms are rare or even unseen in a Common Crawl corpus, but yes, there is lots of potential for LLMs as discovery tools.
