Marine Carpuat
@marinecarpuat.bsky.social
Associate Professor in Computer Science at the University of Maryland. Human-Centered Natural Language Processing & Machine Translation
marinecarpuat.bsky.social
I'm so happy to see this and so sad I missed the talk! The field would not be the same without Kathy in so many ways, from her research contributions to her generous mentorship of many of us beyond her advisees.
marinecarpuat.bsky.social
This is a big team effort with

Omri Asscher, Kalika Bali, @luisabentivogli.bsky.social, Frédéric Blain, @bowkerl.bsky.social, Monojit Choudhury, @haldaume3.bsky.social, Kevin Duh, Ge Gao, Alvin Grissom II, @markar.bsky.social, Elaine C. Khoong, @wildlewis.bsky.social...
marinecarpuat.bsky.social
We argue for a human-centered approach: MT systems shouldn’t just produce correct outputs; they should support diverse users, goals, and contexts.

But let's not start from scratch: Translation Studies and HCI offer a wealth of theoretical and empirical work to rethink MT as a socio-technical problem.
marinecarpuat.bsky.social
What should Machine Translation research look like in the age of multilingual LLMs?

Here’s one answer from researchers across NLP/MT, Translation Studies, and HCI.
"An Interdisciplinary Approach to Human-Centered Machine Translation"
arxiv.org/abs/2506.13468
Machine Translation (MT) tools are widely used today, often in contexts where professional translators are not present. Despite progress in MT technology, a gap persists between system development and...
arxiv.org
marinecarpuat.bsky.social
Disagreement between LLMs can be a strength! @dayeonki.bsky.social shows that having multiple LLMs debate improves their answers to culturally variable social norm questions. #ACL2025
dayeonki.bsky.social
1/ Are two #LLMs better than one for equitable cultural alignment? 🌍

We introduce a Multi-Agent Debate framework — where two LLM agents debate the cultural adaptability of a given scenario.

#ACL2025 🧵👇
Reposted by Marine Carpuat
dayeonki.bsky.social
1/ How can a monolingual English speaker 🇺🇸 decide if an automatic French translation 🇫🇷 is good enough to be shared?

Introducing ❓AskQE❓, an #LLM-based Question Generation + Answering framework that detects critical MT errors and provides actionable feedback 🗣️

#ACL2025
marinecarpuat.bsky.social
Life around submission deadlines is much more sane for me when we have a group paper clinic early (2 weeks before the deadline). Sometimes we even have an "extended abstract" clinic 1 month earlier.

That said, I did not follow my own advice this cycle, and I am now in recovery mode too!
Reposted by Marine Carpuat
dayeonki.bsky.social
🚨 New Paper 🚨

1/ We often assume that well-written text is easier to translate ✏️

But can #LLMs automatically rewrite inputs to improve machine translation? 🌍

Here’s what we found 🧵
Reposted by Marine Carpuat
wissamantoun.bsky.social
ModernBERT or DeBERTaV3?

What's driving performance: architecture or data?

To find out we pretrained ModernBERT on the same dataset as CamemBERTaV2 (a DeBERTaV3 model) to isolate architecture effects.

Here are our findings:
Reposted by Marine Carpuat
wissamantoun.bsky.social
CamemBERT 2.0: A Smarter French 🇫🇷 Language Model Aged to Perfection 👌

We release a much-needed update for the previous SOTA French encoder LM.

We introduce two new models, CamemBERTa-v2 and CamemBERT-v2, based on the DeBERTaV3 and RoBERTa recipes.

So what's new?

[1/8]
Reposted by Marine Carpuat
inriaparisnlp.bsky.social
We are happy to announce our next seminar, given by Florian Cafiero @floriancafiero.bsky.social (PSL @ecoledeschartes.bsky.social) entitled "A Riddle in a Haystack: Using Large Language Models for the Detection of Rare Phenomena" on Friday 7th March at 11am CET. Details here: t.co/pPbWfkALM4!
Florian Cafiero - "A Riddle in a Haystack: Using Large Language Models for the Detection of Rare Phenomena" - ALMAnaCH seminar 7th March 2025 at 11am CET
Reposted by Marine Carpuat
rockpang.bsky.social
🤔 Interested in how #HCI thinks about using #LLMs, or looking to understand best practices for human-LLM interaction?

🚨🚨New paper: Understanding the LLM-ification of CHI: Unpacking the Impact of LLMs at CHI through a Systematic Literature Review 🧵
Reposted by Marine Carpuat
iwslt.bsky.social
First up, a new task for 2025:
*Instruction-following for speech processing!*

Explore instruction-following for speech ⇨
Integrate speech foundation models with LLMs across tasks such as speech translation, recognition, summarization, and QA.

🔗: iwslt.org/2025/instruc...
Instruction-following Speech Processing track
Home of the IWSLT conference and SIGSLT.
iwslt.org
Reposted by Marine Carpuat
iwslt.bsky.social
We are pleased to announce that our 2025 shared tasks have launched! Find details and data on our website, with evaluation data to be released April 1!
iwslt.org/2025/#shared...

We will be highlighting one task per day here and the other site. Join us for an exciting year of speech translation!!
IWSLT 2025
marinecarpuat.bsky.social
I'll be in Germany next week to visit TUM Heilbronn and LMU Munich. Looking forward to learning from NLP researchers there and sharing recent work on human-centered machine translation! (And to discovering how much German I can actually understand after 2 weeks on Duolingo 😅)
marinecarpuat.bsky.social
ICYMI: the UMD LSC is looking for a postdoctoral fellow with an interdisciplinary research agenda in language sciences.
languagescience.umd.edu/news/job-opp...

If your interests connect to #NLP research that helps people communicate across languages, please reach out!
Reposted by Marine Carpuat
hellinanigatu.bsky.social
I hope I am not late to the party (was away post-quals chilling) but here are some thoughts on why this is bad IMO:

First, a disclaimer that I am writing this as an African who is a speaker of multiple African languages, NLP researcher of African languages, and HCI researcher focusing broadly on..
abeba.bsky.social
this is a green flag for openai & meta to formally be arbitrators of our languages & mass exploit the population (& researchers that've poured their souls into low resource languages), all to throw unreliable AI that has so far proven to result in more harm than benefit
www.reuters.com/technology/a...
Orange enlists Meta and OpenAI to develop AI language models in Africa
Orange will enlist OpenAI and Meta to fine-tune AI large language models (LLMs) to translate regional African languages for the French telecoms operator, it said on Tuesday.
www.reuters.com
marinecarpuat.bsky.social
Generating new English terms would also be interesting! The paper looks at translating existing English terms. A fundamental challenge for LLMs is that some of the new terms are rare or even unseen in a Common Crawl corpus, but yes, there is lots of potential for LLMs as discovery tools.