Lightnews — Scholar-powered news

Raphaël Merx @rapha.dev · 3d

kudos to whoever came up with that paper name 👌

Raphaël Merx @rapha.dev · Jul 27

paper: aclanthology.org/2025.acl-dem...
demo: youtu.be/fQFwOxzR4MI

Tulun: Transparent and Adaptable Low-resource Machine Translation

Raphael Merx, Hanna Suominen, Lois Yinghui Hong, Nick Thieberger, Trevor Cohn, Ekaterina Vylomova. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: Sy...

Raphaël Merx @rapha.dev · Jul 27

in Vienna for ACL, presenting Tulun, a system for low-resource in-domain translation, using LLMs
Tuesday @ 4pm

Working w 2 real use cases: medical translation into Tetun 🇹🇱 & disaster relief speech translation in Bislama 🇻🇺

Raphaël Merx @rapha.dev · Jun 8

Cool paper, at the intersection of grammar and LLM interpretability.

I like that they use linguistic datasets for their experiments, then get results that can contribute to linguistics as a field too! (on structural priming vs L1/L2)

Catherine Arnett @ 🍁COLM🍁 @catherinearnett.bsky.social · Jun 5

My paper with @tylerachang.bsky.social and @jamichaelov.bsky.social will appear at #ACL2025NLP! The updated preprint is available on arxiv. I look forward to chatting about bilingual models in Vienna!

Catherine Arnett @ 🍁COLM🍁 @catherinearnett.bsky.social · Mar 7

✨New pre-print✨ Crosslingual transfer allows models to leverage their representations for one language to improve performance on another language. We characterize the acquisition of shared representations in order to better understand how and when crosslingual transfer happens.

Raphaël Merx @rapha.dev · May 26

Thanks a lot! I didn't make it to Albuquerque unfortunately, but I hope to be in Vienna for ACL. Might see you there?

Raphaël Merx @rapha.dev · May 25

Many thanks to Adérito Correia (Timor-Leste INL), and my supervisors Hanna Suominen and Katerina Vylomova!

Paper at aclanthology.org/2025.loresmt... , video presentation at youtu.be/8zenieJWRyg

Low-resource Machine Translation: what for? who for? An observational study on a dedicated Tetun language translation service

Raphael Merx, Adérito José Guterres Correia, Hanna Suominen, Ekaterina Vylomova. Proceedings of the Eighth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2025). 20...

Raphaël Merx @rapha.dev · May 25

(3) The vast majority of usage is on mobile (over 90% of users / over 80k devices)

Takeaway: publishing MT models in mobile apps is probably more impactful than setting up a website / HuggingFace space.

Raphaël Merx @rapha.dev · May 25

(2) Translation into Tetun is in higher demand (by >2x) than translation from Tetun

Takeaway for us MT folks: focus on translation into low-res langs, harder but more impactful

Raphaël Merx @rapha.dev · May 25

We find that
(1) a LOT of usage is for educational purposes (>50% of translated text)
--> contrasts sharply with Tetun corpora (e.g. MADLAD), dominated by news & religion.

Takeaway: don't evaluate MT on overrepresented domains (e.g. religion)! You risk misrepresenting end-user exp.

Raphaël Merx @rapha.dev · May 25

Our paper on who uses tetun.org, and what for, got published at the LoResMT 2025 workshop! An emotional paper for me, going back to the project that got me into a machine learning PhD in the first place.

Raphaël Merx @rapha.dev · May 13

Very interesting findings, particularly the benefit (or lack thereof) of test-time scaling across domains

Raphaël Merx @rapha.dev · May 8

My favourite ICLR paper so far. Methodology, findings and their implications are all very cool.

In particular Fig. 2 + this discussion point:

Raphaël Merx @rapha.dev · May 2

Incredible paper, finding that large companies can game the LMArena through statistical noise (via many model submissions), over-sampling of their models, and overfitting to Arena-style prompts (without real gains on model reasoning)

The experiments they run to show this are pretty cool too!

Sara Hooker @sarahooker.bsky.social · Apr 30

It is critical for scientific integrity that we trust our measure of progress.

The @lmarena.bsky.social has become the go-to evaluation for AI progress.

Our release today demonstrates the difficulty in maintaining fair evaluations on the Arena, despite best intentions.

Raphaël Merx @rapha.dev · Apr 23

Cool summary of issues with multilingual LLM eval, and potential solutions!

If you're doubtful of all these non-reproducible evals on translated multiple choice questions, this paper is for you

Julia Kreutzer @juliakreutzer.bsky.social · Apr 17

📖New preprint with Eleftheria Briakou @swetaagrawal.bsky.social @mziizm.bsky.social @kocmitom.bsky.social!

arxiv.org/abs/2504.11829

🌍It reflects experiences from my personal research journey: coming from MT into multilingual LLM research I missed reliable evaluations and evaluation research…

Screenshot of the paper header with title and author list and affiliations

Raphaël Merx @rapha.dev · Apr 11

GlotEval - a unified framework for multilingual eval of LLMs, on 7 different tasks, by @tiedeman.bsky.social @helsinki-nlp.bsky.social

Just wish it supported eval of closed models (e.g. through LiteLLM?)

github.com/MaLA-LM/Glot...

GitHub - MaLA-LM/GlotEval: GlotEval: a unified evaluation toolkit designed to benchmark Large Language Models (LLMs) in a language-specific way

Reposted by Raphaël Merx

PyCon AU @pyconau.bsky.social · Mar 30

👋 Hey Bluesky!

We’ve just touched down and we’re excited to be here 🌤️🐍

This is the official PyCon AU account, your go-to space for updates, announcements, and all things Python in Australia✨

Hit that follow button and stay tuned because we’ve got some awesome things coming your way!

#PyConAU

Raphaël Merx @rapha.dev · Mar 31

AI dev tools. In particular agents: are they hype or useful or both?

Raphaël Merx @rapha.dev · Mar 26

Perceptricon

Raphaël Merx @rapha.dev · Mar 17

The right thing to do, thanks for this *SEM

Raphaël Merx @rapha.dev · Feb 20

Super impactful, thank you for this! A natural sequel of Gatitos.

I'm esp. fond of your "researcher in the loop" method to ensure wide vocab coverage.

Reposted by Raphaël Merx

iseeaswell.bsky.social @iseeaswell.bsky.social · Feb 19

😼SMOL DATA ALERT!😼 Announcing SMOL, a professionally translated dataset for 115 very low-resource languages! Paper: arxiv.org/pdf/2502.12301
Huggingface: huggingface.co/datasets/goo...

Reposted by Raphaël Merx

Kris 🎃 Lorischild @direkris.itch.io · Jan 15

Been hearing a lot about recency bias lately. Must be pretty important

Raphaël Merx @rapha.dev · Feb 17

Such a well put together video! Gherkins in the background got a supporting role

Raphaël Merx @rapha.dev · Jan 24

Congrats! I'm just getting started but really liked your papers. Cool, impactful and well-written

Raphaël Merx @rapha.dev · Dec 5

Our paper on generating bilingual example sentences with LLMs got the best paper award @ ALTA in Canberra!

arxiv.org/abs/2410.03182

We work with French / Indonesian / Tetun, find that annotators don't agree about what's a "good example", but that LLMs can align with a specific annotator.
