Raphaël Merx
@rapha.dev
PhD @ UniMelb NLP, with a healthy dose of MT. Based in 🇮🇩, worked in 🇹🇱 🇵🇬, from 🇫🇷
rapha.dev
kudos to whoever came up with that paper name 👌
rapha.dev
in Vienna for ACL, presenting Tulun, a system for low-resource in-domain translation, using LLMs
Tuesday @ 4pm

Working w 2 real use cases: medical translation into Tetun 🇹🇱 & disaster relief speech translation in Bislama 🇻🇺
rapha.dev
Cool paper, at the intersection of grammar and LLM interpretability.

I like that they use linguistic datasets for their experiments, then get results that can contribute to linguistics as a field too! (on structural priming vs L1/L2)
catherinearnett.bsky.social
My paper with @tylerachang.bsky.social and @jamichaelov.bsky.social will appear at #ACL2025NLP! The updated preprint is available on arxiv. I look forward to chatting about bilingual models in Vienna!
catherinearnett.bsky.social
✨New pre-print✨ Crosslingual transfer allows models to leverage their representations for one language to improve performance on another language. We characterize the acquisition of shared representations in order to better understand how and when crosslingual transfer happens.
rapha.dev
Thanks a lot! I didn't make it to Albuquerque unfortunately, but I hope to be in Vienna for ACL. Might see you there?
rapha.dev
(3) The vast majority of usage is on mobile (over 90% of users / over 80k devices)

Takeaway: publishing MT models in mobile apps is probably more impactful than setting up a website / HuggingFace space.
rapha.dev
(2) Translation into Tetun is in higher demand (by >2x) than translation from Tetun

Takeaway for us MT folks: focus on translation into low-res langs, harder but more impactful
rapha.dev
We find that
(1) a LOT of usage is for educational purposes (>50% of translated text)
--> contrasts sharply with Tetun corpora (e.g. MADLAD), dominated by news & religion.

Takeaway: don't evaluate MT on overrepresented domains (e.g. religion)! You risk misrepresenting end-user exp.
rapha.dev
Our paper on who uses tetun.org, and what for, got published at the LoResMT 2025 workshop! An emotional paper for me, going back to the project that got me into a machine learning PhD in the first place.
rapha.dev
Very interesting findings, particularly the benefit (or lack thereof) of test-time scaling across domains
rapha.dev
My favourite ICLR paper so far. Methodology, findings and their implications are all very cool.

In particular Fig. 2 + this discussion point:
rapha.dev
Incredible paper, finding that large companies can game the LMArena through statistical noise (via many model submissions), over-sampling of their models, and overfitting to Arena-style prompts (without real gains on model reasoning)

The experiments they run to show this are pretty cool too!
sarahooker.bsky.social
It is critical for scientific integrity that we trust our measure of progress.

The @lmarena.bsky.social has become the go-to evaluation for AI progress.

Our release today demonstrates the difficulty in maintaining fair evaluations on the Arena, despite best intentions.
rapha.dev
Cool summary of issues with multilingual LLM eval, and potential solutions!

If you're doubtful of all these non-reproducible evals on translated multiple choice questions, this paper is for you
juliakreutzer.bsky.social
📖New preprint with Eleftheria Briakou @swetaagrawal.bsky.social @mziizm.bsky.social @kocmitom.bsky.social!

arxiv.org/abs/2504.11829

🌍It reflects experiences from my personal research journey: coming from MT into multilingual LLM research I missed reliable evaluations and evaluation research…
Reposted by Raphaël Merx
pyconau.bsky.social
👋 Hey Bluesky!

We’ve just touched down and we’re excited to be here 🌤️🐍

This is the official PyCon AU account, your go-to space for updates, announcements, and all things Python in Australia✨

Hit that follow button and stay tuned because we’ve got some awesome things coming your way!

#PyConAU
rapha.dev
AI dev tools. In particular agents: are they hype or useful or both?
rapha.dev
Perceptricon
rapha.dev
The right thing to do, thanks for this *SEM
rapha.dev
Super impactful, thank you for this! A natural sequel of Gatitos.

I'm esp. fond of your "researcher in the loop" method to ensure wide vocab coverage.
Reposted by Raphaël Merx
iseeaswell.bsky.social
😼SMOL DATA ALERT! 😼Announcing SMOL, a professionally-translated dataset for 115 very low-resource languages! Paper: arxiv.org/pdf/2502.12301
Huggingface: huggingface.co/datasets/goo...
Reposted by Raphaël Merx
direkris.itch.io
Been hearing a lot about recency bias lately. Must be pretty important
rapha.dev
Such a well put together video! Gherkins in the background got a supporting role
rapha.dev
Congrats! I'm just getting started but really liked your papers. Cool, impactful and well-written
rapha.dev
Our paper on generating bilingual example sentences with LLMs got best paper award @ ALTA in Canberra!

arxiv.org/abs/2410.03182

We work with French / Indonesian / Tetun, find that annotators don't agree about what's a "good example", but that LLMs can align with a specific annotator.