Janet Liu
@janetlauyeung.bsky.social
🏫 asst. prof. of compling at university of pittsburgh past: 🛎️ postdoc @mainlp.bsky.social, LMU Munich 🤠 PhD in CompLing from Georgetown 🕺🏻 x2 intern @Spotify @SpotifyResearch https://janetlauyeung.github.io/
Pinned
janetlauyeung.bsky.social
🦙 how well do LLMs encode discourse knowledge? does that generalize across languages?

🛎️ in our #ACL2025 paper, we uncover fascinating trends about multilingual discourse representations!

joint work w/ @florian-eichin.com @barbaraplank.bsky.social @mhedderich.bsky.social

📄 arxiv.org/abs/2503.10515
to appear at ACL2025
janetlauyeung.bsky.social
🤚🏼 co-organizing a workshop on the 10th!
Reposted by Janet Liu
verenablaschke.bsky.social
At #Interspeech2025 I'm going to present Betthupferl, a dataset for German dialect ASR & dialect-to-standard speech translation! We analyze differences between dialectal & Standard German transcriptions, benchmark ASR models, and examine shortcomings of current ASR models & evaluation metrics.
Paper title ("A multi-dialectal dataset for German dialect ASR and dialect-to-standard speech translation") and a map of the German state Bavaria showing where the Franconian, Bavarian, and Alemannic dialect groups are spoken
Reposted by Janet Liu
mainlp.bsky.social
Unsure which presentations to attend at #ACL2025? 🛎️🗣️
mainlp.bsky.social
Headed to ACL? MaiNLP & our most recent work will be there too👥📄
Come see what we’ve been working on!
janetlauyeung.bsky.social
🕺🏼 swing by our poster in Hall 4/5 on Wednesday, July 30 at 11:00 to chat with @florian-eichin.com and me and find out the answers to these questions

🛎️ bonus: to see the full poster 🫣🧩

#ACL2025 #NLProc
part of the poster presentation at ACL 2025
Reposted by Janet Liu
florian-eichin.com
Some recommendations for #ACL2025 👇
(join me and @janetlauyeung.bsky.social to talk about discourse generalization and probing!)
janetlauyeung.bsky.social
💡 more findings, error analysis, and in-depth discussion are in our paper:

📄 arxiv.org/abs/2503.10515
🤖 github.com/mainlp/disco...

meet and chat with us at our poster in Vienna 🇦🇹 at #ACL2025NLP

🕰️ 11:00-12:30, Wednesday, July 30
📍 Hall 4/5 Session 12: IP-Posters
janetlauyeung.bsky.social
🔍 finding 3: discourse representations are best aligned across languages in the intermediate layers
Layer-wise probe performance by languages. Mean accuracy over five runs.
janetlauyeung.bsky.social
🌍 finding 2: our probes generalize across languages and language families
Mean accuracy over five runs of the Aya-23-35B-probe trained and tested on various partitions of DISRPT.
janetlauyeung.bsky.social
📌 finding 1: model size alone does not lead to discourse probing success; instead, multilingual training, dataset composition, and language-specific factors play significant roles
Mean accuracy over five runs of the probing classifiers trained on the entire DISRPT and full attention representations. The reference system DisCoDisCo achieved a mean accuracy of 47.9% (the red dashed line).
janetlauyeung.bsky.social
🧪 for 23 SOTA LLMs, we use a probing approach to test whether their representations encode information relevant to discourse relation classification on DISRPT 2023, which covers 13 languages, four frameworks, 26 datasets, and various genres, domains, and modalities
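For readers new to probing, here is a minimal sketch of the general idea: keep the representations fixed, train a small linear classifier on top of them, and read its held-out accuracy as a signal of how much label information the representations carry. Everything in it is an assumption made for illustration: the toy 2-D vectors stand in for frozen LLM representations, the two made-up labels stand in for discourse relations, and the function names are hypothetical. It is not the paper's implementation.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(weights, biases, x):
    """Linear score for each class: w . x + b."""
    return [sum(wj * xj for wj, xj in zip(w, x)) + b
            for w, b in zip(weights, biases)]


def train_probe(features, labels, num_classes, epochs=200, lr=0.1):
    """Fit a softmax-regression probe with full-batch gradient descent.

    The features are treated as fixed; only the probe's weights are learned.
    """
    dim = len(features[0])
    n = len(features)
    weights = [[0.0] * dim for _ in range(num_classes)]
    biases = [0.0] * num_classes
    for _ in range(epochs):
        grad_w = [[0.0] * dim for _ in range(num_classes)]
        grad_b = [0.0] * num_classes
        for x, y in zip(features, labels):
            probs = softmax(class_scores(weights, biases, x))
            for c in range(num_classes):
                err = probs[c] - (1.0 if c == y else 0.0)
                grad_b[c] += err
                for j in range(dim):
                    grad_w[c][j] += err * x[j]
        for c in range(num_classes):
            biases[c] -= lr * grad_b[c] / n
            for j in range(dim):
                weights[c][j] -= lr * grad_w[c][j] / n
    return weights, biases


def predict(weights, biases, x):
    """Return the index of the highest-scoring class."""
    scores = class_scores(weights, biases, x)
    return scores.index(max(scores))


def toy_representations(rng, n_per_class):
    """Hypothetical stand-in for frozen representations: one 2-D cluster per label."""
    centers = [(-2.0, -2.0), (2.0, 2.0)]
    feats, labels = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n_per_class):
            feats.append([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)])
            labels.append(label)
    return feats, labels


def mean_probe_accuracy(seeds):
    """Train one probe per seed and average the held-out accuracy."""
    accuracies = []
    for seed in seeds:
        rng = random.Random(seed)
        train_x, train_y = toy_representations(rng, 20)
        test_x, test_y = toy_representations(rng, 10)
        weights, biases = train_probe(train_x, train_y, num_classes=2)
        correct = sum(predict(weights, biases, x) == y
                      for x, y in zip(test_x, test_y))
        accuracies.append(correct / len(test_y))
    return sum(accuracies) / len(accuracies)


if __name__ == "__main__":
    print(mean_probe_accuracy(seeds=[0, 1, 2, 3, 4]))
```

Averaging over five seeds mirrors the "mean accuracy over five runs" reported in the figure captions; in a real setup the toy clusters would be replaced by representations extracted from a model.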
janetlauyeung.bsky.social
❓problem: discourse relations are central to NLU, but current work is primarily fragmented across frameworks & languages

🔧 solution: we proposed a unified label set of 17 relations across 4 discourse frameworks. This lets us compare model behavior across corpora, languages, and annotation schemes
Examples of the core discourse relation CONDITION (Bunt and Prasad, 2016) annotated in different frameworks and languages using different labels.
The proposed unified label set (see definitions and examples in the appendix of the paper).
Reposted by Janet Liu
weissweiler.bsky.social
I'm looking for a reviewer for a paper on measuring syntactic productivity (lots of maths!) due a week from now. Please shoot me an email if you could review!
Reposted by Janet Liu
verenablaschke.bsky.social
Bavarian dialect speakers needed! Our MSc student Miriam wants to find out 1. how good/bad LLM-generated "Bavarian" is, and 2. whether dialect speakers agree with each other on this. The survey takes <5 min: survey.ifkw.lmu.de/dialquali25/ Thank you for sharing/participating!
janetlauyeung.bsky.social
my amazing co-organizers: @assenmacher.bsky.social, Jacob Beck, @barbaraplank.bsky.social, @stephnie.bsky.social, Frauke Kreuter, Gina Walejko
janetlauyeung.bsky.social
🛎️ Excited to announce the 1st Workshop on Bridging NLP and Public Opinion Research at COLM 2025, Oct 10th in Montreal 🇨🇦

As LLMs reshape public discourse and research, collaboration between NLP and Public Opinion Research (POR) is more vital than ever. #NLPOR Submit by June 23 📄

🔗 tinyurl.com/nlpor25
Welcome to the First Workshop on Bridging NLP and Public Opinion Research, co-located with COLM 2025, October 10, 2025, Montreal, Canada.
Reposted by Janet Liu
blackboxnlp.bsky.social
BlackboxNLP, the leading workshop on interpretability and analysis of language models, will be co-located with EMNLP 2025 in Suzhou this November! 📆

This edition will feature a new shared task on circuits/causal variable localization in LMs, details here: blackboxnlp.github.io/2025/task
Reposted by Janet Liu
verenablaschke.bsky.social
On my way to #NAACL2025 where I'll give a keynote at the noisy text workshop (WNUT), presenting some of the challenges & methods for dialect NLP + also discussing dialect speakers' perspectives!

🗨️ Beyond “noisy” text: How (and why) to process dialect data
🗓️ Saturday, May 3, 9:30–10:30
Reposted by Janet Liu
amuuueller.bsky.social
Lots of progress in mech interp (MI) lately! But how can we measure when new mech interp methods yield real improvements over prior work?

We propose 😎 𝗠𝗜𝗕: a 𝗠echanistic 𝗜nterpretability 𝗕enchmark!
Logo for MIB: A Mechanistic Interpretability Benchmark
Reposted by Janet Liu
mainlp.bsky.social
🎉MaiNLP is turning 3 today!🎂🥳 We’ve grown a lot since @barbaraplank.bsky.social started this group with nothing but three aspiring researchers and a hand-drawn sign on the door. Huge thanks to all the amazing people who have joined or visited us since. Here’s to many more years of exciting research!🚀
The hand-drawn sign from three years ago.