Lightnews — Scholar-powered news

Janet Liu

@janetlauyeung.bsky.social

730 followers 150 following 20 posts

🏫 asst. prof. of compling at university of pittsburgh past: 🛎️ postdoc @mainlp.bsky.social, LMU Munich 🤠 PhD in CompLing from Georgetown 🕺🏻 x2 intern @Spotify @SpotifyResearch https://janetlauyeung.github.io/

Pinned

Janet Liu @janetlauyeung.bsky.social · Jul 10

🦙 how well do LLMs encode discourse knowledge? does that generalize across languages?

🛎️ in our #ACL2025 paper, we uncover fascinating trends about multilingual discourse representations!

joint work w/ @florian-eichin.com @barbaraplank.bsky.social @mhedderich.bsky.social

📄 arxiv.org/abs/2503.10515

1 3 16

Janet Liu @janetlauyeung.bsky.social · 26d

🤚🏼 co-organizing a workshop on the 10th!

Reposted by Janet Liu

Verena Blaschke @verenablaschke.bsky.social · Aug 7

At #Interspeech2025 I'm going to present Betthupferl, a dataset for German dialect ASR & dialect-to-standard speech translation! We analyze differences between dialectal & Standard German transcriptions, benchmark ASR models, and examine shortcomings of current ASR models & evaluation metrics.

Paper title ("A multi-dialectal dataset for German dialect ASR and dialect-to-standard speech translation") and a map of the German state Bavaria showing where the Franconian, Bavarian, and Alemannic dialect groups are spoken

1 4 16

Reposted by Janet Liu

MaiNLP lab, LMU Munich @mainlp.bsky.social · Jul 27

Unsure which presentations to attend at #ACL2025? 🛎️🗣️

MaiNLP lab, LMU Munich @mainlp.bsky.social · Jul 23

Headed to ACL? MaiNLP & our most recent work will be there too👥📄
Come see what we’ve been working on!

2 4

Janet Liu @janetlauyeung.bsky.social · Jul 23

🕺🏼swing by our poster in Hall 4/5 on Wednesday, July 30 at 11:00 to chat with @florian-eichin.com and me to find out the answers to these questions

🛎️ bonus: to see the full poster 🫣🧩

#ACL2025 #NLProc

part of the poster presentation at ACL 2025

1 3

Reposted by Janet Liu

Florian Eichin @florian-eichin.com · Jul 23

Some recommendations for #ACL2025 👇
(join me and @janetlauyeung.bsky.social to talk about discourse generalization and probing!)

MaiNLP lab, LMU Munich @mainlp.bsky.social · Jul 23

Headed to ACL? MaiNLP & our most recent work will be there too👥📄
Come see what we’ve been working on!

1 3

Reposted by Janet Liu

MaiNLP lab, LMU Munich @mainlp.bsky.social · Jul 23

Headed to ACL? MaiNLP & our most recent work will be there too👥📄
Come see what we’ve been working on!

1 5 14

Janet Liu @janetlauyeung.bsky.social · Jul 10

💡 more findings, error analysis, and in-depth discussion are in our paper:

📄 arxiv.org/abs/2503.10515
🤖 github.com/mainlp/disco...

meet and chat with us at our poster in Vienna 🇦🇹 at #ACL2025NLP

🕰️ 11:00-12:30, Wednesday, July 30
📍 Hall 4/5 Session 12: IP-Posters

Janet Liu @janetlauyeung.bsky.social · Jul 10

🔍 finding 3: discourse representations are best aligned across languages in the intermediate layers

Layer-wise probe performance by languages. Mean accuracy over five runs.

1 1

Janet Liu @janetlauyeung.bsky.social · Jul 10

🌍 finding 2: our probes generalize across languages and language families

Mean accuracy over five runs of the Aya-23-35B-probe trained and tested on various partitions of DISRPT.

1 1

Janet Liu @janetlauyeung.bsky.social · Jul 10

📌 finding 1: model size alone does not lead to discourse probing success; instead, multilingual training, dataset composition, and language-specific factors play significant roles

Mean accuracy over five runs of the probing classifiers trained on the entire DISRPT and full attention representations. The reference system DisCoDisCo achieved a mean accuracy of 47.9% (the red dashed line).

1 1

Janet Liu @janetlauyeung.bsky.social · Jul 10

🧪 for 23 SOTA LLMs, we use a probing approach to test whether their representations encode information relevant to discourse relation classification on DISRPT 2023, which covers 13 languages, four frameworks, 26 datasets, and various genres, domains, and modalities

1 2

Janet Liu @janetlauyeung.bsky.social · Jul 10

❓problem: discourse relations are central to NLU, but current work is primarily fragmented across frameworks & languages

🔧 solution: we proposed a unified label set of 17 relations across 4 discourse frameworks. This lets us compare model behavior across corpora, languages, and annotation schemes

Examples of the core discourse relation CONDITION (Bunt and Prasad, 2016) annotated in different frameworks and languages using different labels.

the proposed unified label set (see definitions and examples in the appendix of the paper)

1 1

Reposted by Janet Liu

Leonie Weissweiler @weissweiler.bsky.social · Jun 11

I'm looking for a reviewer for a paper on measuring syntactic productivity (lots of maths!) due a week from now. Please shoot me an email if you could review!

Reposted by Janet Liu

Verena Blaschke @verenablaschke.bsky.social · May 30

Bavarian dialect speakers needed! Our MSc student Miriam wants to find out 1. how good/bad LLM-generated "Bavarian" is, and 2. whether dialect speakers agree with each other on this. The survey takes <5 min: survey.ifkw.lmu.de/dialquali25/ Thank you for sharing/participating!

3 3

Janet Liu @janetlauyeung.bsky.social · May 16

@munichcenterml.bsky.social
@slds-lmu.bsky.social
@berd-nfdi.bsky.social

Janet Liu @janetlauyeung.bsky.social · May 16

my amazing co-organizers: @assenmacher.bsky.social Jacob Beck, @barbaraplank.bsky.social , @stephnie.bsky.social, Frauke Kreuter, Gina Walejko

1 1

Janet Liu @janetlauyeung.bsky.social · May 16

🛎️ Excited to announce the 1st Workshop on Bridging NLP and Public Opinion Research at COLM 2025, Oct 10th in Montreal 🇨🇦

As LLMs reshape public discourse and research, collaboration between NLP and Public Opinion Research (POR) is more vital than ever #NLPOR Submit by June 23📄

🔗 tinyurl.com/nlpor25

Welcome to the First Workshop on Bridging NLP and Public Opinion Research, co-located with COLM 2025, October 10, 2025, Montreal, Canada.

1 10 18

Reposted by Janet Liu

BlackboxNLP @blackboxnlp.bsky.social · May 15

BlackboxNLP, the leading workshop on interpretability and analysis of language models, will be co-located with EMNLP 2025 in Suzhou this November! 📆

This edition will feature a new shared task on circuits/causal variable localization in LMs, details here: blackboxnlp.github.io/2025/task

3 8 21

Reposted by Janet Liu

Verena Blaschke @verenablaschke.bsky.social · Apr 29

On my way to #NAACL2025 where I'll give a keynote at the noisy text workshop (WNUT), presenting some of the challenges & methods for dialect NLP + also discussing dialect speakers' perspectives!

🗨️ Beyond “noisy” text: How (and why) to process dialect data
🗓️ Saturday, May 3, 9:30–10:30

1 7 27

Reposted by Janet Liu

Aaron Mueller @amuuueller.bsky.social · Apr 23

Lots of progress in mech interp (MI) lately! But how can we measure when new mech interp methods yield real improvements over prior work?

We propose 😎 𝗠𝗜𝗕: a 𝗠echanistic 𝗜nterpretability 𝗕enchmark!

1 15 49

Reposted by Janet Liu

MaiNLP lab, LMU Munich @mainlp.bsky.social · Apr 1

🎉MaiNLP is turning 3 today!🎂🥳 We’ve grown a lot since @barbaraplank.bsky.social started this group with nothing but three aspiring researchers and a hand-drawn sign on the door. Huge thanks to all the amazing people who have joined or visited us since. Here’s to many more years of exciting research!🚀

The hand-drawn sign from three years ago.

1 9 19

Janet Liu @janetlauyeung.bsky.social · Mar 28

🎯

Janet Liu @janetlauyeung.bsky.social · Mar 26

this 🎯🎯

Janet Liu @janetlauyeung.bsky.social · Mar 24

www.tiktok.com/@thedailysho...

Welcome back, astronauts! A LOT has changed since you left... #DailyShow #NASA #Trump #DesiLydic

TikTok video by The Daily Show