Maria Teleki
@mariateleki.bsky.social
Howdy 🤠 | PhD in CS @ Texas A&M
🎙️ #speech #AI #NLP #recsys
🐶 Apollo’s human | 🛶 Rowing to 1M meters
🌐 https://mariateleki.github.io/
📄 buff.ly/S0DSZzt
⚽️ Xiangjue Dong (1st author), Cong Wang, Millenium Bismay, and James Caverlee
#NLP #NLPResearch #LLMs #GenAI #AI
November 11, 2025 at 4:38 PM
🌟 To me, this work is super exciting because we take a totally different perspective: we show that ⬆️ diverse perspectives, ⬆️ system performance, so ⬆️ $$$ for a company! With this work, we argue that <<< 🚨 diverse perspectives are absolutely necessary >>> from an economic standpoint.
November 11, 2025 at 4:38 PM
You always hear about the "bias-accuracy tradeoff," meaning that ⬇️ model bias, ⬇️ system performance, so ⬇️ $$$ for a company. So much of the conversation around bias and diversity has focused on how to incentivize companies to debias their models (e.g., through new legislation).
November 11, 2025 at 4:38 PM
Choosing an ASR system isn’t one-size-fits-all — it depends on the disfluencies in your domain.

📄 www.isca-archive.org/interspeech_...
November 3, 2025 at 6:02 PM
I’m working on methods & evaluation frameworks for conversational AI that are:
✅ Robust to disfluencies
✅ Reliable in noisy, real-world conditions
✅ Generalizable across contexts

If conversational AI is going to truly work for everyone, it must be built for human speech as it is.
October 20, 2025 at 6:29 PM
These insights still apply to anyone working on conversational AI, spoken summarization, or voice-driven interfaces today.

📄 Read more: www.isca-archive.org/interspeech_...

#SpeechProcessing #ConversationalAI #VoiceAI #Disfluency #SpokenLanguage
#INTERSPEECH
October 15, 2025 at 5:05 PM
We compared two systems on 82,000+ podcast episodes. We found:
👉 WhisperX better captures interjections like “uh” and “um”
👉 Google ASR better captures edited nodes (e.g., “let’s go to Target--Walmart”)

🌟 The type of disfluency matters when choosing an ASR system
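A minimal sketch of the kind of check this suggests, assuming you have the same episode transcribed by both systems: count how often each transcript preserves filled-pause interjections. The file paths and the simple regex are illustrative assumptions, not the paper's pipeline (which also covers edited nodes).

```python
# Rough sketch: compare filled-pause preservation across two ASR transcripts
# of the same audio. File names and the regex are illustrative assumptions.
import re
from pathlib import Path

FILLED_PAUSE = re.compile(r"\b(uh|um|uhm|erm?)\b", re.IGNORECASE)

def count_filled_pauses(transcript: str) -> int:
    """Count filled-pause tokens ('uh', 'um', ...) in a transcript."""
    return len(FILLED_PAUSE.findall(transcript))

def compare_systems(whisperx_path: str, google_path: str) -> None:
    """Print filled-pause counts for two transcripts of the same episode."""
    for name, path in [("WhisperX", whisperx_path), ("Google ASR", google_path)]:
        text = Path(path).read_text(encoding="utf-8")
        print(f"{name}: {count_filled_pauses(text)} filled pauses")

if __name__ == "__main__":
    # Hypothetical file paths for one podcast episode transcribed by each system.
    compare_systems("episode_whisperx.txt", "episode_google.txt")
```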
October 15, 2025 at 5:05 PM
🧭 9 actionable recommendations for deployment -- some are surprising! 😱
📄 Paper: arxiv.org/pdf/2509.20321
💻 Code: github.com/mariateleki/...
September 25, 2025 at 5:40 PM
🔬 DRES provides the first controlled benchmark for evaluating LLMs on disfluency removal.
✅ Controlled evaluation on gold transcripts (no ASR noise) sets an upper bound
📊 Systematic comparison across open & proprietary LLMs
🧪 First taxonomy of LLM error modes
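A minimal sketch of the evaluation idea (not the DRES harness itself): give an LLM a gold, disfluent transcript, ask it to remove disfluencies, and score the output against a fluent reference. The prompt wording, the `call_llm` stub, and token-level F1 as the score are assumptions for illustration.

```python
# Sketch of LLM-based disfluency removal on a gold transcript, scored
# against a fluent reference. Prompt and call_llm stub are assumptions.
from collections import Counter

PROMPT = ("Remove disfluencies (filled pauses, repetitions, self-corrections) "
          "from the following transcript. Return only the cleaned text.\n\n{text}")

def call_llm(prompt: str) -> str:
    """Stub: plug in any open or proprietary LLM client here."""
    raise NotImplementedError

def token_f1(hypothesis: str, reference: str) -> float:
    """Token-overlap F1 between the model's cleaned text and the fluent reference."""
    hyp, ref = Counter(hypothesis.lower().split()), Counter(reference.lower().split())
    overlap = sum((hyp & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

def evaluate(disfluent: str, fluent_reference: str) -> float:
    """Clean one gold transcript with the LLM and score it."""
    cleaned = call_llm(PROMPT.format(text=disfluent))
    return token_f1(cleaned, fluent_reference)
```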
September 25, 2025 at 5:40 PM
🔎 Z-Scores reveal model weaknesses by disfluency type — EDITED, INTJ, and PRN — providing diagnostic insights that guide targeted improvements.

📄 Paper: arxiv.org/abs/2509.20319
💻 Code: github.com/mariateleki/...
Z-Scores: A Metric for Linguistically Assessing Disfluency Removal
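An illustrative sketch of type-level diagnostics in the spirit of Z-Scores, not the paper's exact formula: given gold disfluency spans labeled EDITED, INTJ, or PRN, measure what fraction of each category the model's cleaned output actually removed. The annotation format here is an assumption.

```python
# Per-category removal rates as a stand-in for type-level diagnostics.
# The (category, span_text) annotation format is an assumption.
import re
from collections import defaultdict

def removal_rate_by_type(cleaned: str, spans: list[tuple[str, str]]) -> dict[str, float]:
    """spans: (category, disfluent_text) pairs annotated in the gold transcript."""
    removed, total = defaultdict(int), defaultdict(int)
    cleaned_lower = cleaned.lower()
    for category, text in spans:
        total[category] += 1
        # A span counts as removed if it no longer appears (word-bounded) in the output.
        pattern = r"\b" + re.escape(text.lower()) + r"\b"
        if not re.search(pattern, cleaned_lower):
            removed[category] += 1
    return {category: removed[category] / total[category] for category in total}

# Toy example with hypothetical annotations:
spans = [("INTJ", "um"), ("EDITED", "let's go to Target"), ("PRN", "you know")]
print(removal_rate_by_type("let's go to Walmart", spans))  # every span removed here
```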
September 25, 2025 at 5:22 PM