Maria Teleki
@mariateleki.bsky.social
Howdy 🤠 | PhD in CS @ Texas A&M
🎙️ #speech #AI #NLP #recsys
🐶 Apollo’s human | 🛶 Rowing to 1M meters
🌐 https://mariateleki.github.io/
📄 buff.ly/S0DSZzt
⚽️ Xiangjue Dong (1st author), Cong Wang, Millenium Bismay, and James Caverlee
#NLP #NLPResearch #LLMs #GenAI #AI
November 11, 2025 at 4:38 PM
🌟 To me, this work is super exciting because we take a totally different perspective: we show that ⬆️ diverse perspectives, ⬆️ system performance, so ⬆️ $$$ for a company! With this work, we argue that <<< 🚨 diverse perspectives are absolutely necessary >>> from an economic standpoint.
November 11, 2025 at 4:38 PM
You always hear about the "bias-accuracy tradeoff," meaning that ⬇️ model bias, ⬇️ system performance, so ⬇️ $$$ for a company. So much of the conversation around bias and diversity has focused on how to incentivize companies to debias their models (e.g., through new legislation).
November 11, 2025 at 4:38 PM
Choosing an ASR system isn’t one-size-fits-all — it depends on the disfluencies in your domain.

📄 www.isca-archive.org/interspeech_...
November 3, 2025 at 6:02 PM
I’m working on methods & evaluation frameworks for conversational AI that are:
✅ Robust to disfluencies
✅ Reliable in noisy, real-world conditions
✅ Generalizable across contexts

If conversational AI is going to truly work for everyone, it must be built for human speech as it is.
October 20, 2025 at 6:29 PM
These insights still apply to anyone working on conversational AI, spoken summarization, or voice-driven interfaces today.

📄 Read more: www.isca-archive.org/interspeech_...

#SpeechProcessing #ConversationalAI #VoiceAI #Disfluency #SpokenLanguage
#INTERSPEECH
October 15, 2025 at 5:05 PM
We compared two systems on 82,000+ podcast episodes. We found:
👉 WhisperX better captures interjections like “uh” and “um”
👉 Google ASR better captures edited nodes (e.g., “let’s go to Target--Walmart”)

🌟 The type of disfluency matters when choosing an ASR system
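A minimal sketch of the kind of check this suggests, assuming you have the same episode transcribed by both systems: count how often each transcript preserves filled-pause interjections. The file paths and the simple regex are illustrative assumptions, not the paper's pipeline (which also covers edited nodes).

```python
# Rough sketch: compare filled-pause preservation across two ASR transcripts
# of the same audio. File names and the regex are illustrative assumptions.
import re
from pathlib import Path

FILLED_PAUSE = re.compile(r"\b(uh|um|uhm|erm?)\b", re.IGNORECASE)

def count_filled_pauses(transcript: str) -> int:
    """Count filled-pause tokens ('uh', 'um', ...) in a transcript."""
    return len(FILLED_PAUSE.findall(transcript))

def compare_systems(whisperx_path: str, google_path: str) -> None:
    """Print filled-pause counts for two transcripts of the same episode."""
    for name, path in [("WhisperX", whisperx_path), ("Google ASR", google_path)]:
        text = Path(path).read_text(encoding="utf-8")
        print(f"{name}: {count_filled_pauses(text)} filled pauses")

if __name__ == "__main__":
    # Hypothetical file paths for one podcast episode transcribed by each system.
    compare_systems("episode_whisperx.txt", "episode_google.txt")
```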
October 15, 2025 at 5:05 PM
🧭 9 actionable recommendations for deployment -- some are surprising! 😱
📄 Paper: arxiv.org/pdf/2509.20321
💻 Code: github.com/mariateleki/...
September 25, 2025 at 5:40 PM
🔬 DRES provides the first controlled benchmark for evaluating LLMs on disfluency removal.
✅ Controlled evaluation on gold transcripts (no ASR noise) sets an upper bound
📊 Systematic comparison across open & proprietary LLMs
🧪 First taxonomy of LLM error modes
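A minimal sketch of the evaluation idea (not the DRES harness itself): give an LLM a gold, disfluent transcript, ask it to remove disfluencies, and score the output against a fluent reference. The prompt wording, the `call_llm` stub, and token-level F1 as the score are assumptions for illustration.

```python
# Sketch of LLM-based disfluency removal on a gold transcript, scored
# against a fluent reference. Prompt and call_llm stub are assumptions.
from collections import Counter

PROMPT = ("Remove disfluencies (filled pauses, repetitions, self-corrections) "
          "from the following transcript. Return only the cleaned text.\n\n{text}")

def call_llm(prompt: str) -> str:
    """Stub: plug in any open or proprietary LLM client here."""
    raise NotImplementedError

def token_f1(hypothesis: str, reference: str) -> float:
    """Token-overlap F1 between the model's cleaned text and the fluent reference."""
    hyp, ref = Counter(hypothesis.lower().split()), Counter(reference.lower().split())
    overlap = sum((hyp & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

def evaluate(disfluent: str, fluent_reference: str) -> float:
    """Clean one gold transcript with the LLM and score it."""
    cleaned = call_llm(PROMPT.format(text=disfluent))
    return token_f1(cleaned, fluent_reference)
```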
September 25, 2025 at 5:40 PM
🔎 Z-Scores reveal model weaknesses by disfluency type — EDITED, INTJ, and PRN — providing diagnostic insights that guide targeted improvements.

📄 Paper: arxiv.org/abs/2509.20319
💻 Code: github.com/mariateleki/...
Z-Scores: A Metric for Linguistically Assessing Disfluency Removal
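An illustrative sketch of type-level diagnostics in the spirit of Z-Scores, not the paper's exact formula: given gold disfluency spans labeled EDITED, INTJ, or PRN, measure what fraction of each category the model's cleaned output actually removed. The annotation format here is an assumption.

```python
# Per-category removal rates as a stand-in for type-level diagnostics.
# The (category, span_text) annotation format is an assumption.
import re
from collections import defaultdict

def removal_rate_by_type(cleaned: str, spans: list[tuple[str, str]]) -> dict[str, float]:
    """spans: (category, disfluent_text) pairs annotated in the gold transcript."""
    removed, total = defaultdict(int), defaultdict(int)
    cleaned_lower = cleaned.lower()
    for category, text in spans:
        total[category] += 1
        # A span counts as removed if it no longer appears (word-bounded) in the output.
        pattern = r"\b" + re.escape(text.lower()) + r"\b"
        if not re.search(pattern, cleaned_lower):
            removed[category] += 1
    return {category: removed[category] / total[category] for category in total}

# Toy example with hypothetical annotations:
spans = [("INTJ", "um"), ("EDITED", "let's go to Target"), ("PRN", "you know")]
print(removal_rate_by_type("let's go to Walmart", spans))  # every span removed here
```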
September 25, 2025 at 5:22 PM