Chantal
@chantalsh.bsky.social
PhD (in progress) @ Northeastern! NLP 🤝 LLMs she/her
chantalsh.bsky.social
(7/7) For more details, please check out our pre-print!
chantalsh.bsky.social
(6/7) LLMs are terrible at detecting their own slop: GPT-5, DeepSeek-V3, and o3-mini rarely assign a label of "slop" (avg. 6% of documents), whereas humans marked 34% of texts as "slop."
chantalsh.bsky.social
(5/7) We lack good/reliable automatic text metrics for 3 of the 5 most important slop features: relevance, coherence, and tone. :-(
chantalsh.bsky.social
(4/7) Different domains have different slop signatures. In news articles, coherence, density, relevance, and tone issues predict slop. In Q&A tasks, it's factuality and structure. Context matters!
chantalsh.bsky.social
(3/7) Humans can spot "sloppy text" but may apply different thresholds in their overall assessments. Still, our annotators consistently flagged the same problematic passages, suggesting we know it when we see it...
chantalsh.bsky.social
(2/7) TL;DR: Measuring the construct of slop is difficult! While somewhat subjective and domain-dependent, it boils down to three key factors: information quality, density, and stylistic choices. We introduce a taxonomy for slop.
chantalsh.bsky.social
"AI slop" seems to be everywhere, but what exactly makes text feel like "slop"?

In our new work (w/ @tuhinchakr.bsky.social, Diego Garcia-Olano, @byron.bsky.social) we provide a systematic attempt at measuring AI "slop" in text!

arxiv.org/abs/2509.19163

🧵 (1/7)
chantalsh.bsky.social
I'm searching for some comp/ling experts to provide a precise definition of “slop” as it refers to text (see: corp.oup.com/word-of-the-...)

I put together a google form that should take no longer than 10 minutes to complete: forms.gle/oWxsCScW3dJU...
If you can help, I'd appreciate your input! 🙏
Oxford Word of the Year 2024 - Oxford University Press
The Oxford Word of the Year 2024 is 'brain rot'. Discover more about the winner, our shortlist, and 20 years of words that reflect the world.
corp.oup.com