Debora Nozza
@deboranozza.bsky.social
Assistant Professor at Bocconi University in the MilaNLP group • Working in #NLP, #CSS and #Ethics • She/her • #ERCStG PERSONAE
Reposted by Debora Nozza
Found and added under data/
January 20, 2026 at 11:21 AM
Reposted by Debora Nozza
I included some test cases on GitHub; I'll check whether I still have the ones we used in the paper.
January 20, 2026 at 11:11 AM
Reposted by Debora Nozza
If you are curious about the theoretical background, see

Hovy, D., Berg-Kirkpatrick, T., Vaswani, A., & Hovy, E. (2013). Learning Whom to Trust with MACE. In Proceedings of NAACL-HLT. ACL.

aclanthology.org/N13-1132.pdf

And for even more details:

aclanthology.org/Q18-1040.pdf

N/N
January 20, 2026 at 10:20 AM
Reposted by Debora Nozza
I always wanted to revisit it, port it from Java to Python & extend to continuous data, but never found the time.
Last week, I played around with Cursor – and got it all done in ~1 hour. 🤯

If you work with any response data that needs aggregation, give it a try—and let me know what you think!

4/N
January 20, 2026 at 10:17 AM
Reposted by Debora Nozza
MACE estimates:
1. Annotator reliability (who’s consistent?)
2. Item difficulty (which examples spark disagreement?)
3. The most likely aggregate label (the latent “best guess”)

That “side project” ended up powering hundreds of annotation projects over the years.

3/N
January 20, 2026 at 10:15 AM
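For readers who want the intuition behind those three estimates, here is a minimal, hypothetical sketch of an EM-style aggregator in the spirit of MACE. This is not the actual MACE code or its API; the function and variable names are made up for illustration. It assumes each annotator copies the latent true label with some competence probability and guesses uniformly otherwise.

# Illustrative sketch only -- NOT the actual MACE implementation or API.
# A stripped-down EM aggregator in the spirit of MACE: annotator j copies
# the latent true label with competence theta[j], otherwise guesses uniformly.
import numpy as np

def aggregate(annotations, n_labels, n_iter=50):
    # annotations: {item_id: {annotator_id: label}} with labels in 0..n_labels-1
    items = sorted(annotations)
    annotators = sorted({a for votes in annotations.values() for a in votes})
    a_idx = {a: k for k, a in enumerate(annotators)}
    theta = np.full(len(annotators), 0.8)   # initial competence guess
    posteriors = {}

    for _ in range(n_iter):
        # E-step: posterior over each item's latent true label
        for i in items:
            log_p = np.zeros(n_labels)
            for a, lab in annotations[i].items():
                t = theta[a_idx[a]]
                probs = np.full(n_labels, (1 - t) / n_labels)  # vote given a wrong candidate label
                probs[lab] = t + (1 - t) / n_labels            # vote given the right candidate label
                log_p += np.log(probs)
            p = np.exp(log_p - log_p.max())
            posteriors[i] = p / p.sum()
        # M-step: competence = expected share of an annotator's votes matching the true label
        hits = np.zeros(len(annotators))
        counts = np.zeros(len(annotators))
        for i in items:
            for a, lab in annotations[i].items():
                hits[a_idx[a]] += posteriors[i][lab]
                counts[a_idx[a]] += 1
        theta = np.clip(hits / counts, 1e-3, 1 - 1e-3)

    labels = {i: int(posteriors[i].argmax()) for i in items}          # 3. aggregate label
    competence = dict(zip(annotators, theta.tolist()))                # 1. annotator reliability
    difficulty = {i: float(-(posteriors[i] * np.log(posteriors[i] + 1e-12)).sum())
                  for i in items}                                     # 2. posterior entropy as a difficulty proxy
    return labels, competence, difficulty

With a model like this, a spammer who always clicks the same label ends up with low estimated competence, so their votes barely move the aggregate, which is exactly the failure mode of plain majority voting described in the first post of the thread.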
Reposted by Debora Nozza
However, disagreement isn’t just noise—it’s information. It can mean an item is genuinely hard—or someone wasn’t paying attention. If only you knew whom to trust…

That summer, Taylor Berg-Kirkpatrick, Ashish Vaswani, and I built MACE (Multi-Annotator Competence Estimation).

2/N
January 20, 2026 at 10:14 AM
Reposted by Debora Nozza
🚨(Software) Update:

In my PhD, I had a side project to fix an annoying problem: when you ask 5 people to label the same thing, you often get different answers. But in ML (and lots of other analyses), you still need a single aggregated answer. Using the majority vote is easy–but often wrong.

1/N
GitHub - dirkhovy/MACE: Multi-Annotator Competence Estimation tool
Multi-Annotator Competence Estimation tool. Contribute to dirkhovy/MACE development by creating an account on GitHub.
github.com
January 20, 2026 at 10:12 AM
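As a toy illustration of that failure mode (made-up votes, purely hypothetical): three careless annotators who always answer the same way outvote two careful ones, so plain majority voting returns the wrong label.

# Hypothetical toy example of majority voting going wrong.
from collections import Counter

votes = ["0", "0", "0", "1", "1"]               # true label is "1"; the three "0"s are careless clicks
majority = Counter(votes).most_common(1)[0][0]
print(majority)                                 # prints "0": the careless majority wins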
Reposted by Debora Nozza
The deadline is approaching! Join the team :)
⏳ Deadline approaching! We’re hiring 2 fully funded postdocs in #NLP.

Join the MilaNLP team and contribute to our upcoming research projects (SALMON & TOLD)

🔗 Details + how to apply: milanlproc.github.io/open_positio...

⏰ Deadline: Jan 31, 2026
January 20, 2026 at 10:27 AM
Reposted by Debora Nozza
This week at reading group 📚
@pranav-nlp.bsky.social presented the paper "'You Cannot Sound Like GPT': Signs of language discrimination and resistance in computer science publishing".

Paper: arxiv.org/abs/2505.08127

#NLProc
January 23, 2026 at 1:35 PM
Reposted by Debora Nozza
Thank you @belindazli.bsky.social for the great talk "Solving the Specification Problem through Interaction" at our weekly seminar!

#NLProc
January 23, 2026 at 4:26 PM
Reposted by Debora Nozza
⏳ Deadline approaching! We’re hiring 2 fully funded postdocs in #NLP.

Join the MilaNLP team and contribute to our upcoming research projects (SALMON & TOLD)

🔗 Details + how to apply: milanlproc.github.io/open_positio...

⏰ Deadline: Jan 31, 2026
January 19, 2026 at 5:24 PM
Reposted by Debora Nozza
🎉 MilaNLP 2025 Wrapped 🎉
Lots of learning, building, sharing, and growing together 🌱

#NLProc
January 20, 2026 at 11:15 AM
Reposted by Debora Nozza
New year, new job? If that is your current mantra, check the open postdoc positions with Debora Nozza and me at our lab. Deadline is January 31st.

milanlproc.github.io/open_positio...
Postdoctoral Researcher – NLP (2 positions) | MilaNLP Lab @ Bocconi University
Two Postdoctoral Researcher positions – Deadline January 31st, 2026
milanlproc.github.io
January 19, 2026 at 4:13 PM
Reposted by Debora Nozza
We're also back with the lab's seminar! Today we had Eleonora Mancini presenting her doctoral research "Multimodal AI for Human Expression Understanding".

#NLP #multimodality #speech
January 16, 2026 at 4:47 PM
Reposted by Debora Nozza
Holidays over, reading group resumes 📖
Today Henning Hoffmann presented the paper "Music for All: Representational Bias and Cross-Cultural Adaptability of Music Generation Models"

Paper: arxiv.org/pdf/2502.07328

#NLProc
January 15, 2026 at 11:50 AM
Reposted by Debora Nozza
#MemoryMonday #NLProc "Countering Hateful and Offensive Speech Online - Open Challenges" by Plaza-del-Arco, @debora_nozza, Guerini, Sorensen, Zampieri (2024) is a tutorial on the challenges and solutions for detecting and mitigating hate speech.
Countering Hateful and Offensive Speech Online - Open Challenges
Flor Miriam Plaza-del-Arco, Debora Nozza, Marco Guerini, Jeffrey Sorensen, Marcos Zampieri. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts.…
aclanthology.org
December 22, 2025 at 4:03 PM
Reposted by Debora Nozza
For today's reading group, Serena Pugliese presented the paper "Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models" by Piercosma Bisconti et al. (2025).

Paper: arxiv.org/pdf/2511.15304

#NLProc
#LLMs #jailbreaking
December 18, 2025 at 11:32 AM
Reposted by Debora Nozza
🚀 We’re opening 2 fully funded postdoc positions in #NLP!

Join the MilaNLP team and contribute to our upcoming research projects.

🔗 More details: milanlproc.github.io/open_positio...

⏰ Deadline: Jan 31, 2026
December 18, 2025 at 3:29 PM
Reposted by Debora Nozza
#TBT #NLProc #MachineLearning #SafetyFirst 'Safety-Tuned LLaMAs: Improving LLM Safety' by Bianchi et al. explores training LLMs for safe refusals and warns of over-tuning.
Safety-Tuned LLaMAs: Lessons From Improving the Safety of Large...
Training large language models to follow instructions makes them perform better on a wide range of tasks and generally become more helpful. However, a perfectly helpful model will follow even the most...
arxiv.org
December 18, 2025 at 4:02 PM
Reposted by Debora Nozza
Huge thanks to our speakers at last Friday’s lab seminar!
🗣️ @penzo-nicolo.bsky.social on multi-party conversations
🌍 @patriciachiril.bsky.social on NLP for socially grounded research

#NLProc
December 16, 2025 at 3:43 PM
Reposted by Debora Nozza
#MemoryMonday #NLProc Uma, A. N. et al. examine AI model training in 'Learning from Disagreement: A Survey'. Disagreement-handling methods' performance is shaped by evaluation methods & dataset traits.
jair.org
December 15, 2025 at 4:02 PM
Reposted by Debora Nozza
At today’s lab reading group @carolin-holtermann.bsky.social presented ‘Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs’ by @angelinawang.bsky.social et al. (2025).
Lots to think about in how we evaluate fairness in language models!

#NLProc #fairness #LLMs
December 11, 2025 at 11:55 AM
Reposted by Debora Nozza
#TBT #NLProc 'Respectful or Toxic?' by Plaza-del-Arco, @debora & @dirkhovy.bsky.social (2023) explores zero-shot learning for multilingual hate speech detection. Highlights prompt & model choice for accuracy. #AI #LanguageModels #HateSpeechDetection
Respectful or Toxic? Using Zero-Shot Learning with Language Models to Detect Hate Speech
Flor Miriam Plaza-del-arco, Debora Nozza, Dirk Hovy. The 7th Workshop on Online Abuse and Harms (WOAH). 2023.
aclanthology.org
December 11, 2025 at 4:03 PM
Reposted by Debora Nozza
For our weekly reading group last week, @a-lauscher.bsky.social presented the paper "Shape it Up! Restoring LLM Safety during Finetuning" by ShengYun Peng et al. (2025).

#NLProc
December 9, 2025 at 10:24 AM
Reposted by Debora Nozza
#MemoryMonday #NLProc 'Leveraging Social Interactions to Detect Misinformation on Social Media' by Fornaciari et al. (2023) uses combined text and network analysis to spot unreliable threads.
arxiv.org
December 8, 2025 at 4:03 PM