Cesare
@cesare-spinoso.bsky.social
25 followers 30 following 7 posts
Hello! I'm Cesare (pronounced Chez-array). I'm a PhD student at McGill/Mila working in NLP/computational pragmatics. @mcgill-nlp.bsky.social @mila-quebec.bsky.social https://cesare-spinoso.github.io/
Reposted by Cesare
abdelzayed.bsky.social
A new paper accepted at @colmweb.org COLM 2025! I led a group of 3 brilliant students in a deep dive into the problem of discrimination in language models. We discovered that models that make racist decisions don't always have biased thoughts!
Reposted by Cesare
grvkamath.bsky.social
Our new paper in #PNAS (bit.ly/4fcWfma) presents a surprising finding—when words change meaning, older speakers rapidly adopt the new usage; inter-generational differences are often minor.

w/ Michelle Yang, @sivareddyg.bsky.social, @msonderegger.bsky.social and @dallascard.bsky.social 👇 (1/12)
Reposted by Cesare
ziling-cheng.bsky.social
What do systematic hallucinations in LLMs tell us about their generalization abilities?

Come to our poster at #ACL2025 on July 29th at 4 PM in Level 0, Halls X4/X5. Would love to chat about interpretability, hallucinations, and reasoning :)

@mcgill-nlp.bsky.social @mila-quebec.bsky.social
cesare-spinoso.bsky.social
How can we use models of cognition to help LLMs interpret figurative language (irony, hyperbole) in a more human-like manner? Come to our #ACL2025NLP poster on Wednesday at 11AM (exhibit hall - exact location TBA) to find out! @mcgill-nlp.bsky.social @mila-quebec.bsky.social @aclmeeting.bsky.social
cesare-spinoso.bsky.social
Thanks to collaborators David Austin, Pablo Piantanida and Jackie Cheung. We also received some amazing feedback from the @mila-quebec.bsky.social @mcgill-nlp.bsky.social community! And thanks to Jennifer Hu, Justine Kao and Polina Tsvilodub for sharing their datasets.
cesare-spinoso.bsky.social
Other cool findings:
1. We prove that (RSA)^2 is more expressive than QUD-based RSA.
2. Naively applying RSA to LLMs leads to probability 𝘴𝘱𝘳𝘦𝘢𝘥𝘪𝘯𝘨, not 𝘯𝘢𝘳𝘳𝘰𝘸𝘪𝘯𝘨! Are there better ways to use RSA with LLMs?
3. What if we don't know the rhetorical strategies? We develop a clustering algorithm too!
cesare-spinoso.bsky.social
What about LLMs? We integrate LLMs within (RSA)^2 and test them on a new dataset, PragMega+. We show that LLMs augmented with (RSA)^2 produce probability distributions which are more aligned with human expectations.
cesare-spinoso.bsky.social
We test (RSA)^2 on two existing figurative language datasets: hyperbolic number expressions (e.g. “This kettle costs 1000$”) and ironic utterances about the weather (e.g. “The weather is amazing” during a Montreal blizzard). We obtain meaning distributions which are compatible with those of humans!
cesare-spinoso.bsky.social
We develop (RSA)^2: a 𝘳𝘩𝘦𝘵𝘰𝘳𝘪𝘤𝘢𝘭-𝘴𝘵𝘳𝘢𝘵𝘦𝘨𝘺-𝘢𝘸𝘢𝘳𝘦 probabilistic framework of figurative language. In (RSA)^2 one listener will interpret language literally, another will interpret language ironically, etc. These listeners are marginalized to produce a distribution over possible meanings.
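The marginalization described above can be sketched in a few lines. This is a minimal illustration of the idea only, not the paper's actual model: the listener functions, the two candidate meanings, and the strategy prior (here skewed toward irony because of the blizzard context) are all assumed values chosen for the example.

```python
def marginal_listener(utterance, listeners, strategy_prior):
    """P(meaning | utterance) = sum over strategies s of P(s) * L_s(meaning | utterance)."""
    posterior = {}
    for strategy, listener in listeners.items():
        for meaning, p in listener(utterance).items():
            posterior[meaning] = posterior.get(meaning, 0.0) + strategy_prior[strategy] * p
    return posterior

def literal_listener(utterance):
    # Takes the compliment at face value (illustrative probabilities).
    return {"great weather": 0.9, "awful weather": 0.1}

def ironic_listener(utterance):
    # Inverts the literal reading (illustrative probabilities).
    return {"great weather": 0.1, "awful weather": 0.9}

listeners = {"literal": literal_listener, "ironic": ironic_listener}
# Context (a blizzard) makes irony a priori likely -- an assumed prior.
strategy_prior = {"literal": 0.2, "ironic": 0.8}

dist = marginal_listener("The weather is amazing", listeners, strategy_prior)
# dist is a proper distribution over meanings, with most mass on "awful weather"
```

Each strategy-specific listener stays simple; the pragmatic behavior comes from weighting them by how plausible each rhetorical strategy is in context.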
cesare-spinoso.bsky.social
A blizzard is raging through Montreal when your friend says “Looks like Florida out there!” Humans easily interpret irony, while LLMs struggle with it. We propose a 𝘳𝘩𝘦𝘵𝘰𝘳𝘪𝘤𝘢𝘭-𝘴𝘵𝘳𝘢𝘵𝘦𝘨𝘺-𝘢𝘸𝘢𝘳𝘦 probabilistic framework as a solution.
Paper: arxiv.org/abs/2506.09301 to appear @ #ACL2025 (Main)
Reposted by Cesare
bennokrojer.bsky.social
Started a new podcast with @tomvergara.bsky.social !

Behind the Research of AI:
We look behind the scenes, beyond the polished papers 🧐🧪

If this sounds fun, check out our first "official" episode with the awesome Gauthier Gidel
from @mila-quebec.bsky.social :

open.spotify.com/episode/7oTc...
02 | Gauthier Gidel: Bridging Theory and Deep Learning, Vibes at Mila, and the Effects of AI on Art
Reposted by Cesare
xhluca.bsky.social
"Build the web for agents, not agents for the web"

This position paper argues that rather than forcing web agents to adapt to UIs designed for humans, we should develop a new interface optimized for web agents, which we call Agentic Web Interface (AWI).

arxiv.org/abs/2506.10953
Reposted by Cesare
badralabsi.bsky.social
New paper in Interspeech 2025 🚨
@interspeech.bsky.social

A Robust Model for Arabic Dialect Identification using Voice Conversion

Paper 📝 arxiv.org/pdf/2505.24713
Demo 🎙️ https://shorturl.at/rrMm6

#Arabic #SpeechTech #NLProc #AI #Speech #ArabicDialects #Interspeech2025 #ArabicNLP
Reposted by Cesare
ziling-cheng.bsky.social
Do LLMs hallucinate randomly? Not quite.

Our #ACL2025 (Main) paper shows that hallucinations under irrelevant contexts follow a systematic failure mode — revealing how LLMs generalize using abstract classes + context cues, albeit unreliably.

📎 Paper: arxiv.org/abs/2505.22630 1/n
Reposted by Cesare
mila-quebec.bsky.social
Congratulations to Mila members @adadtur.bsky.social , Gaurav Kamath and @sivareddyg.bsky.social for their SAC award at NAACL! Check out Ada's talk in Session I: Oral/Poster 6. Paper: arxiv.org/abs/2502.05670
Reposted by Cesare
sivareddyg.bsky.social
Ada is an undergrad and will soon be looking for PhD positions. Gaurav is a PhD student looking for intellectually stimulating internships/visiting positions. They did most of the work without much of my help. Highly recommend them. Please reach out to them if you have any positions.
Language Models Largely Exhibit Human-like Constituent Ordering Preferences
Though English sentences are typically inflexible vis-à-vis word order, constituents often show far more variability in ordering. One prominent theory presents the notion that constituent ordering is ...
Reposted by Cesare
bennokrojer.bsky.social
Great work from labmates on LLMs vs humans regarding linguistic preferences: you know when a sentence kind of feels off, e.g. "I met at the park the man". So in what ways do LLMs follow these human intuitions?
Reposted by Cesare
parishadbehnam.bsky.social
Instruction-following retrievers can efficiently and accurately search for harmful and sensitive information on the internet! 🌐💣

Retrievers need to be aligned too! 🚨🚨🚨

Work done with the wonderful Nick and @sivareddyg.bsky.social

🔗 mcgill-nlp.github.io/malicious-ir/
Thread: 🧵👇
Exploiting Instruction-Following Retrievers for Malicious Information Retrieval
Parishad BehnamGhader, Nicholas Meade, Siva Reddy
Reposted by Cesare
sivareddyg.bsky.social
How to Get Your LLM to Generate Challenging Problems for Evaluation? 🤔 Check out our CHASE recipe. A highly relevant problem given that most human-curated datasets are crushed within days.
arkil.bsky.social
Presenting ✨ 𝐂𝐇𝐀𝐒𝐄: 𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐧𝐠 𝐜𝐡𝐚𝐥𝐥𝐞𝐧𝐠𝐢𝐧𝐠 𝐬𝐲𝐧𝐭𝐡𝐞𝐭𝐢𝐜 𝐝𝐚𝐭𝐚 𝐟𝐨𝐫 𝐞𝐯𝐚𝐥𝐮𝐚𝐭𝐢𝐨𝐧 ✨

Work w/ fantastic advisors Dima Bahdanau and @sivareddyg.bsky.social

Thread 🧵:
Reposted by Cesare
fdschmidt.bsky.social
Introducing MVL-SIB, a massively multilingual vision-language benchmark for cross-modal topic matching in 205 languages!

🤔Tasks: Given images (sentences), select topically matching sentence (image).

Arxiv: arxiv.org/abs/2502.12852
HF: huggingface.co/datasets/Wue...

Details👇
Reposted by Cesare
hapylilacident.bsky.social
Y’all we won!!!!!!!!! 🇨🇦
Reposted by Cesare
liuyulu.bsky.social
The submission deadline is in less than a month! We welcome encore submissions, so consider submitting your work regardless of whether it's been accepted or not #chi2025 😉
liuyulu.bsky.social
Human-centered Evaluation and Auditing of Language models (HEAL) workshop is back for #CHI2025, with this year's special theme: “Mind the Context”! Come join us on this bridge between #HCI and #NLProc!

Workshop submission deadline: Feb 17 AoE
More info at heal-workshop.github.io.
The image includes a shortened call for participation that reads: 
"We welcome participants who work on topics related to supporting human-centered evaluation and auditing of language models. Topics of interest include, but are not limited to:
- Empirical understanding of stakeholders' needs and goals of LLM evaluation and auditing
- Human-centered evaluation and auditing methods for LLMs
- Tools, processes, and guidelines for LLM evaluation and auditing
- Discussion of regulatory measures and public policies for LLM auditing
- Ethics in LLM evaluation and auditing

Special Theme: Mind the Context. We invite authors to engage with specific contexts in LLM evaluation and auditing. This theme could involve various topics: the usage contexts of LLMs, the context of the evaluation/auditing itself, and more! The term ''context'' is purposefully left open for interpretation!

The image also includes pictures of workshop organizers, who are: Yu Lu Liu, Wesley Hanwen Deng, Michelle S. Lam, Motahhare Eslami, Juho Kim, Q. Vera Liao, Wei Xu, Jekaterina Novikova, and Ziang Xiao.
Reposted by Cesare
mcgill-nlp.bsky.social
It turns out we had even more papers at EMNLP!

Let's complete the list with three more🧵
mcgill-nlp.bsky.social
Our lab members recently presented 3 papers at @emnlpmeeting.bsky.social in Miami ☀️ 📜

From interpretability to bias/fairness and cultural understanding -> 🧵