Paolo Papotti
@papotti.bsky.social
Associate Prof at EURECOM and 3IA Côte d'Azur Chair of Artificial Intelligence. ELLIS member.
Data management and NLP/LLMs for information quality.
https://www.eurecom.fr/~papotti/
We introduce Stretto:
- Query planning as constrained optimization over quality constraints and a cost objective
- Gradient-based optimization to jointly choose operators and allocate error budgets across pipelines (toy sketch below)
- KV-cache–based operators to turn discrete physical choices into a runtime-quality continuum
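A minimal sketch of the error-budget idea, not Stretto's actual formulation: the operator cost and pipeline quality curves below are made-up placeholders, and the quality constraint is handled with a simple penalty term.

```python
# Toy sketch (assumptions: made-up cost/quality curves, penalty-based constraint)
import torch

def op_cost(budget):
    # Assumed placeholder: an operator gets cheaper as more error is tolerated
    return 1.0 / (budget + 0.05)

def pipeline_quality(budgets):
    # Assumed placeholder: pipeline quality drops with the total error allowed
    return 1.0 - budgets.sum()

target_quality = 0.9
budgets = torch.full((3,), 0.02, requires_grad=True)  # one error budget per operator
opt = torch.optim.Adam([budgets], lr=0.01)

for _ in range(500):
    cost = op_cost(budgets).sum()
    # Soft constraint: penalize falling below the quality target
    violation = torch.relu(target_quality - pipeline_quality(budgets))
    loss = cost + 100.0 * violation
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        budgets.clamp_(1e-4, 0.2)  # keep budgets in a sensible range

print(budgets.detach())  # allocation that trades runtime cost against quality
```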
February 11, 2026 at 7:45 AM
Co-authors: Gabriele Sanmartino, Matthias Urban, Paolo Papotti, Carsten Binnig

This is the first outcome of our collaboration with Technische Universität Darmstadt within the @agencerecherche.bsky.social / @dfg.de ANR/DFG #Magiq project - more to come!
February 11, 2026 at 7:45 AM
Empirically, Stretto delivers 2x-10x faster execution 🔥 across various datasets and queries compared to prior systems that meet quality guarantees.
February 11, 2026 at 7:45 AM
Happy Fontaines D.C. fan since their latest album (2024). But the real treat was discovering the previous ones!
February 1, 2026 at 8:22 PM
I'd also like to test it, thanks!
January 23, 2026 at 7:46 AM
I agree. Here is another trick for the input context that we recently published:
bsky.app/profile/papo...
🛑 𝐒𝐭𝐨𝐩 𝐭𝐡𝐫𝐨𝐰𝐢𝐧𝐠 𝐚𝐰𝐚𝐲 𝐲𝐨𝐮𝐫 𝐫𝐞𝐭𝐫𝐢𝐞𝐯𝐚𝐥 𝐬𝐜𝐨𝐫𝐞𝐬.
RAG uses embedding scores to pick the Top-K, then treats all retrieved chunks as equal.
Parallel Context-of-Experts Decoding (PCED) uses retrieval scores to move evidence aggregation from attention to decoding.
🚀 180× faster time-to-first-token!
Parallel Context-of-Experts Decoding for Retrieval Augmented Generation
Retrieval Augmented Generation faces a trade-off: concatenating documents in a long prompt enables multi-document reasoning but creates prefill bottlenecks, while encoding document KV caches separately...
arxiv.org
January 19, 2026 at 12:20 PM
These results point toward models that decide which retrieved document to trust, turning “context engineering” from a static prompt recipe into a dynamic decoding policy.
Amazing work from Giulio Corallo in his industrial PhD at SAP!
January 15, 2026 at 7:37 AM
Key insight: 𝐄𝐯𝐢𝐝𝐞𝐧𝐜𝐞 𝐚𝐠𝐠𝐫𝐞𝐠𝐚𝐭𝐢𝐨𝐧 𝐡𝐚𝐩𝐩𝐞𝐧𝐬 𝐚𝐭 𝐝𝐞𝐜𝐨𝐝𝐢𝐧𝐠 𝐭𝐢𝐦𝐞: the model can effectively “switch” which document drives each token - without cross-document attention!
January 15, 2026 at 7:37 AM
📈 Results: PCED often matches (and sometimes beats) long-context concatenation, while dramatically outperforming the KV-merge baseline on multi-doc QA/ICL.
🚀 Systems win: ~180× faster time-to-first-token vs long-context prefill using continuous batching and Paged Attention.
January 15, 2026 at 7:37 AM
Instead of concatenating docs into one context (slow, noisy attention), training-free PCED does the following (toy sketch after this list):
● Keeps each document as its own 𝐞𝐱𝐩𝐞𝐫𝐭 with independent KV cache
● Runs experts in 𝐩𝐚𝐫𝐚𝐥𝐥𝐞𝐥 to get logits
● Selects next token with a 𝐫𝐞𝐭𝐫𝐢𝐞𝐯𝐚𝐥-𝐚𝐰𝐚𝐫𝐞 𝐜𝐨𝐧𝐭𝐫𝐚𝐬𝐭𝐢𝐯𝐞 𝐝𝐞𝐜𝐨𝐝𝐢𝐧𝐠 rule integrating scores as a prior
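Here is a minimal, hypothetical sketch of that decoding step, assuming per-expert next-token logits are already computed from each document's own KV cache; the function name and the exact contrastive rule are my illustration, not the paper's code.

```python
# Toy sketch of a PCED-style step (illustrative, not the authors' implementation)
import torch
import torch.nn.functional as F

def pced_step(expert_logits, retrieval_scores, base_logits=None, alpha=1.0):
    """expert_logits: [n_docs, vocab] next-token logits, one row per document expert
    retrieval_scores: [n_docs] similarity scores from the retriever
    base_logits: optional [vocab] logits from a context-free run (contrastive reference)"""
    prior = F.softmax(retrieval_scores, dim=0)          # retrieval scores as a prior
    log_probs = F.log_softmax(expert_logits, dim=-1)    # per-expert distributions
    # Prior-weighted mixture of experts, computed in log space
    mixed = torch.logsumexp(log_probs + prior.log().unsqueeze(1), dim=0)
    if base_logits is not None:
        # Contrastive step: favor tokens the experts support more than the
        # context-free baseline does
        mixed = mixed - alpha * F.log_softmax(base_logits, dim=-1)
    return int(mixed.argmax())                          # next token id

# Usage with random placeholders: 3 document experts, vocab of 50k
next_id = pced_step(torch.randn(3, 50_000), torch.tensor([0.9, 0.4, 0.1]))
```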
January 15, 2026 at 7:37 AM
Kudos to my amazing co-authors Dario Satriani, Enzo Veltri, Donatello Santoro! Another great collaboration between Università degli Studi della Basilicata and EURECOM 🙌

#LLM #Factuality #Benchmark #RelationalFactQA #NLP #AI
June 2, 2025 at 2:51 PM
Structured outputs power analytics, reporting, and tool-augmented agents. This work exposes where current LLMs fall short and offers a clear tool for measuring progress on factuality beyond single-value QA. 📊
June 2, 2025 at 2:51 PM
We release a new factuality benchmark with 696 annotated natural-language questions paired with gold factual answers expressed as tables (avg. 27 rows × 5 attributes), spanning 9 knowledge domains, with controlled question complexity and rich metadata.
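For illustration only, here is what an item in this style could look like; field names and values are invented, not taken from the released dataset.

```python
# Hypothetical RelationalFactQA-style item (schema and values are made up)
example_item = {
    "question": "List the Nordic countries with their capital cities.",
    "domain": "geography",                  # one of the 9 knowledge domains
    "gold_answer": {                        # the factual answer is a table
        "columns": ["country", "capital"],
        "rows": [
            ["Denmark", "Copenhagen"],
            ["Finland", "Helsinki"],
            ["Iceland", "Reykjavík"],
            ["Norway", "Oslo"],
            ["Sweden", "Stockholm"],
        ],
    },
    "metadata": {"n_rows": 5, "n_attributes": 2, "complexity": "low"},
}
```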
June 2, 2025 at 2:51 PM
Our new paper, "RelationalFactQA: A Benchmark for Evaluating Tabular Fact Retrieval from Large Language Models", measures exactly this gap.

Wider or longer output tables = tougher for all LLMs! 🧨
From Llama 3 and Qwen to GPT-4, no LLM goes above 25% accuracy on our stricter measure.
June 2, 2025 at 2:51 PM
and a special thanks to @tanmoy-chak.bsky.social for leading this effort!
June 1, 2025 at 8:43 AM
It’s time we rethink how "facts" are negotiated in the age of platforms.

Excited to hear your thoughts!
#Misinformation #FactChecking #SocialMedia #Epistemology #HCI #DigitalTruth #CommunityNotes

arxiv.org/pdf/2505.20067
June 1, 2025 at 7:48 AM
Community-based moderation offers speed & scale, but also raises tough questions:
– Can crowds overcome bias?
– What counts as evidence?
– Who holds epistemic authority?

Our interdisciplinary analysis combines perspectives from HCI, media studies, & digital governance.
June 1, 2025 at 7:48 AM
Platforms like X are outsourcing fact-checking to users via tools like Community Notes. But what does this mean for truth online?

We argue this isn’t just a technical shift — it’s an epistemological transformation. Who gets to define what's true when everyone is the fact-checker?
June 1, 2025 at 7:48 AM