Manuel Gomez Rodriguez
@autreche.bsky.social
87 followers 74 following 9 posts
Human-Centric Machine Learning at the Max Planck Institute for Software Systems
Reposted by Manuel Gomez Rodriguez
jesfrellsen.bsky.social
Decisions are out for EurIPS 2025 Workshops 🎉: eurips.cc/workshops

As workshop chairs with Yingzhen Li, @autreche.bsky.social and @witzner.bsky.social, we are excited to share the workshop program and look forward to seeing the community in December!
Reposted by Manuel Gomez Rodriguez
stratiss.bsky.social
Last week I had the pleasure of presenting a 2.5-hour tutorial on "Counterfactuals in Minds and Machines" at UAI 2025 in Rio 🇧🇷, prepared together with @autreche.bsky.social and @tobigerstenberg.bsky.social. We've made all materials and references available here: learning.mpi-sws.org/counterfactu...
Reposted by Manuel Gomez Rodriguez
euripsconf.bsky.social
EurIPS includes a call for both Workshops and Affinity Workshops!
We look forward to making #EurIPS a diverse and inclusive event with you.

The submission deadlines are August 22nd, AoE.

More information at:
eurips.cc/call-for-wor...
eurips.cc/call-for-aff...
Reposted by Manuel Gomez Rodriguez
lasha.bsky.social
📣 Life update: Thrilled to announce that I’ll be starting as faculty at the Max Planck Institute for Software Systems this Fall!

I’ll be recruiting PhD students in the upcoming cycle, as well as research interns throughout the year: lasharavichander.github.io/contact.html
Kaiserslautern, Germany
Reposted by Manuel Gomez Rodriguez
stratiss.bsky.social
In Athens 🇬🇷 for the Greeks in AI symposium. Super excited to present our work on "Counterfactual Token Generation in LLMs" (bit.ly/4nMibs2) and see all the amazing work Greek people all over the world are doing on AI! If you are in Athens, let's meet! Next, heading to👇
Reposted by Manuel Gomez Rodriguez
stratiss.bsky.social
Heading to Rio de Janeiro 🇧🇷 for UAI 2025 (@auai.org) to present our tutorial with @tobigerstenberg.bsky.social and @autreche.bsky.social on "Counterfactuals in Minds and Machines" on Monday. Looking forward to this! If you are in Rio, let's meet!
Reposted by Manuel Gomez Rodriguez
ellis.eu
ELLIS @ellis.eu · Jul 16
📢 Present your NeurIPS paper in Europe!

Join EurIPS 2025 + ELLIS UnConference in Copenhagen for in-person talks, posters, workshops and more. Registration opens soon; save the date:

📅 Dec 2–7, 2025
📍 Copenhagen 🇩🇰
🔗 eurips.cc

#EurIPS
@euripsconf.bsky.social
Reposted by Manuel Gomez Rodriguez
stratiss.bsky.social
The LLM API you use returns (and charges you for) 5 tokens. Did the LLM actually generate 5 tokens? Or is the provider overcharging you? 🤔 In arxiv.org/abs/2505.21627, led by Ander Artola Velasco, we argue (game-theoretically) for a change from pay-per-token to pay-per-character.
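To illustrate the concern in the post above, here is a minimal sketch (not taken from the paper) of why a token count is harder for a user to audit than a character count. The token lists and the `billed_units` helper are hypothetical; the sketch only assumes that the same output string can be split into tokens in more than one way.

```python
def billed_units(tokens):
    """Return the billable quantity under each pricing scheme for one response."""
    text = "".join(tokens)
    return {"tokens": len(tokens), "characters": len(text)}


# Two hypothetical tokenizations of the same visible output string.
short_split = ["hello", " world"]
long_split = ["he", "llo", " wor", "ld"]

# The user sees the same text in both cases...
same_text = "".join(short_split) == "".join(long_split)

# ...but the reported token count differs, while the character count does not.
short_bill = billed_units(short_split)
long_bill = billed_units(long_split)
```

The user can recompute the character count from the text they receive, whereas the token count depends on a segmentation they do not observe directly.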
Reposted by Manuel Gomez Rodriguez
stratiss.bsky.social
In Singapore for #ICLR2025! I'll be presenting our work on a causal methodology for evaluating LLMs (arxiv.org/abs/2502.01754) at the "Building Trust in LLMs" workshop on Monday. If you are working on causality, game theory and/or LLMs, let's grab a ☕️ during the conference!
Reposted by Manuel Gomez Rodriguez
neuripsconf.bsky.social
The NeurIPS Call for Workshops is now live. Proposals are due May 30 AoE, with acceptance notification on July 4 AoE. neurips.cc/Conferences/...

If you plan to submit a proposal for a workshop, please read our detailed guidance in our new blog post: blog.neurips.cc/2025/04/12/g...
Call For Workshops 2025
Sat Dec 6 and Sun Dec 7, 2025
neurips.cc
Reposted by Manuel Gomez Rodriguez
mtoneva.bsky.social
Together with @autreche.bsky.social, Adish Singla, Krishna Gummadi, Goran Radanovic and Nina Grgić-Hlača, we have multiple open postdoc positions in AI, Computing, and Society at the MPI for Software Systems!

Apply by May 13 via the new Max Planck Postdoc Program!
www.mpg.de/en/max-planc...
Max Planck Postdoc Program
Launching in April 2025, the Max Planck Postdoc Program features structured application calls and a comprehensive support system for researchers.
www.mpg.de
autreche.bsky.social
In “AI for modelling infectious disease epidemics”, just published in Nature, we discuss how AI can help us in future pandemics.

This is joint work with many colleagues, led by Moritz U G Kraemer and Samir Bhatt!

www.nature.com/articles/s41...

You can read it for free here: t.co/s4YjjmTNOp
Artificial intelligence for modelling infectious disease epidemics - Nature
This Perspective considers the application to infectious disease modelling of AI systems that combine machine learning, computational statistics, information retrieval and data science.
www.nature.com
Reposted by Manuel Gomez Rodriguez
jugander.bsky.social
Obama, speaking in October 2016: "Government will never run the way Silicon Valley runs because, by definition, democracy is messy. And part of government's job, by the way, is dealing with problems that nobody else wants to deal with." 1/6 youtu.be/BikQFWNYct4?...
White House Frontiers Conference
YouTube video by The Obama White House
youtu.be
autreche.bsky.social
Check out an implementation of our model on several LLMs from the Llama family at github.com/Networks-Lea....

This has been a joint effort with multiple members of my group: Nina Corvelo Benz, Stratis Tsirtsis, Eleni Straitouri, Ivi Chatzi, Ander Artola Velasco & Suhas Thejaswi.
GitHub - Networks-Learning/coupled-llm-evaluation: Code for "Evaluation of Large Language Models via Coupled Token Generation", Arxiv 2025.
github.com
autreche.bsky.social
This suggests that the apparent advantage of an LLM over others in existing evaluation protocols may not be genuine but rather confounded by the randomness inherent to the generation process. Our model is easy to implement and does not require any finetuning/prompt engineering 5/
autreche.bsky.social
On evaluations based on (human) pairwise comparisons, we show that coupled and standard autoregressive generation can surprisingly lead to different rankings when comparing more than two LLMs, even with an infinite number of samples 4/
autreche.bsky.social
On evaluations on benchmark datasets, we show that coupled autoregressive generation leads to the same conclusions as standard autoregressive generation but using provably fewer samples. For example, on MMLU, coupled autoregressive generation requires up to 40% fewer samples 3/
autreche.bsky.social
Our key idea is to couple the autoregressive processes of a set of LLMs under comparison, particularly their samplers, by means of sharing the same source of randomness. Loosely speaking, coupled autoregressive generation ensures that no LLM will have better luck than others 2/
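The idea of sharing one source of randomness across samplers can be sketched with toy next-token distributions. This is an illustrative sketch only, not the code from the linked repository: it assumes a Gumbel-max sampler as one possible way to couple the draws, and `model_a`, `model_b`, `VOCAB_SIZE` and `coupled_generate` are hypothetical names standing in for real LLMs and their vocabulary.

```python
import math
import random

VOCAB_SIZE = 4  # hypothetical toy vocabulary


def model_a(prefix):
    # Hypothetical stand-in for an LLM: next-token probabilities given a prefix.
    return [0.4, 0.3, 0.2, 0.1]


def model_b(prefix):
    # A second toy model with a slightly different distribution.
    return [0.35, 0.35, 0.2, 0.1]


def gumbel_noise(rng, n):
    # n independent standard Gumbel draws, -log(-log(U)), with U clamped away from 0 and 1.
    draws = []
    for _ in range(n):
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        draws.append(-math.log(-math.log(u)))
    return draws


def sample_token(probs, noise):
    # Gumbel-max trick: argmax_i (log p_i + g_i) is distributed according to probs.
    scores = [math.log(p) + g if p > 0 else -math.inf for p, g in zip(probs, noise)]
    return max(range(len(probs)), key=scores.__getitem__)


def coupled_generate(models, steps, seed):
    # At every step, all models consume the SAME noise vector (the shared randomness).
    rng = random.Random(seed)
    outputs = [[] for _ in models]
    for _ in range(steps):
        noise = gumbel_noise(rng, VOCAB_SIZE)
        for model, out in zip(models, outputs):
            out.append(sample_token(model(out), noise))
    return outputs
```

Under this coupling, two identical models always emit identical sequences, so any difference between outputs reflects a difference between the models rather than one of them drawing luckier noise.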
autreche.bsky.social
LLMs rely on randomization to respond to a prompt: they may respond differently to the same prompt if asked multiple times. In “Evaluation of LLMs via Coupled Token Generation” (arxiv.org/abs/2502.01754), we argue that the eval of LLMs should control for this randomization 1/
Evaluation of Large Language Models via Coupled Token Generation
State of the art large language models rely on randomization to respond to a prompt. As an immediate consequence, a model may respond differently to the same prompt if asked multiple times. In this wo...
arxiv.org
Reposted by Manuel Gomez Rodriguez
jennwv.bsky.social
The FATE group at @msftresearch.bsky.social NYC is accepting applications for 2025 interns. 🥳🎉

For full consideration, apply by 12/18.

jobs.careers.microsoft.com/global/en/jo...

Interested in AI evaluation? Apply for the STAC internship too!

jobs.careers.microsoft.com/global/en/jo...
autreche.bsky.social
Very excited and grateful to receive an ERC Consolidator grant #ERCCoG on Counterfactuals in Minds and Machines! Are you interested in a postdoc or Ph.D.? Please get in touch by email!
learning.mpi-sws.org
Reposted by Manuel Gomez Rodriguez
stratiss.bsky.social
What would an LLM have said, counterfactually? Here is a short video illustrating our method for counterfactual token generation. We will present this work at the CaLM workshop at #neurips2024. See you in Vancouver!
📜 arxiv.org/abs/2409.17027
💻 made with manim in python
autreche.bsky.social
Tenure-track openings at MPIs in all areas of CS and its intersection with other disciplines, deadline Dec 1st. I cannot think of better positions in Europe for young researchers who want to start a group! Ping me if you want to hear my personal experience! apply.cis.mpg.de/register/ttf...
Register - Application System - Computer and Information Science @ Max Planck Society
apply.cis.mpg.de