Manfred Diaz
@manfreddiaz.bsky.social
2.3K followers 750 following 39 posts
Ph.D. Candidate at Mila and the University of Montreal, interested in AI/ML connections with economics, game theory, and social choice theory. https://manfreddiaz.github.io
manfreddiaz.bsky.social
It hasn't disappointed thus far!
manfreddiaz.bsky.social
I was following this one during the COVID pandemic, but it has been inactive for quite some time. The original talks' recordings are amazing, though!
manfreddiaz.bsky.social
Yeah, it's been a busy period for all of us simultaneously! I have also been pretty busy with thesis/job search. Hopefully, it will be back up and running in the Fall term!
manfreddiaz.bsky.social
@aamasconf.bsky.social 2025 was very special for us! We had the opportunity to present a tutorial on general evaluation of AI agents, and we got a best paper award! Congrats, @sharky6000.bsky.social and the team! 🎉
sharky6000.bsky.social
That's a wrap for day #4 @aamasconf.bsky.social. I did not present anything today, but I am honored that we received the best paper award!

Thanks to everyone who made it happen! 👇 1/2
Reposted by Manfred Diaz
jzleibo.bsky.social
First LessWrong post! Inspired by Richard Rorty, we argue for a different view of AI alignment, where the goal is "more like sewing together a very large, elaborate, polychrome quilt", than it is "like getting a clearer vision of something true and deep"
www.lesswrong.com/posts/S8KYwt...
Societal and technological progress as sewing an ever-growing, ever-changing, patchy, and polychrome quilt — LessWrong
We can just drop the axiom of rational convergence.
www.lesswrong.com
manfreddiaz.bsky.social
The quality of London's museums is just amazing! Enjoy!
Reposted by Manfred Diaz
sharky6000.bsky.social
Our new evaluation method, Soft Condorcet Optimization, is now available open-source! 👍

Both the sigmoid (smooth Kendall-tau) and Fenchel-Young (perturbed optimizers) versions.

Also, an optimized C++ implementation that is ~40X faster than the Python one. 🤩⚡

github.com/google-deepm...
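
For intuition, here is a minimal, hypothetical Python sketch of the sigmoid (smooth Kendall-tau) idea mentioned above: the hard count of discordant pairs is relaxed with a sigmoid so it can be minimized by gradient descent. Function and parameter names are illustrative only, not the open-source repo's API.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def soft_kendall_tau(ratings, votes, temperature=1.0):
    """Sigmoid-relaxed count of discordant pairs: each observed
    (winner, loser) vote contributes ~1 when the loser is rated above
    the winner and ~0 otherwise."""
    return sum(
        sigmoid((ratings[loser] - ratings[winner]) / temperature)
        for winner, loser in votes
    )

def gradient_step(ratings, votes, lr=0.5, temperature=1.0):
    """One plain gradient-descent step on the smoothed objective."""
    grad = np.zeros_like(ratings)
    for winner, loser in votes:
        s = sigmoid((ratings[loser] - ratings[winner]) / temperature)
        g = s * (1.0 - s) / temperature  # derivative of the sigmoid wrt its argument
        grad[loser] += g
        grad[winner] -= g
    return ratings - lr * grad

# Toy example: agent 0 beats 1 and 2, agent 1 beats 2.
ratings = np.zeros(3)
votes = [(0, 1), (0, 2), (1, 2)]
for _ in range(500):
    ratings = gradient_step(ratings, votes)
print(np.argsort(-ratings))  # expected ranking: [0, 1, 2]
```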
Reposted by Manfred Diaz
sharky6000.bsky.social
Working at the intersection of social choice and learning algorithms?

Check out the 2nd Workshop on Social Choice and Learning Algorithms (SCaLA) at @ijcai.bsky.social this summer.

Submission deadline: May 9th.

I attended last year at AAMAS and loved it! 👍

sites.google.com/corp/view/sc...
SCaLA-25
A workshop connecting research topics in social choice and learning algorithms.
sites.google.com
manfreddiaz.bsky.social
The AAMAS website might be a good reference for this, but it may not be; I'm uncertain atm.
manfreddiaz.bsky.social
Come to understand ML evaluation from first principles! We have put together a great AAMAS tutorial covering statistics, probabilistic models, game theory, and social choice theory.

Bonus: a unifying perspective of the problem leveraging decision-theoretic principles!

Join us on May 19th!
manfreddiaz.bsky.social
Re #2: The key finding there is that the stationary points of SCO contain the margin matrix but, as I said in the note, there is still more work to do!
manfreddiaz.bsky.social
Thanks! I have been meaning to update the manuscript so it stands alone without the main paper, but instead I may change the content to a different format 😉. Coming soon!
manfreddiaz.bsky.social
Ah, I see the confusion... I never used the "identically distributed assumption," only the independence assumption (from 8 to 9).
manfreddiaz.bsky.social
I'm not sure if I understood your question correctly, but yes? As the post you shared says, "Voila! We have shown that minimizing the KL divergence amounts to finding the maximum likelihood estimate of θ." Maybe I am missing your point 😬
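
For reference, the standard derivation behind that quote, written in generic notation (with \hat{p}_{\text{data}} the empirical distribution of N samples):

```latex
\begin{align*}
\arg\min_{\theta}\, \mathrm{KL}\!\left(\hat{p}_{\text{data}} \,\middle\|\, p_{\theta}\right)
  &= \arg\min_{\theta}\, \mathbb{E}_{x \sim \hat{p}_{\text{data}}}\!\left[\log \hat{p}_{\text{data}}(x) - \log p_{\theta}(x)\right] \\
  &= \arg\max_{\theta}\, \mathbb{E}_{x \sim \hat{p}_{\text{data}}}\!\left[\log p_{\theta}(x)\right]
     && \text{(the entropy term does not depend on } \theta\text{)} \\
  &= \arg\max_{\theta}\, \frac{1}{N} \sum_{n=1}^{N} \log p_{\theta}(x_n)
   \;=\; \hat{\theta}_{\mathrm{MLE}}.
\end{align*}
```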
manfreddiaz.bsky.social
Elo drives most LLM evaluations, but we often overlook its assumptions, benefits, and limitations. While working on SCO, we wanted to understand the SCO-Elo distinction, so I took a closer look, uncovered some intriguing findings, and documented them in these notes. I hope you find them valuable!
sharky6000.bsky.social
Btw, if you stare at the derivation of Elo as logistic regression, SCO is really quite close to Elo. The difference is that Elo uses a classification objective (cross entropy loss) on top of the output of the sigmoid.

@manfreddiaz.bsky.social dug even deeper: manfreddiaz.github.io/assets/pdf/s...
manfreddiaz.github.io
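
A rough Python sketch of that contrast (illustrative names only, not the paper's notation): Elo's per-game loss is a cross-entropy on top of the sigmoid win probability, while an SCO-style per-vote loss is the sigmoid itself, used as a smooth discordance indicator.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def elo_loss(ratings, winner, loser):
    """Logistic-regression view of Elo: cross-entropy (negative
    log-likelihood) of the observed outcome under sigmoid(r_w - r_l)."""
    return -np.log(sigmoid(ratings[winner] - ratings[loser]))

def sco_loss(ratings, winner, loser):
    """SCO-style term: the sigmoid itself, i.e. a smoothed indicator of
    the pair being discordant with the observed outcome."""
    return sigmoid(ratings[loser] - ratings[winner])

# Both losses shrink as the winner's rating pulls ahead of the loser's,
# but they penalize mistakes differently: the cross-entropy grows without
# bound for confident wrong predictions, while the sigmoid saturates at 1.
r = np.array([1.0, 0.0])
print(elo_loss(r, 0, 1), sco_loss(r, 0, 1))
```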
Reposted by Manfred Diaz
sharky6000.bsky.social
Looking for a principled evaluation method for ranking of *general* agents or models, i.e., ones that get evaluated across a myriad of different tasks?

I’m delighted to tell you about our new paper, Soft Condorcet Optimization (SCO) for Ranking of General Agents, to be presented at AAMAS 2025! 🧵 1/N
manfreddiaz.bsky.social
I had the convexity results for the online pairwise update (Section B.1.1.1) in my notes (manfreddiaz.github.io/assets/pdf/s...), but it is not clear to me if they hold for the other non-online settings. Worth taking a more detailed pass over the paper!
manfreddiaz.github.io
manfreddiaz.bsky.social
That's a nice finding, @sacha2.bsky.social! @sharky6000.bsky.social I skimmed it, and it seems neat! There is an important distinction, though: they work in the "online" Elo regime, departing from the traditional gradient/batch gradient-descent updates (e.g., FIDE doesn't use online updates).
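
To make that distinction concrete, here is a hypothetical Python sketch of the two regimes for fitting ratings under the same logistic model: an online rule that updates immediately after each game (so game order matters) versus a single batch gradient-descent step over all games at once. Details are illustrative, not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def online_updates(ratings, games, k=0.1):
    """Online regime: update after each (winner, loser) game; the final
    ratings depend on the order in which games arrive."""
    ratings = ratings.copy()
    for winner, loser in games:
        p_win = sigmoid(ratings[winner] - ratings[loser])
        ratings[winner] += k * (1.0 - p_win)
        ratings[loser] -= k * (1.0 - p_win)
    return ratings

def batch_gradient_step(ratings, games, lr=0.1):
    """Batch regime: one gradient-descent step on the summed
    cross-entropy loss over all games, evaluated at the current ratings."""
    grad = np.zeros_like(ratings)
    for winner, loser in games:
        p_win = sigmoid(ratings[winner] - ratings[loser])
        grad[winner] -= 1.0 - p_win
        grad[loser] += 1.0 - p_win
    return ratings - lr * grad
```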