Marc Lanctot
@sharky6000.bsky.social
8.3K followers 410 following 1.5K posts
Research Scientist at Google DeepMind, interested in multiagent reinforcement learning, game theory, games, and search/planning. Lover of Linux 🐧, coffee ☕, and retro gaming. Big fan of open-source. #gohabsgo 🇨🇦 For more info: https://linktr.ee/sharky6000
Pinned
sharky6000.bsky.social
Looking for a principled evaluation method for ranking of *general* agents or models, i.e. that get evaluated across a myriad of different tasks?

I’m delighted to tell you about our new paper, Soft Condorcet Optimization (SCO) for Ranking of General Agents, to be presented at AAMAS 2025! 🧵 1/N
sharky6000.bsky.social
Are you able to look up by title and venue? It is an article from today at The Globe and Mail titled "Is there an AI bubble? Financial institutions sound a warning"
sharky6000.bsky.social
+1. I think if there are issues with QuickTime it might be hardware related (e.g. is it an older Mac?) or there is too little space on device, maybe? I have never had issues but the files end up being ridiculously large.
Reposted by Marc Lanctot
simonaliao.bsky.social
Hi everyone, I am excited to share our large-scale survey study with 800+ researchers, which reveals researchers’ usage and perceptions of LLMs as research tools, and how the usage and perceptions differ based on demographics.

See results in comments!

🔗 Arxiv link: arxiv.org/abs/2411.05025
LLMs as Research Tools: A Large Scale Survey of Researchers' Usage and Perceptions
The rise of large language models (LLMs) has led many researchers to consider their usage for scientific work. Some have found benefits using LLMs to augment or automate aspects of their research pipe...
arxiv.org
sharky6000.bsky.social
Thanks for your opinion and for your optimism! You are certainly not a paranoid android 😜
sharky6000.bsky.social
The circular financing is of course Oracle's deal with OpenAI to the tune of two thirds of Denmark's GDP (which they don't have.. but are projected to make in the coming years).

www.fabricatedknowledge.com/p/oracle-and...
Oracle and Animal Spirits
Oracle's result is the second most impactful earnings result this cycle. And I think it's a critical turning point.
www.fabricatedknowledge.com
sharky6000.bsky.social
I asked Gemini and it had a pretty balanced answer 😅

g.co/gemini/share...

@void.comind.network what do you think? Are we in an AI bubble or not?
‎Gemini - AI Bubble: Bubble or Boom?
Created with Gemini
g.co
sharky6000.bsky.social
Ladies, gents,

State of the art multiagent AI

🫠
bootsmcgoot.bsky.social
"i just use it to generate ideas"
sharky6000.bsky.social
Nice catch! 👍

(That's what I look like when doing research 🧙‍♂️)
sharky6000.bsky.social
Haha, yes. I didn't mean to imply otherwise. I agree there is too much USpol here.

My question was a legit one, though. I have not logged into X in over a year, so I have no idea how it is over there.

I am hoping that we are doing better here than there.
sharky6000.bsky.social
Yeah but not worse than Twitter, though, is it?
sharky6000.bsky.social
So I have seen very little of the pitchforking of the last few days on Bluesky itself, maybe because I mainly use Following now and only move to Discover when I run out of content. 🤔

But on Reddit, holy hell /r/BlueskySocial has been 🔥🤯🤬
Reposted by Marc Lanctot
hadihoss.bsky.social
When we ask large language models to make or recommend decisions, who gets resources, opportunities, or aid, whose values are they representing?

A short🧵on our new #NeurIPS2025 paper: “Distributive Fairness in Large Language Models: Evaluating Alignment with Human Values.”
sharky6000.bsky.social
I see only three messages from Jay. The rest of the thread seems to be the usual angry mob. Am I missing something?
sharky6000.bsky.social
A bit vague, though?

Like what does that even mean?
sharky6000.bsky.social
BTW I have started listening back now, starting with Hold Your Color.

I really like the start of that album. I might have to make a short with the first 90 seconds of Slam... it is 🤯

youtu.be/usoHqGqegZc?...
Slam
YouTube video by Pendulum - Topic
youtu.be
sharky6000.bsky.social
It's been a while, time for another retro gaming short! 😀

This one is Emerald Mine: Level 7.

I played this a lot in co-op with my dad when I was young, on our Commodore Amiga 500. Such a great classic game.

#retrogaming #commodore

www.youtube.com/shorts/U8Ii7...
Emerald Mine: Level 7 #retrogaming with Rocks n' Diamonds to the tune of Pendulum - Nothing for Free
YouTube video by Marc Lanctot
www.youtube.com
Reposted by Marc Lanctot
kjha02.bsky.social
Forget modeling every belief and goal! What if we represented people as following simple scripts instead (i.e "cross the crosswalk")?

Our new paper shows AI which models others’ minds as Python code 💻 can quickly and accurately predict human behavior!

shorturl.at/siUYI%F0%9F%...
Reposted by Marc Lanctot
natashajaques.bsky.social
Instead of behavior cloning, what if you asked an LLM to write code to describe how an agent was acting, and used this to predict their future behavior?

Our new paper "Modeling Others' Minds as Code" shows this outperforms BC by 2x, and reaches human-level performance in predicting human behavior.
sharky6000.bsky.social
Didn't OpenAI release one a few days before GPT-5? Is it too small?