Mehmet Mars Seven ♞
@mehmetmars7.bsky.social
2.6K followers 2K following 240 posts
Lecturer @kcl-spe.bsky.social @kingscollegelondon.bsky.social Game Theory, Econ & CS, Pol-Econ, Sport Chess ♟️ Game Theory Corner at Norway Chess Studied in Istanbul -> Paris -> Bielefeld -> Maastricht https://linktr.ee/drmehmetismail Views are my own
Pinned
mehmetmars7.bsky.social
I'm sure many names are missing, but even putting this list together took some time. Please feel free to suggest more names from any discipline, economics, computer science, political science, biology, mathematics, and so on, either via DM or by simply tagging them below

go.bsky.app/RjAhP7Z
Reposted by Mehmet Mars Seven ♞
unet.bsky.social
The UNET seminars continue!
📅 Tuesday, Oct 14, 2–3 pm
📍 Room C43 + online
Our next speaker is Mehmet Mars Seven (@mehmetmars7.bsky.social, King’s), presenting “Elo performance in sports.”
🔗 kcl.ac.uk/people/mehmet-ismail
mehmetmars7.bsky.social
AI getting better at math won't displace mathematicians; it will amplify the foxes (the wonderers like Terry Tao) more than the hedgehogs (the hyper-specialized)
Reposted by Mehmet Mars Seven ♞
hadihoss.bsky.social
When we ask large language models to make or recommend decisions about who gets resources, opportunities, or aid, whose values are they representing?

A short🧵on our new #NeurIPS2025 paper: “Distributive Fairness in Large Language Models: Evaluating Alignment with Human Values.”
mehmetmars7.bsky.social
Reminded me of the perfect coincidence in Turkey this summer: a call for rain prayers after a long drought was made just as the Meteorology Service warned of three days of heavy rain across the country

www.nber.org/system/files...
h/t Arpit Gupta
Reposted by Mehmet Mars Seven ♞
mirandayaver.bsky.social
We have officially entered the psychogenic fever stage of the academic year and overcommitment.
mehmetmars7.bsky.social
Leaving the working paper here for now (results sound intuitive at first glance), but definitely looking forward to exploring it further
arkadykonovalov.bsky.social
📢 New (economics) preprint 📢

(with Jiaxin Yu)

Excited to share our study of cooperation in the infinitely repeated Prisoner's Dilemma, where we use payoff manipulations and all kinds of methods to disentangle strategies, beliefs, and preferences: papers.ssrn.com/sol3/papers....
Reposted by Mehmet Mars Seven ♞
eugenevinitsky.bsky.social
We're finally out of stealth: percepta.ai
We're a research / engineering team working together in industries like health and logistics to ship ML tools that drastically improve productivity. If you're interested in ML and RL work that matters, come join us 😀
Percepta | A General Catalyst Transformation Company
Transforming critical institutions using applied AI. Let's harness the frontier.
percepta.ai
mehmetmars7.bsky.social
Are top chess players good at math?

via Esports World Cup
mehmetmars7.bsky.social
This stuff is real. A couple of years ago, Maastricht University got hacked, and the students and staff couldn't access the systems for some time, creating a mini chaos which was later resolved
www.bbc.co.uk/news/article...
'You'll never need to work again': Criminals offer reporter money to hack BBC
Reporter Joe Tidy was offered money if he would help cyber criminals access BBC systems.
www.bbc.co.uk
Reposted by Mehmet Mars Seven ♞
vincecrawford.bsky.social
I am pleased to announce the publication of "Expectations-based Reference-Dependence and Labor Supply: Eliciting Cabdrivers’ Expectations in the Field" (with Miao Jin, Juanjuan Meng, and Lan Yao), JEBO special issue in honor of Gary Charness. Open access here www.sciencedirect.com/science/arti...
mehmetmars7.bsky.social
Perhaps half of the papers I was asked to referee this year wouldn’t have passed the desk before LLMs, but now they look OK at first. It takes much more time to tell whether they really should pass the desk. I think this makes the editors’ job harder.
mehmetmars7.bsky.social
Best of luck with the new challenge!
mehmetmars7.bsky.social
Somewhat funny: before ChatGPT, I had never seen a completely made-up reference. It is the kind of error only an LLM could make.
mehmetmars7.bsky.social
Colleagues in higher ed: how is your institution responding to AI use?
Are there university-wide initiatives for in-person exams, oral exams, and/or AI detection policies?
Is the pace of AI so fast that no single uni can realistically keep up?
Do you know if universities are coordinating their policies?
mehmetmars7.bsky.social
The owners probably thought "nobody will read this"
mehmetmars7.bsky.social
3/ Both models got some basic game theory facts wrong, mixed in with some correct statements.
mehmetmars7.bsky.social
2/ Result? Both models failed. Spectacularly.
GPT-5 Pro: gave a mathy polished answer… but wrong.
Gemini Deep Think: less formal… but also wrong. 😅
mehmetmars7.bsky.social
🧵GPT-5 Pro vs Gemini Deep Think on #gametheory round 2:

1/ After GPT-5 Pro performed better on a novel question, I tried something simpler: essentially a “database-style” lookup query. The answer is known and exists in the literature: no reasoning required.
mehmetmars7.bsky.social
11/ Here's the Gemini conversation that includes the first prompt
g.co/gemini/share...
The original paper: Super-Nash Performance:
doi.org/10.1111/iere...
mehmetmars7.bsky.social
10/ With more complex game-theory problems, both models fail. I think that the overall progress is visible, so we may be inching toward a functional mathematics / game theory "engine" but nothing close to AGI or whatever.
mehmetmars7.bsky.social
9/ Reliability: Neither model is reliable enough to trust without an existing solution. Everything must be checked manually. That verification price often eats up the productivity gains. (I tested many more things.)
mehmetmars7.bsky.social
8/ Instruction-following: GPT-5 Pro: generally decent; kept formats, echoed assumptions, flagged inconsistencies. Deep Think: drifted, forgot promised items, sometimes delivered nothing after long thinks. Once returned nothing after 1.5 days!