Michael Saxon
@saxon.me
Doctor of NLP/Vision+Language from UCSB

Evals, metrics, multilinguality, multiculturality, multimodality, and (dabbling in) reasoning

https://saxon.me/
Reposted by Michael Saxon
I've written many reviews and received several top reviewer awards. I've also written some absolute dogwater critiques based on skimming at the last second with a fever. My point is it's totally random, it's not just whether you rolled a decent reviewer but whether they've had lunch that day
November 23, 2025 at 11:58 PM
And here is the presentation I gave on networking, self-promo, and how to make the most out of a conference. Hope this helps for everyone at NeurIPS!

www.youtube.com/watch?v=B9hG...
Conferencemaxxing: How to grow your profile and network as a scientist
YouTube video by Michael Saxon (NLP & Generative AI research)
www.youtube.com
November 19, 2025 at 11:59 PM
In a few hours (11/19, 2PM PST) I will be giving this lecture on "conferencemaxxing" to help students prepare to make the most out of NeurIPS.

This lecture is open to the public. If you're interested in joining, here's a GCal invite link: calendar.google.com/calendar/eve...
November 19, 2025 at 7:26 PM
Trying to decide what to do on the first day of #NeurIPS2025?

Check out my, @marstin.bsky.social and @xiangyue96.bsky.social's tutorial, "The Science of Benchmarking: What's Measured, What's Missing, What's Next" on December 2 from 1:30 to 4:00pm.

benchmarking.science

What will we cover?

1/3
November 18, 2025 at 3:49 AM
Rolled a custom (read: relatively privacy-respecting) visitor map stack in 2.5h today with Cursor
November 15, 2025 at 10:00 PM
Normalize questioning the utility of mathiness in ML conference papers!

Are the equations supporting an argument or are they just a fancy way to express something simple? Do introduced terms do anything or get referenced anywhere?

I find the answer is usually no in the kinds of papers I review
November 14, 2025 at 4:55 PM
Reposted by Michael Saxon
still uncertain whether inviting all of internet to gawk at long-tailed instances of spectacular review outliars is a good productive thing
November 14, 2025 at 3:35 PM
Reposted by Michael Saxon
Our libraries are cutting staff so that Elsevier can have its 32% profit margin
A staggering statistic: "North American researchers were charged over US$2.27 billion by just two for-profit publishers. The Canadian research councils and the US National Science Foundation were allocated US$9.3 billion in that year." What are we doing?
We wrote the Strain on scientific publishing to highlight the problems of time & trust. With a fantastic group of co-authors, we present The Drain of Scientific Publishing:

a 🧵 1/n

Drain: arxiv.org/abs/2511.04820
Strain: direct.mit.edu/qss/article/...
Oligopoly: direct.mit.edu/qss/article/...
November 14, 2025 at 1:37 AM
Based. I'm pretty much a full agree on these takes
Following up on Monday’s discussion, I articulate a few concrete positions on archives, surveys, and position papers.
The DOI Directorate
Articulating a few concrete positions on archives, surveys, and position papers
www.argmin.net
November 12, 2025 at 6:58 PM
Humanity is nothing without its humanity
November 11, 2025 at 10:00 AM
Guys I'm really worried about the threat of superintelligent AI, and wouldn't you know it, the best way to stop it is gonna be for you to give me a whole lotta money for my startup
November 10, 2025 at 6:35 AM
More than choosing good project ideas, to me "research taste" means recognizing what the interesting part of a result is and how it connects to a bigger narrative. Almost any nontrivial result can be important within the right lens.

More than anything my PhD taught me this.
November 5, 2025 at 8:25 PM
🆕 from us at #EMNLP: Are LMs better at answering questions about Germany in German than in French? Is national knowledge linguistically contingent?

Interestingly, only for some multilingual models is this true. Aya knows China best in Chinese, but LLaMA's best in English always.
November 5, 2025 at 7:47 PM
Beautiful Blooj tears. Mariners will not have to feel the "should have been us" pain
Hate da Dodgers but also Bloojays need to pay for knocking out Seattle. I'll be smug whoever loses.
November 2, 2025 at 4:23 AM
Hate da Dodgers but also Bloojays need to pay for knocking out Seattle. I'll be smug whoever loses.
November 2, 2025 at 4:11 AM
November 1, 2025 at 6:01 AM
Very pro dislike button. I think the asymmetry that comes from being able to leave drive-by approval (likes) but only high-engagement disapproval (comments) raises the temperature of negative interactions, and incentivizes ragebaiting with less visible shame
October 31, 2025 at 11:54 PM
I didn't realize arXiv is a postprint server
blog.arxiv.org/2025/10/31/a...

FYI the blog post for the updated policy is out. Our LLM future is dire :/
October 31, 2025 at 7:55 PM
It's #NSF #GRFP application season again so it's time to re-up my GRFP application advice post!

Also, check out the cool bsky comment integration I've added to the blog! Engagement with this post will go under the blogpost on my site as comments!

saxon.me/blog/2024/gr...
NSF GRFP Application Tips for NLP, AI, CS
Reflections and advice from my successful NSF GRFP proposal in NLP. Why I think my applications worked well, what I wish I did differently, and links to my actual statements and feedback from the GRFP...
saxon.me
October 30, 2025 at 8:03 PM
"It's country over club as Aaron Judge will lead team USA in the World Baseball Classic next March, assuming this game will be over by then" 💀
October 28, 2025 at 6:00 AM
Reposted by Michael Saxon
fukuyama was right. history ended. we are stuck inside this game forever
October 28, 2025 at 5:28 AM
WAITER? ANOTHER INNING OF NOTHING PLEASE
October 28, 2025 at 5:00 AM
It's live! Here's an example post: saxon.me/blog/2025/la...

Turning the replies to a Bluesky post into the comment section for a blogpost is a small concrete way to support the ecosystem: future visitors who want to add comments are incentivized to interact on the platform

Also, it's very easy to do:
October 27, 2025 at 6:50 PM
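
The snippet attached to the post above did not survive in this text, so here is a minimal sketch of the general idea rather than the author's actual implementation. It assumes the public Bluesky AppView endpoint `app.bsky.feed.getPostThread` and flattens the replies under one post into a list a blog template could render. The helper names (`thread_url`, `flatten_replies`, `fetch_comments`) and the output dictionary keys are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public, unauthenticated AppView endpoint for reading a post's thread.
PUBLIC_API = "https://public.api.bsky.app/xrpc/app.bsky.feed.getPostThread"


def thread_url(at_uri, depth=6):
    """Build the request URL for the thread rooted at an at:// post URI."""
    return f"{PUBLIC_API}?{urlencode({'uri': at_uri, 'depth': depth})}"


def flatten_replies(thread, level=0):
    """Walk a thread view depth-first and return replies as flat comment dicts.

    Entries without a "post" key (for example blocked or missing posts)
    are skipped. The output shape is an assumption made for this sketch.
    """
    comments = []
    for reply in thread.get("replies", []):
        post = reply.get("post")
        if not post:
            continue
        comments.append({
            "author": post["author"]["handle"],
            "text": post["record"].get("text", ""),
            "level": level,  # nesting depth, useful for indenting in a template
        })
        comments.extend(flatten_replies(reply, level + 1))
    return comments


def fetch_comments(at_uri):
    """Fetch the thread over HTTP and return its replies as comments."""
    with urlopen(thread_url(at_uri)) as resp:
        data = json.load(resp)
    return flatten_replies(data["thread"])
```

Calling `fetch_comments` with the at:// URI of the post that announces a blog entry returns the replies in thread order. Doing the same fetch in the visitor's browser instead would keep the comments current without rebuilding the site.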
Stage four: The sign bears no relation to any reality whatsoever; it is its own pure simulacrum

youtu.be/6i2I3dkZ5-M
Bush Step! (JibJab)
YouTube video by pipo
youtu.be
October 27, 2025 at 4:04 PM