Jay Alammar
@jayalammar.bsky.social
1.3K followers 210 following 16 posts
Writer http://jalammar.github.io. O'Reilly Author http://LLM-book.com. LLM Builder Cohere.com.
Pinned
jayalammar.bsky.social
The Illustrated DeepSeek-R1

Spent the weekend reading the paper and sorting through the intuitions. Here's a visual guide and the main intuitions to understand the model and the process that created it.

newsletter.languagemodels.co/p/the-illust...
jayalammar.bsky.social
The Illustrated GPT-OSS

New post! A visual tour of the architecture, message formatting, and reasoning of the latest GPT.

newsletter.languagemodels.co/p/the-illust...
jayalammar.bsky.social
The legendary John Carmack at #upperbound:
- Current AI focus is RL (with Richard Sutton) solving Atari games
- Thinking in line with the Alberta Plan.
- It was a misstep to start working too low-level (e.g., at the CUDA level). I kept stepping up the stack and am now working in PyTorch
Reposted by Jay Alammar
astroadamh.bsky.social
I'm really excited for this year's PyData London conference - there are some awesome talks on the schedule and I'm excited to hear the keynote speakers @jayalammar.bsky.social, Tony Mears, & Leanne Fitzpatrick

#pydata #datascience
pydatalondon.bsky.social
Unleash your inner data aficionado at PyData London 2025, 6-8 June at Convene Sancroft, St. Paul’s!

We have 3 top flight keynotes lined up for you this year from @jayalammar.bsky.social, Leanne Kim Fitzpatrick and Tony Mears.

Just 17 days left. Book your tickets now!
pydata.org/london2025
Advertisement for PyData London 2025 conference.

Headline: Meet your keynote speakers

- Jay Alammar
- Tony Mears
- Leanne Fitzpatrick

Book your tickets
https://pydata.org/london2025
Reposted by Jay Alammar
maxbartolo.bsky.social
I'm excited to share the tech report for our @cohere.com @cohereforai.bsky.social Command A and Command R7B models. We highlight our novel approach to model training including self-refinement algorithms and model merging techniques at scale. Read more below! ⬇️
Reposted by Jay Alammar
tomaarsen.com
We've just released MMTEB, our multilingual upgrade to the MTEB Embedding Benchmark!

It's a huge collaboration between 56 universities, labs, and organizations, resulting in a massive benchmark of 1000+ languages, 500+ tasks, and a dozen+ domains.

Details in 🧵
Reposted by Jay Alammar
maartengr.bsky.social
Did you know we continue to develop new content for the "Hands-On Large Language Models" book?

There's now even a free course available with @deeplearningai.bsky.social!
Reposted by Jay Alammar
masonyoungblood.bsky.social
Do whales optimize their vocalizations for efficiency, just like human language? 🐋🎶 My latest study in Science Advances (@science.org) suggests they do, following linguistic laws seen in human speech. 🧵 www.science.org/doi/10.1126/...
Language-like efficiency in whale communication
Whale vocalizations follow efficiency rules seen in human language, revealing striking similarities in communication systems.
www.science.org
Reposted by Jay Alammar
ellengarland.bsky.social
We uncovered the same statistical structure that is a hallmark of human language in whale song, published today in Science. @inbalarnon.bsky.social @simonkirby.bsky.social @jennyallen13.bsky.social @clairenea.bsky.social @emma-carroll.bsky.social
www.science.org/doi/10.1126/...
Reposted by Jay Alammar
nsaphra.bsky.social
One of my grand interpretability goals is to improve human scientific understanding by analyzing scientific discovery models, but this is the most convincing case yet that we CAN learn from model interpretation: Chess grandmasters learned new play concepts from AlphaZero's internal representations.
Bridging the Human-AI Knowledge Gap: Concept Discovery and Transfer in AlphaZero
Artificial Intelligence (AI) systems have made remarkable progress, attaining super-human performance across various domains. This presents us with an opportunity to further human knowledge and improv...
arxiv.org
jayalammar.bsky.social
AlphaXiv is an awesome way to discuss ML papers -- often with the authors themselves. Here's an intro and demo by @rajpalleti.bsky.social we shot at #Neurips2024

www.youtube.com/watch?v=-Kwl...
AlphaXiv - a great place to discuss ML papers
YouTube video by Jay Alammar
www.youtube.com
Reposted by Jay Alammar
tomaarsen.com
The newest extremely strong embedding model based on ModernBERT-base is out: `cde-small-v2`. Both faster and stronger than its predecessor, this one tops the MTEB leaderboard for its tiny size!

Details in 🧵
jayalammar.bsky.social
Floored that the repo for Hands-On Large Language Models is now at 3.6k GitHub stars!

And excited that professors are starting to use the book to teach LLM courses. Reach out to us if we can be of assistance!

And if you've liked the book, leave us a review on Amazon or Goodreads!
jayalammar.bsky.social
SWE-Bench has been one of the most important tasks measuring the progress of agents tackling software engineering in 2024. I caught up with two of its creators, @ofirpress.bsky.social and Carlos E. Jimenez, to hear their ideas on the state of LLM-backed agents.

www.youtube.com/watch?v=bivZ...
SWE-Bench authors reflect on the state of LLM agents at Neurips 2024
YouTube video by Jay Alammar
www.youtube.com
Reposted by Jay Alammar
natolambert.bsky.social
OpenAI's o3: The grand finale of AI in 2024
A step change as influential as the release of GPT-4. Reasoning language models are the current and next big thing.

I explain:
* The ARC prize
* o3 model size / cost
* Dispelling training myths
* Extreme benchmark progress
o3: The grand finale of AI in 2024
A step change as influential as the release of GPT-4. Reasoning language models are the current big thing.
buff.ly
jayalammar.bsky.social
Good morning #NeurIPS2024! Stop by the @cohere.com booth at 3PM today (Thursday) for a signed copy of Hands-On Large Language Models - it will introduce you to LLMs, their applications, as well as Cohere's Embed, Rerank, and Command-R models.

Come early as quantities are limited!
jayalammar.bsky.social
I'll be in the Cohere #NeurIPS2024 booth most of this afternoon. Come say hi, ask questions, and yes, we're hiring!

Tomorrow I'll be signing copies of my book at 3PM! Limited copies available!
jayalammar.bsky.social
Hi NeurIPS!

Explore ~4,500 NeurIPS papers in this interactive visualization:

jalammar.github.io/assets/neuri...
(Click on a point to see the paper on the website)

Uses @cohere.com models and @lelandmcinnes.bsky.social's datamapplot/umap to help make sense of the overwhelming scale of NeurIPS.
jayalammar.bsky.social
Sure to be thought-provoking. The previous interview had fascinating thoughts on sci-fi (Dune vs. Foundation), on AI competition for AI safety, and on successful sci-fi as a self-preventing prophecy.
davidbrin.bsky.social
Tim Ventura interviewed me about big perspectives on AI. Can't put the AI genie back in the bottle, so how to make it safe? Come explore ethical, legal & safety implications of artificial intelligence.
youtu.be/34_Tub6vXDk
David Brin - Artificial Intelligence Safety
YouTube video by Tim Ventura
youtu.be
jayalammar.bsky.social
Excited to see you all at NeurIPS this year! Let's hang!
jayalammar.bsky.social
Join us for a panel on scientific communication Dec 4!
mziizm.bsky.social
I'm always thinking about the aspects of research that aren't discussed enough online, and one key area is effective scientific communication.

Very excited about this panel discussion featuring two amazing researchers: @jayalammar.bsky.social and @shaynelongpre.bsky.social
jayalammar.bsky.social
Looks good! Do you have to leave town for better visibility or is the light pollution not that bad?