Kyle Lo
@kylelo.bsky.social
language model pretraining @ai2.bsky.social, co-lead of data research w/ @soldaini.net, statistics @uw, open science, tabletop, seattle, he/him,🧋 kyleclo.com
yo endorse me for python skills
January 16, 2026 at 6:37 PM
u gotta shitpost more maria, ur content too informative 😆
January 15, 2026 at 9:54 PM
i appreciate bsky has less AI product advertising; i do want to see more memes/shitposting/fun stuff and insights from industry/open source sphere, even if they dont have an attached paper
January 15, 2026 at 5:50 PM
amaazinggg thxx 🙏🙏🙏
January 14, 2026 at 9:31 PM
ive been clicking around in UI but i cant find it 😭 pls help
January 14, 2026 at 9:27 PM
some notion of 'views/impressions'? it kinda sucks to post and only see a couple of likes & no replies. if there's some intermediate signal that shows people at least read the post, that'd incentivize more imo
January 14, 2026 at 8:16 PM
sports 🏈
January 9, 2026 at 10:05 PM
nope just an admirer of room 003
January 8, 2026 at 1:57 AM
some citation graphs data pipelines will create new "paper" nodes based on extracted bibstrings from PDFs

so in 2026 the papers we hallucinated in 2025 might end up being "real" papers on gscholar or sthn lol
January 6, 2026 at 1:07 AM
ya ur rite! we'll update it ✌️
December 20, 2025 at 12:34 AM
paper has:
🐟 more on our eval ideology
🦈 more baselines
🍣 more about RL Zero
etc

we picked final model (internally called moonlit surfer 🌛🏄) not just on bench scores but good vibes 🥰
December 12, 2025 at 6:03 PM
We have two spotlight papers
🥐 Signal and Noise (Wed) shows how noisy benchmarks prohibit fitting good task scaling laws & ways to improve
🥯 FlexOlmo (Thurs) is a novel MoE w/ experts trained on different data & control over expert activation based on access permissions to those datasets
December 1, 2025 at 9:51 PM
i agree w above; imo it's all whether can elicit capability from base model. saw RL recipes not work no matter what try & swap to a base model that had seen more relevant, in-domain data, then ez pz. reverse also true for working RL recipe but accidentally borked base model
November 24, 2025 at 7:43 PM
yess!! sry bout the x-axis, still thinkin how to make figure clearer

it's exactly what you're saying -- each point refers to a stage of development. our release has data+ckpts+evals for all stages we use (figure) and wanted to show how it compares to other models which typically only have a few stages
November 21, 2025 at 10:00 PM
We're hiring too!

Olmo 3 was our biggest effort yet, but we're still a small team (67 authors!) compared to a lot of the big labs, which means everyone (especially interns) gets to own a major piece of the Olmo puzzle

job-boards.greenhouse.io/thealleninst...
Research Internship, OLMo
Seattle, WA
November 20, 2025 at 6:20 PM