Kyle Lo @ COLM 2025 🍁
@kylelo.bsky.social
6.5K followers 590 following 500 posts
language model research @ai2.bsky.social, Co-lead of Data for OLMo w/ @soldaini.net, statistics @uw, open science, tabletop, seattle, he/him, 🧋 kyleclo.com
Pinned
kylelo.bsky.social
we released olmo 32b today! ☺️

🐟 our largest & best fully open model to date
🐠 right up there w similar size weights-only models from big companies on popular benchmarks
🐡 but we used way less compute & all our data, ckpts, code, recipe are free & open

made a nice plot of our post-trained results!✌️
kylelo.bsky.social
flyin to #colm2025 along w a bunch of the @ai2.bsky.social team

come chat w me about pretraining horror stories, data & evals, what we're cookin for next olmo, etc

made a 🔥 poster for thursday sess, come say hi
kylelo.bsky.social
same flight lol I just got to the airport way too early
kylelo.bsky.social
5 am airport for the only direct flight from seattle to montreal #colm2025
kylelo.bsky.social
synthetic data mimics real data's rough shape, modality, types, schema, etc. but with fake values. models these days are quite proficient at operating over data of this type & generating reasonable code; the main contrib here is system design to replace the repetitive exploratory data workflow
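A minimal sketch of that idea, assuming a made-up clinical schema (the column names, types, and file name here are hypothetical, not the project's actual data):

```python
# Sketch only: fabricate a table with the same columns & rough dtypes as the
# private data, but entirely fake values, so code can be developed against it.
import random
import string

import pandas as pd

# Hypothetical schema standing in for the real (private) table: column -> dtype
SCHEMA = {"patient_id": "str", "age": "int", "diagnosis_code": "str", "biomarker_level": "float"}


def fake_value(dtype: str):
    if dtype == "int":
        return random.randint(18, 90)
    if dtype == "float":
        return round(random.uniform(0.0, 10.0), 2)
    return "".join(random.choices(string.ascii_uppercase + string.digits, k=6))


def make_synthetic(schema: dict, n_rows: int = 100) -> pd.DataFrame:
    # Same shape/schema as the real data, fake values throughout.
    return pd.DataFrame({col: [fake_value(dt) for _ in range(n_rows)] for col, dt in schema.items()})


make_synthetic(SCHEMA).to_csv("synthetic_clinical.csv", index=False)  # safe to hand to the code-generating LM
```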
kylelo.bsky.social
hehe i didnt do anythin!

core is data voyager (arxiv.org/abs/2402.13610) but w a local LM instead of GPT

it generates code (map-reduce-filter) that transforms data (csvs), a federated platform executes it & returns some output back to the system. the system repeatedly interprets + generates more code
Data-driven Discovery with Large Generative Models
With the accumulation of data at an unprecedented rate, its potential to fuel scientific discovery is growing exponentially. This position paper urges the Machine Learning (ML) community to exploit th...
arxiv.org
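Roughly, the interaction loop described above looks like this (a sketch under my own naming, not the actual Data Voyager implementation; `generate_code` and `execute_remotely` are hypothetical stand-ins for the local LM and the federated executor):

```python
# Sketch of the generate -> execute remotely -> interpret loop.
from typing import Callable, List, Tuple


def discovery_loop(
    generate_code: Callable[[str], str],      # local LM: prompt -> map/reduce/filter code over the csvs
    execute_remotely: Callable[[str], str],   # federated platform: code -> textual output (raw data never leaves)
    task: str,
    max_turns: int = 5,
) -> List[Tuple[str, str]]:
    history: List[Tuple[str, str]] = []
    context = task
    for _ in range(max_turns):
        code = generate_code(context)        # LM proposes the next transform
        output = execute_remotely(code)      # only the output comes back to the system
        history.append((code, output))
        context = f"{task}\n\nprevious code:\n{code}\n\noutput:\n{output}\n\nnext step:"
    return history
```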
kylelo.bsky.social
not my project but I rlly like it

working w a cancer research center to analyze clinical data, but the private data cant leave the center.

so the team built a tool that generates code for remote execution by the cancer center: developed on synthetic data, and now tested for realsies 🤩
kylelo.bsky.social
had to explain to a first-time submitter why an AC recommendation to accept ended up as a reject 😮‍💨 been publishing long enough that i get why such things happen but it can be rough
kylelo.bsky.social
oh dang, missed this paper, this is rlly nice thx!
kylelo.bsky.social
high dim + discrete space (tokens). back in the soft prompts days, gradients made high dim easier to handle cuz continuous space. high dim search w/out gradients is tough
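A toy illustration of the contrast (just a sketch in PyTorch; the loss is a stand-in, not any particular prompt-optimization objective):

```python
import torch

vocab_size, prompt_len, dim = 50_000, 10, 768

# continuous case: a soft prompt is just a trainable tensor, so gradients flow into it
soft_prompt = torch.nn.Parameter(torch.randn(prompt_len, dim) * 0.02)
optimizer = torch.optim.Adam([soft_prompt], lr=1e-3)

def toy_loss(prompt_embeds: torch.Tensor) -> torch.Tensor:
    # stand-in for "LM loss given this prompt"; any differentiable function works here
    return (prompt_embeds ** 2).mean()

loss = toy_loss(soft_prompt)
loss.backward()      # backprop straight into the prompt
optimizer.step()

# discrete case: token ids have no gradient, so you're stuck searching a
# vocab_size ** prompt_len space (random/greedy/evolutionary search, etc.)
hard_prompt = torch.randint(0, vocab_size, (prompt_len,))
```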
kylelo.bsky.social
LM benchmark design requires 3 decisions, how to:
🐟 select test cases
🐠 score LM on each test
🦈 aggregate scores to estimate perf

fluid benchmarking is simple:
🍣 find max informative test cases
🍥 estimate 'ability', not simple avg perf

why care? turn ur grey noisy benchmarks to red ones!
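For intuition, here's a tiny item-response-theory-style sketch of the "estimate ability + pick informative test cases" idea (my own simplification, not the paper's code; the 2PL item parameters a, b are assumed to already be fit):

```python
import numpy as np

def p_correct(theta: float, a: np.ndarray, b: np.ndarray) -> np.ndarray:
    # 2PL model: P(model answers item i correctly) = sigmoid(a_i * (theta - b_i))
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def estimate_ability(responses: np.ndarray, a: np.ndarray, b: np.ndarray) -> float:
    # crude grid-search MLE for the model's ability theta (fine for a sketch)
    grid = np.linspace(-4, 4, 801)
    loglik = [
        np.sum(responses * np.log(p_correct(t, a, b) + 1e-9)
               + (1 - responses) * np.log(1 - p_correct(t, a, b) + 1e-9))
        for t in grid
    ]
    return float(grid[int(np.argmax(loglik))])

def next_item(theta: float, a: np.ndarray, b: np.ndarray, asked: set) -> int:
    # pick the unasked test case with max Fisher information at the current theta
    p = p_correct(theta, a, b)
    info = a ** 2 * p * (1 - p)
    info[list(asked)] = -np.inf
    return int(np.argmax(info))
```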
kylelo.bsky.social
can also view this as just candidate selection & pushing all “late interaction” or anything too complex for cosine sim to neural reranker(s)
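i.e. the generic two-stage pattern, sketched below (nothing system-specific; `rerank_score` is a hypothetical stand-in for whatever expensive scorer you like):

```python
import numpy as np

def retrieve_then_rerank(query_vec, doc_vecs, rerank_score, k_candidates=100, k_final=10):
    # stage 1: cheap cosine-sim candidate selection over all docs
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9
    )
    candidates = np.argsort(-sims)[:k_candidates]
    # stage 2: the expensive scorer (late interaction, cross-encoder, ...) only sees the shortlist
    scores = rerank_score(candidates)
    return candidates[np.argsort(-scores)[:k_final]]
```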
kylelo.bsky.social
working on a similar project now actually 😮 did u happen to see if ppl do well on this test on human reasoning steps?
kylelo.bsky.social
have been a believer in decomposing queries into many atomic units, each triggering its own retrieval, and assembling results after. feels like this has always been the thing that works, even if less elegant than an “end to end learned” approach

arxiv.org/abs/2305.15053
Decomposing Complex Queries for Tip-of-the-tongue Retrieval
When re-finding items, users who forget or are uncertain about identifying details often rely on creative strategies for expressing their information needs -- complex queries that describe content ele...
arxiv.org
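A minimal sketch of that decompose-retrieve-assemble pattern (generic, not the paper's exact system; `decompose` and `retrieve` are hypothetical stand-ins, and the assembly here is reciprocal-rank-fusion style):

```python
from collections import defaultdict

def decomposed_retrieval(query, decompose, retrieve, top_k=10):
    # decompose: query -> list of atomic sub-queries (e.g. an LM call)
    # retrieve: sub-query -> ranked list of doc ids
    fused = defaultdict(float)
    for sub_query in decompose(query):
        for rank, doc_id in enumerate(retrieve(sub_query)):
            fused[doc_id] += 1.0 / (60 + rank)   # assemble per-clause rankings via RRF-style scoring
    return [doc for doc, _ in sorted(fused.items(), key=lambda kv: -kv[1])][:top_k]
```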
kylelo.bsky.social
“people didn’t want to buy it because they thought that a third of a pound was less than a quarter pound because three is less than four”

lol we were clowning on models for 9.11 > 9.9 but prolly should've checked the human baseline
Reposted by Kyle Lo @ COLM 2025 🍁
natolambert.bsky.social
COLM is coming up! Very excited. I'm starting to figure out two things:
1. A small invite-only dinner for Interconnects AI (Ai2 event news later).
2. Various research chats and catchups.
Fill out the form below or email me if you're interested :) 🍁🇨🇦
Interest form: buff.ly/9nWBxZ9
Reposted by Kyle Lo @ COLM 2025 🍁
ai2.bsky.social
Ai2 @ai2.bsky.social · Aug 28
🎙️ Say hello to OLMoASR—our fully open, from-scratch speech-to-text (STT) model. Trained on a curated audio-text set, it boosts zero-shot ASR and now powers STT in the Ai2 Playground. 👇
kylelo.bsky.social
looks like the preprint has been updated to include a disclaimer that this was a class project & intentionally written to be provocative 😐
kylelo.bsky.social
we r quite strict on ourselves lolol