more at https://soldaini.net
state of the art OCR, fully open model:
OF COURSE my code is better than yours
*ordering iced lattes rather than espressos at coffee shops
101 can be as dark red as you want on google maps!
...I am considering allowing calendar notifications tho cuz I almost missed 3 meetings already 😅
a million caveats, but we have models that pick up capabilities from plain text???
it's so magical, I can't believe we got such a treat
As a meta point, I’m very grateful to be in a position where I can put my technical expertise in the service of policy needs 🥰
We release efficient classifiers 🌐 to partition large corpora, and use them to improve sampling for LLM pretraining
great work led by @awettig.bsky.social 👇
In our new paper, we propose WebOrganizer which *constructs domains* based on the topic and format of CommonCrawl web pages 🌐
Key takeaway: domains help us curate better pre-training data! 🧵/N
congrats @soldaini.net for heavy lifting, showing our OLMoE model can run on iPhones📱
As phones get faster, more AI will happen on device. With OLMoE, researchers, developers, and users can get a feel for this future: fully private LLMs, available anytime.
Learn more from @soldaini.net👇 youtu.be/rEK_FZE5rqQ
All from better annealing and post-training. Didn’t need to redo pretraining. Goes to show how much potential these models have!
new instruct model: huggingface.co/allenai/OLMo...
We also trained a new OLMoE-1B-7B-0125, this time using the Tulu 3 recipe. Very exciting that RLVR improved GSM8K by almost 10 points for OLMoE 🔥
A quick 🧵
We are launching an iOS app–it runs OLMoE locally 📱 We're gonna see more on-device AI in 2025, and wanted to offer a simple way to prototype with it
App: apps.apple.com/us/app/ai2-o...
Code: github.com/allenai/OLMo...
Blog: allenai.org/blog/olmoe-app