@tscholak.bsky.social
Lead Research Scientist @servicenowresearch.bsky.social. All opinions my own.
tscholak.bsky.social
Huge thanks and congrats to the SLAM team and @servicenowresearch.bsky.social 🙌❤️
And a special shoutout to Sathwik, best co-lead anyone could ask for.
tscholak.bsky.social
🧠 Researchers: run it
🧰 Engineers: fine-tune it
🧪 Builders: break it
Tell us what you find.
Apriel-5B models are permissively licensed (MIT) and ready to chat.
#Apriel #LLM #AI #OpenWeights #FastLLM #SLAM #ServiceNow #ServiceNowResearch
tscholak.bsky.social
Apriel is our proving ground:
🧪 Fast, cheap, high-quality model training
📦 Compact models that generalize well
This is just the start.
tscholak.bsky.social
And we did it with just:
🖥️ 480 H100 GPUs
⏱️ ~91,000 H100-hours
🧮 4.8B params, bfloat16
💸 2.3× fewer GPU hours than OLMo-2-7B
Thanks to Fast-LLM, github.com/ServiceNow/F..., our custom training stack for speed and scale. No hacks. Just better infra.
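A quick back-of-the-envelope check on those figures (my own arithmetic from the numbers in this post, not numbers from the team):

```python
# Back-of-the-envelope arithmetic from the figures quoted above.
gpus = 480               # H100s
gpu_hours = 91_000       # approximate total H100-hours
params = 4.8e9           # model parameters
bytes_per_param = 2      # bfloat16 is 2 bytes per parameter

# Wall-clock time if all 480 GPUs ran the whole job.
wall_clock_days = gpu_hours / gpus / 24

# Raw weight footprint in bfloat16 (excludes activations, KV cache, etc.).
weight_gb = params * bytes_per_param / 1e9

print(f"~{wall_clock_days:.1f} days of wall-clock training")  # ~7.9 days
print(f"~{weight_gb:.1f} GB of weights in bfloat16")          # ~9.6 GB
```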
tscholak.bsky.social
📊 Benchmarks (lm-eval-harness):
💥 Beats OLMo-2-7B-Instruct and Mistral-Nemo-12B-Instruct on average
💥 Competitive with Llama-3.1-8B-Instruct, beats it on math benchmarks and IFEval
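For reference, a sketch of running lm-eval-harness against the instruct model yourself (the task list and batch size here are illustrative assumptions, not the exact evaluation config behind these numbers):

```shell
# Sketch: evaluate Apriel-5B-Instruct with lm-evaluation-harness.
# Tasks and batch size are assumptions for illustration.
pip install lm-eval

lm_eval \
  --model hf \
  --model_args pretrained=ServiceNow-AI/Apriel-5B-Instruct,dtype=bfloat16 \
  --tasks ifeval,gsm8k \
  --batch_size 8
```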
tscholak.bsky.social
We're releasing:
🧠 Apriel-5B-Base: pretrained, general-purpose decoder
🧑‍🏫 Apriel-5B-Instruct: chat-style variant for aligned outputs
Trained on 4.5T+ tokens.
👉 huggingface.co/ServiceNow-AI/Apriel-5B-Base
👉 huggingface.co/ServiceNow-AI/Apriel-5B-Instruct
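A minimal loading sketch with Hugging Face transformers (model ids are from the links above; the prompt and generation settings are illustrative assumptions — check the model cards for the recommended chat setup):

```python
# Sketch: chat with Apriel-5B-Instruct via transformers.
# Prompt and max_new_tokens are illustrative, not recommended settings.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ServiceNow-AI/Apriel-5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="bfloat16", device_map="auto"
)

messages = [{"role": "user", "content": "What is Apriel-5B, in one sentence?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```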
tscholak.bsky.social
🚨 SLAM Labs presents Apriel-5B! And it lands right in the green zone 🚨
Speed ⚡ + Accuracy 📈 + Efficiency 💸
This model punches above its weight, beating bigger LLMs while training on a fraction of the compute.
Built with Fast-LLM, our in-house training stack.
🧵👇