tscholak.bsky.social
@tscholak.bsky.social
Lead Research Scientist @servicenowresearch.bsky.social. All opinions my own.
Huge thanks and congrats to the SLAM team and @servicenowresearch.bsky.social 🙌❤️
And a special shoutout to Sathwik, best co-lead anyone could ask for.
April 11, 2025 at 8:16 PM
🧠 Researchers: run it
🧰 Engineers: fine-tune it
🧪 Builders: break it
Tell us what you find.
Apriel-5B models are permissively licensed (MIT) and ready to chat.
#Apriel #LLM #AI #OpenWeights #FastLLM #SLAM #ServiceNow #ServiceNowResearch
April 11, 2025 at 8:15 PM
Apriel is our proving ground:
🧪 Fast, cheap, high-quality model training
📦 Compact models that generalize well
This is just the start.
April 11, 2025 at 8:15 PM
And we did it with just:
🖥️ 480 x H100s
⏱️ ~91,000 H100-hours
🧮 4.8B params, bfloat16
💸 2.3 x fewer GPU hours than OLMo-2-7B
Thanks to Fast-LLM, github.com/ServiceNow/F..., our custom training stack for speed and scale. No hacks. Just better infra.
GitHub - ServiceNow/Fast-LLM: Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research
Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research - ServiceNow/Fast-LLM
github.com
April 11, 2025 at 8:15 PM
📊 Benchmarks (lm-eval-harness):
💥 Beats OLMo-2-7B-Instruct and Mistral-Nemo-12B-Instruct on avg
💥 Competitive with LLama-3.1-8B-Instruct, beats it in math benchmarks and IF Eval
April 11, 2025 at 8:15 PM
We're releasing:
🧠 Apriel-5B-Base: pretrained, general-purpose decoder
🧑‍🏫 Apriel-5B-Instruct: chat-style variant for aligned outputs
Trained on 4.5T+ tokens.
👉 huggingface.co/ServiceNow-AI/Apriel-5B-Base
👉 huggingface.co/ServiceNow-AI/Apriel-5B-Instruct
April 11, 2025 at 8:14 PM