@tscholak.bsky.social
Lead Research Scientist @servicenowresearch.bsky.social. All opinions my own.
tscholak.bsky.social
Huge thanks and congrats to the SLAM team and @servicenowresearch.bsky.social 🙌❤️
And a special shoutout to Sathwik, best co-lead anyone could ask for.
tscholak.bsky.social
🧠 Researchers: run it
🧰 Engineers: fine-tune it
🧪 Builders: break it
Tell us what you find.
Apriel-5B models are permissively licensed (MIT) and ready to chat.
#Apriel #LLM #AI #OpenWeights #FastLLM #SLAM #ServiceNow #ServiceNowResearch
tscholak.bsky.social
Apriel is our proving ground:
🧪 Fast, cheap, high-quality model training
📦 Compact models that generalize well
This is just the start.
tscholak.bsky.social
And we did it with just:
🖥️ 480 H100 GPUs
⏱️ ~91,000 H100-hours
🧮 4.8B params, bfloat16
💸 2.3× fewer GPU hours than OLMo-2-7B
Thanks to Fast-LLM, github.com/ServiceNow/F..., our custom training stack for speed and scale. No hacks. Just better infra.
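A quick back-of-the-envelope check on those figures (my own arithmetic from the numbers in this post, not numbers from the team):

```python
# Back-of-the-envelope arithmetic from the figures quoted above.
gpus = 480               # H100s
gpu_hours = 91_000       # approximate total H100-hours
params = 4.8e9           # model parameters
bytes_per_param = 2      # bfloat16 is 2 bytes per parameter

# Wall-clock time if all 480 GPUs ran the whole job.
wall_clock_days = gpu_hours / gpus / 24

# Raw weight footprint in bfloat16 (excludes activations, KV cache, etc.).
weight_gb = params * bytes_per_param / 1e9

print(f"~{wall_clock_days:.1f} days of wall-clock training")  # ~7.9 days
print(f"~{weight_gb:.1f} GB of weights in bfloat16")          # ~9.6 GB
```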
tscholak.bsky.social
📊 Benchmarks (lm-eval-harness):
💥 Beats OLMo-2-7B-Instruct and Mistral-Nemo-12B-Instruct on average
💥 Competitive with Llama-3.1-8B-Instruct, beats it on math benchmarks and IFEval
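For reference, a sketch of running lm-eval-harness against the instruct model yourself (the task list and batch size here are illustrative assumptions, not the exact evaluation config behind these numbers):

```shell
# Sketch: evaluate Apriel-5B-Instruct with lm-evaluation-harness.
# Tasks and batch size are assumptions for illustration.
pip install lm-eval

lm_eval \
  --model hf \
  --model_args pretrained=ServiceNow-AI/Apriel-5B-Instruct,dtype=bfloat16 \
  --tasks ifeval,gsm8k \
  --batch_size 8
```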
tscholak.bsky.social
We're releasing:
🧠 Apriel-5B-Base: pretrained, general-purpose decoder
🧑‍🏫 Apriel-5B-Instruct: chat-style variant for aligned outputs
Trained on 4.5T+ tokens.
👉 huggingface.co/ServiceNow-AI/Apriel-5B-Base
👉 huggingface.co/ServiceNow-AI/Apriel-5B-Instruct
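A minimal loading sketch with Hugging Face transformers (model ids are from the links above; the prompt and generation settings are illustrative assumptions — check the model cards for the recommended chat setup):

```python
# Sketch: chat with Apriel-5B-Instruct via transformers.
# Prompt and max_new_tokens are illustrative, not recommended settings.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ServiceNow-AI/Apriel-5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="bfloat16", device_map="auto"
)

messages = [{"role": "user", "content": "What is Apriel-5B, in one sentence?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```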
tscholak.bsky.social
🚨 SLAM Labs presents Apriel-5B! And it lands right in the green zone 🚨
Speed ⚡ + Accuracy 📈 + Efficiency 💸
This model punches above its weight, beating bigger LLMs while training on a fraction of the compute.
Built with Fast-LLM, our in-house training stack.
🧵👇