Kazuki Fujii
@kazukifujii.bsky.social
Tokyo Tech CS Master's student (Rio Yokota Lab → Jun Sakuma Lab). Distributed Training, Systems for Machine Learning
Reposted by Kazuki Fujii
Releasing SmolVLM, a small 2-billion-parameter Vision+Language Model (VLM) built for on-device/in-browser inference on images and videos.

Outperforms all models at similar GPU RAM usage and token throughput.

Blog post: huggingface.co/blog/smolvlm
November 26, 2024 at 4:58 PM
📢 New findings on FP8 training for Continual Pre-Training! 🚀
Our experiments on Llama-3-70B show that FP8 significantly boosts training throughput (415 → 570 TFLOP/s) but induces loss spikes, leading to downstream performance drops. FP8 isn't always the best choice—it depends! (1/n)
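The loss spikes are consistent with FP8's narrow dynamic range. Below is a minimal illustrative sketch (not the thread's actual training code) of FP8 E4M3 rounding in pure Python, showing why small gradient values underflow to zero unless a per-tensor scale factor lifts them into the representable range:

```python
import math

E4M3_MAX = 448.0      # largest normal value representable in FP8 E4M3
MANTISSA_BITS = 3     # 3 mantissa bits -> 8 steps per binade

def quantize_e4m3(x: float) -> float:
    """Round x to the nearest FP8 E4M3 value (normals only; for
    simplicity, subnormals are flushed to zero and overflow saturates)."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    if mag > E4M3_MAX:                  # overflow: saturate at max normal
        return sign * E4M3_MAX
    e = math.floor(math.log2(mag))
    if e < -6:                          # below the normal range: underflow
        return 0.0
    step = 2.0 ** (e - MANTISSA_BITS)   # spacing of representable values
    return sign * round(mag / step) * step

# A typical small gradient vanishes entirely in raw FP8...
grad = 1e-4
print(quantize_e4m3(grad))             # underflows to 0.0

# ...but survives with a power-of-two scale factor (here 2**16,
# a hypothetical choice for illustration) applied before the cast.
scale = 2.0 ** 16
recovered = quantize_e4m3(grad * scale) / scale
print(recovered)                       # close to 1e-4 again
```

Frameworks such as NVIDIA Transformer Engine manage these scale factors dynamically per tensor; when a tensor's statistics shift faster than its scale adapts, values saturate or underflow, which is one plausible mechanism behind the loss spikes reported above.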
November 25, 2024 at 12:43 AM