⚡ 100x Training Throughput
🎯 Fast Convergence
🔢 Pure Int8 Pretraining of RNN LLMs
🔎 For a deeper dive into the theory:
blog.foersterlab.com/fixing-td-pa...
See you in Singapore! 🇸🇬
By Oxford, @jfoerst.bsky.social
Paper: openreview.net/forum?id=wFg...
Video: www.youtube.com/watch?v=6fS7...
@danielrensch.chess.com
#more_science_less_hype (please).
PS: Amazing discussion and good brain food, as usual with MLST.
@jfoerst.bsky.social @ferranalet.bsky.social @adamjelley.bsky.social @enjeeneer.io
Same deal for tomorrow: 7am at
goo.gl/maps/8Z8eMrd...
Join us!
@FLAIR_Ox
is coming up on the 2nd of December AOE. We work on compute-only scaling of LLMs, (meta/multi-agent) RL at the Hyperscale, Human-AI coordination, opponent-shaping for vaccine design, GenAI for finance, and much more.