model in 1 epoch of the fantastic @servicenowresearch.bsky.social R1-Distill-SFT dataset. It trained for about 100 hours on a single A100.
#DeepSeekR1 #LLMs #AI