Anhinga Anhinga
@anhinga-anhinga.bsky.social
Non-standard neural machines and methods
modded-nanogpt: reducing both the training data and the training time needed by an order of magnitude (training GPT-2-small now takes only 5 min on 8xH100 and requires 1B tokens instead of 10B):

x.com/karpathy/sta...

github.com/KellerJordan...
GitHub - KellerJordan/modded-nanogpt: NanoGPT (124M) in 5 minutes
November 29, 2024 at 3:19 AM