Google Scholar: https://scholar.google.com/citations?user=GDm6BIAAAAAJ&hl=en
Inspired by @karpathy.bsky.social’s nanoGPT, I worked to bring this concept to the HPC world!
I’ve built a minimal implementation of an MPI library called nanoMPI, which focuses on clarity, simplicity, and easy installation.
- rocm.blogs.amd.com/ecosystems-a...
- www.zyphra.com/post/trainin...
- Zamba2 models of size 1.2B, 2.7B, 7.4B
- Zyda-2 5T token dataset
- We discuss more specifics on model arch, training process, dataset creation, etc.
Links:
- Zamba2: arxiv.org/abs/2411.15242
- Zyda-2: arxiv.org/abs/2411.06068