The project started as I was playing with
Stas Bekman's MAMF script and looking at the logs. I realized the script logs a ton of VERY USEFUL data, and since my Modal credits were expiring soon, I just decided to spend them all on collecting as many numbers as possible!
There are still lots of things to be done, and I'd love to cover more shapes, dtypes, and PyTorch versions in the future, given access to more GPUs.
The measurements are made using Stas Bekman's super-helpful MAMF finder script (couldn't find his bsky handle to tag).
• Capacity planning for real workloads
• Comparing GPUs on identical shapes/dtypes
• Identifying performance cliffs and bottlenecks
• Tracking regressions across PyTorch versions (WIP)
It lets you explore Maximum Achievable Matmul FLOPS (MAMF) across matrix shapes, dtypes, and hardware.
MAMF is a practical upper bound on matmul throughput for a given GPU + software stack, not a theoretical peak.
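The core idea behind a measurement like this is simple: an M×K @ K×N matmul performs 2·M·K·N floating-point operations, so timing the multiply and dividing gives achieved FLOPS for that shape. Here's a minimal CPU-only sketch of that calculation (a simplification; the actual MAMF finder script handles GPU synchronization, warmup, dtypes, and shape sweeps far more carefully):

```python
import time
import numpy as np

def achieved_tflops(M: int, K: int, N: int, iters: int = 10) -> float:
    """Time an (M, K) @ (K, N) matmul and report achieved TFLOPS.

    A matmul of these shapes performs 2*M*K*N floating-point ops
    (one multiply + one add per element of each inner product).
    """
    A = np.random.rand(M, K).astype(np.float32)
    B = np.random.rand(K, N).astype(np.float32)
    A @ B  # warmup so one-time setup cost isn't timed
    start = time.perf_counter()
    for _ in range(iters):
        A @ B
    elapsed = (time.perf_counter() - start) / iters
    return 2 * M * K * N / elapsed / 1e12

print(f"{achieved_tflops(1024, 1024, 1024):.3f} TFLOPS")
```

Sweeping M, K, and N and keeping the maximum over many shapes is what turns this per-shape number into a practical "maximum achievable" bound rather than a datasheet peak.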