I reckon electrical activity across neurons is only one piece of a large puzzle
One prompt takes around 1 second of H100 time at 70%, which the authors state needs ~1 kW, so roughly 1 kJ per prompt.
With batching, maybe things are 5x slower (5 s per prompt), so that same ~1 kJ works out to 200 W over 5 seconds.
So ballpark 200 W per prompt during inference? Maybe a baseline to extend to long-running agents.
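
For anyone who wants to poke at the numbers, here is a rough sketch of that arithmetic in Python. The function names, and the one-hour agent run at the end, are my own assumptions for illustration; the 1 kW, 1 s and 5x figures are just the guesses from above, not measurements.

```python
def per_prompt_estimate(power_w, gpu_seconds, slowdown):
    """Back-of-envelope energy and average power for one prompt.

    power_w:     power draw attributed to the prompt while it runs alone (W)
    gpu_seconds: GPU time for one prompt without batching (s)
    slowdown:    how much longer each prompt takes under batching (assumed)
    """
    energy_j = power_w * gpu_seconds       # energy does not change with batching
    wall_clock_s = gpu_seconds * slowdown  # the prompt just takes longer to finish
    avg_power_w = energy_j / wall_clock_s  # same energy spread over more time
    return energy_j, wall_clock_s, avg_power_w


def agent_energy_wh(avg_power_w, hours):
    """Hypothetical extension: energy for an agent running continuously (Wh)."""
    return avg_power_w * hours


# Figures from the estimate above: ~1 kW, ~1 s per prompt, ~5x slower when batched.
energy_j, wall_clock_s, avg_power_w = per_prompt_estimate(1000.0, 1.0, 5.0)
print(f"{energy_j:.0f} J per prompt, {avg_power_w:.0f} W over {wall_clock_s:.0f} s")

# Assumed scenario: the same stream running for one hour.
one_hour_wh = agent_energy_wh(avg_power_w, 1.0)
print(f"{one_hour_wh:.0f} Wh for a one-hour agent run")
```

The energy per prompt stays fixed and only the time window changes, which is why the 5x slowdown turns 1 kW into 200 W.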