Hamza M
@hpcwrangler.com
360 followers 77 following 520 posts
HPC Wrangler | Founder & CEO @ HMx Labs
hpcwrangler.com
The Nvidia DGX Spark is NOT a supercomputer. No more than your iPhone is.

I wish reviewers would stop calling it that. I can't even take you seriously if you start with that premise.

Yea I said it.

#HPC #Supercomputing
hpcwrangler.com
Not even from the AWS console.

Other regions work fine (same code). Not changed anything on our end.

Anyone seen anything like this before? Any ideas?
hpcwrangler.com
I'm having a weird problem on #AWS but only in eu-west-1 and eu-west-2.

We have some code that automates VM instance creation and then connects to the instance via SSH. Was working fine globally last week.

Now, any VM created in eu-west-1 or eu-west-2 is unresponsive. Can't SSH into it.
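One way to narrow down a symptom like this is to check what happens at the TCP level on port 22 before blaming SSH itself. The sketch below is only an illustration using Python's standard `socket` module; the function name `probe_ssh` and its return labels are made up for this example and are not part of the automation described above. A timeout usually points at routing, security groups, or an instance that never came up, while a refusal means the host answered but nothing is listening.

```python
import socket


def probe_ssh(host, port=22, timeout=5.0):
    """Classify what a TCP connection attempt to host:port looks like.

    Returns one of: "ssh-banner", "open-no-banner", "refused",
    "timeout", "error". The labels are hypothetical, chosen for this sketch.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            try:
                # An SSH server normally sends a line starting with "SSH-".
                banner = conn.recv(64)
            except OSError:
                # Connected, but nothing arrived before the timeout.
                return "open-no-banner"
            if banner.startswith(b"SSH-"):
                return "ssh-banner"
            return "open-no-banner"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "error"
```

Running `probe_ssh(public_ip)` against a fresh instance in a working region and in an affected one gives a quick side-by-side of where the connection stops.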
hpcwrangler.com
If we're going to redo DC power cabling all over again, why not do it right?

Wouldn't it be cool to have a PoE-style protocol but with wiring and connectors gauged for 1MW? Plug a single cable into a rack. If the protocol negotiates that everything is fine, you get power; else it stays off.
forbiddenunix.com
800V racks baby!

“Up to 5% improvement in end-to-end power efficiency

Maintenance costs reduced by up to 70% due to fewer PSU (power supply unit) failures and lower labor costs for component upkeep

Lower cooling expenses from eliminating AC/DC PSUs inside IT racks”

#hpc #cloud #ai
techjournalism.bsky.social
NVIDIA and 20 of its largest customers are preparing to change how they power GPU clusters, enabling more efficient & powerful AI factories.

At OCP, NVIDIA says CoreWeave, Lambda, Nebius and Oracle are now designing for 800 VDC data centers.

datacenterrichness.substack.com/p/ai-factori...
hpcwrangler.com
Haha... I'm not sure if you're joking but you're probably right either way, in that that's what they'll aim for! 😂
hpcwrangler.com
So that's a total of 26GW of new capacity for just OpenAI announced in just the last 3 weeks or so....

right....
hpcwrangler.com
Cheaper (thinner) cables too.... but I reckon no one will touch a rack anymore unless the whole thing is powered down... so maintenance and downtime might not improve by as much as projected.

Will need DC-DC conversion at the rack level though, surely, and that's not particularly efficient.
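A rough bit of arithmetic behind the thinner-cables point, using the 1MW rack figure and the 800V level mentioned in this thread. The conductor resistance R is a placeholder symbol here, not a measured value.

```latex
I = \frac{P}{V} = \frac{10^{6}\,\mathrm{W}}{800\,\mathrm{V}} = 1250\,\mathrm{A},
\qquad
P_{\text{loss}} = I^{2}R = \left(\frac{P}{V}\right)^{2} R
```

For a fixed load and a fixed conductor, doubling the distribution voltage halves the current and quarters the resistive loss, which is why the same power can be carried over less copper.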
hpcwrangler.com
no 😞 I'm out here slumming it

😂
hpcwrangler.com
Can anyone tell me where I can find some HBv5?

It allegedly exists according to this:
learn.microsoft.com/en-us/azure/...

but is rather absent from here:
azure.microsoft.com/en-gb/explor...

not sure if there's an az CLI command that could find it without listing all VM types in every region...
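A possible approach, sketched under assumptions: as far as I know, `az vm list-skus --all --output json` run without `--location` returns SKU records for every region in one call, so the filtering can be done locally. The snippet assumes each record carries `name`, `resourceType`, and `locations` fields, and that HBv5 sizes match a pattern like `Standard_HB..._v5`; both are assumptions to verify against real output. The helper name `regions_offering` and the sample records are invented for illustration.

```python
import re


def regions_offering(skus, pattern):
    """Map each VM size whose name matches `pattern` to its sorted regions.

    `skus` is a list of dicts shaped like the JSON that
    `az vm list-skus --output json` is assumed to produce.
    """
    rx = re.compile(pattern, re.IGNORECASE)
    found = {}
    for sku in skus:
        # Skip disks, snapshots, and other non-VM resource types.
        if sku.get("resourceType") != "virtualMachines":
            continue
        name = sku.get("name", "")
        if rx.search(name):
            found.setdefault(name, set()).update(sku.get("locations") or [])
    return {name: sorted(locs) for name, locs in found.items()}


# Hypothetical sample records; real data would come from
# json.load() on the saved CLI output.
sample = [
    {"name": "Standard_HBexample_v5", "resourceType": "virtualMachines", "locations": ["regionB"]},
    {"name": "Standard_HBexample_v5", "resourceType": "virtualMachines", "locations": ["regionA"]},
    {"name": "Standard_D2s_v3", "resourceType": "virtualMachines", "locations": ["regionA"]},
    {"name": "Premium_LRS", "resourceType": "disks", "locations": ["regionA"]},
]
print(regions_offering(sample, r"^Standard_HB.*_v5$"))
```

An empty result would suggest the size is not exposed to the subscription in any region, which is a different answer from "not in the regions I checked".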
hpcwrangler.com
Ah well we can't have that! Take my jet instead. No jacuzzi I'm afraid but I'm sure you'll make do!
hpcwrangler.com
Shouldn't Microsoft be flying you there on a private helicopter? 😁
hpcwrangler.com
It's the deal structure that gets me.... looks like another infinite money glitch type affair.

Also I think it ends up giving Nvidia some ownership of AMD... assuming OpenAI actually holds the AMD shares and the whole thing isn't just a funding trick (which it probably is).
hpcwrangler.com
Thanks James. I really enjoyed working with you too!

This paper was a lot of fun and I think we’re not quite done with this topic… more to come with other CPUs

Oh and we tangle with the beast that is numerical stability too 😁
hpcwrangler.com
Also, given hyperscalers all run CPUs you and I can’t buy anyway… does it matter?

If Big Cloud Co gets CPUs that are 5% better than what you can actually buy, doesn’t that offset any hypervisor tax anyway?
hpcwrangler.com
You think the idea for this would be to replace hypervisors and allow multi-tenant workloads on bare metal? Sounds iffy.

What's the real overhead of a hypervisor these days?
hpcwrangler.com
That's brilliant. So well articulated!
hpcwrangler.com
This. In Spades.

Depends on how you're measuring cost effectiveness but you'll get students to understand what's going on a heck of a lot faster when they can see it and not have to imagine a bunch of VMs and what's in them.
Reposted by Hamza M
suhaibkhan.bsky.social
Beyond the water that cools the servers, data centers indirectly contribute to water use through the electricity generation needed to power their operations.

That indirect use often makes up 80%+ of the overall water use.

spectrum.ieee.org/ai-water-usage

#HPC #AI @spectrum.ieee.org
How Much Water Do AI Data Centers Really Consume?
There's a lot of confusion about how much water is used by AI. Get the real story on data centers' water consumption, and read about tech for doing better.
spectrum.ieee.org
hpcwrangler.com
We didn't actually order and build the cluster, but some basic maths before purchase showed the same thing when we put together our test rig.

cloudhpc.news/new-beowulf-...

That was without even spending 1k on a blade kickstarter bundle!
New Beowulf Cluster Time
We need a new test HPC cluster at HMx Labs. Time for some fun with hardware.
cloudhpc.news
hpcwrangler.com
Come on man. We need somewhere safe for all those poor destitute bits and FLOPs. Won't someone think of the bits! 😁