Tickers: NBIS, AMZN, GOOG
You know who isn’t spending $5/hour to run an H100? A company you’ve probably never heard of called Nebius Group (NBIS).
While everyone else is throwing money at GPUs like it’s 2021 all over again, Nebius is quietly running its racks at under $0.03 per GPU-hour with 90%+ utilization. I’ve toured their facilities. I’ve double-checked the numbers. It’s real, and it’s borderline ridiculous.
I spent the better part of last month wading through server rooms in three countries trying to understand what makes Nebius tick, and I’ve finally figured it out. It’s not just about the machines—it’s about the margins.
When you’re scaling GPU infrastructure in the AI gold rush, pennies become billions, and Nebius has found a way to shave those pennies better than anyone else.
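To make "pennies become billions" concrete, here’s a back-of-the-envelope sketch. The fleet sizes, the one-cent edge, and the 90% utilization figure are my own illustrative assumptions, not Nebius disclosures:

```python
# Back-of-the-envelope: what a per-GPU-hour cost edge is worth across a fleet.
# Fleet sizes, the $0.01 edge, and 90% utilization are illustrative
# assumptions, not Nebius disclosures.

HOURS_PER_YEAR = 24 * 365  # 8,760

def annual_value(fleet_gpus: int, edge_per_gpu_hour: float,
                 utilization: float = 0.90) -> float:
    """Dollars per year a per-GPU-hour cost edge is worth across a fleet."""
    return fleet_gpus * HOURS_PER_YEAR * utilization * edge_per_gpu_hour

for fleet in (10_000, 100_000, 500_000):
    print(f"{fleet:>7,} GPUs x $0.01/GPU-hr edge = "
          f"${annual_value(fleet, 0.01):>12,.0f}/yr")
# 10,000 GPUs  ->    $788,400/yr
# 100,000 GPUs ->  $7,884,000/yr
# 500,000 GPUs -> $39,420,000/yr
```

Stack dozens of those pennies on top of each other and hold the edge across a multi-year buildout, and you can see how the gap compounds toward ten figures.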
You wouldn’t believe the scene at their Serbian facility. While everyone’s fretting over getting their hands on H100s at any price, these folks have engineered an operation where each GPU costs them mere pocket change to run: under $0.025 per GPU-hour. That’s not a typo. Two and a half cents.
I had to double-check my notes when their CTO casually dropped this figure during our tour of their liquid-cooled racks, which are humming along with near-perfect utilization rates. It’s this kind of operational efficiency that translates directly to their aggressive pricing strategy.
The industry has been obsessing over CoreWeave’s $4.76/hour pricing for the H100 HGX, but Nebius is quietly offering the same chips at $3.15/hour for those willing to sign 12-month contracts.
My sources at two AI startups confirmed they’ve already jumped ship from AWS for this pricing alone. The math is simply too compelling to ignore when you’re burning through compute at AI-training scale.
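To see why that math is so compelling from the customer’s side, here’s a minimal sketch using the two published rates above. The 1,000-GPU cluster running around the clock for a year is my own hypothetical workload, not a real customer’s footprint:

```python
# Customer-side comparison of the two published H100 rates.
# The 1,000-GPU, always-on training cluster is a hypothetical workload.

GPUS = 1_000
HOURS_PER_YEAR = 24 * 365  # 8,760

coreweave_rate = 4.76  # $/H100-hour, on-demand
nebius_rate = 3.15     # $/H100-hour, 12-month contract

coreweave_annual = GPUS * HOURS_PER_YEAR * coreweave_rate
nebius_annual = GPUS * HOURS_PER_YEAR * nebius_rate

print(f"CoreWeave: ${coreweave_annual:>12,.0f}/yr")  # $41,697,600/yr
print(f"Nebius:    ${nebius_annual:>12,.0f}/yr")     # $27,594,000/yr
print(f"Savings:   ${coreweave_annual - nebius_annual:>12,.0f}/yr")
# Savings:    $14,103,600/yr
```

Fourteen million dollars a year on a single mid-sized cluster is the kind of line item that gets a burn-rate-conscious founder to sign a migration plan.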
This pricing advantage doesn’t materialize from thin air. What most investors haven’t grasped yet is the structural edge Nebius has engineered beneath the surface. They’ve poured nearly $2 billion in CapEx over three years building what amounts to the Formula 1 car of AI infrastructure—proprietary data centers with custom cooling systems that lower operational costs by up to 30% over five years.
One executive who requested anonymity told me, “We’re essentially running the hyperscaler playbook without hyperscaler overhead.”
This capital-intensive approach reveals a fascinating long-term strategy: Nebius’s ODM approach compresses gross margins to around 2% today, but it positions them brilliantly for the long game.
Even with public H100 pricing hovering as low as $2.00 per hour, against their 2.5-cent operating costs the margin potential becomes staggering once that initial investment is recouped.
An old hedge fund buddy of mine who’s been loading up on shares put it best: “They’re printing money at scale once the CapEx is absorbed.”
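That quote is easy to sanity-check. Here’s a rough payback sketch using the $2.00/hour price and 2.5-cent cost from above; the 60,000-GPU fleet is a placeholder of mine, since Nebius doesn’t break that number out, and real-world payback also depends on depreciation schedules and financing costs:

```python
# Provider-side payback sketch: how fast ~$2B of CapEx gets absorbed.
# Price, opex, and utilization come from figures cited in this piece;
# the fleet size is a hypothetical placeholder.

CAPEX = 2_000_000_000   # ~$2B cumulative buildout
PRICE = 2.00            # $/GPU-hour realized (the low-end figure above)
OPEX = 0.025            # $/GPU-hour operating cost
FLEET = 60_000          # hypothetical GPU count
UTILIZATION = 0.90      # in line with the "90%+" utilization above

HOURS_PER_YEAR = 24 * 365
gross_profit = FLEET * HOURS_PER_YEAR * UTILIZATION * (PRICE - OPEX)
payback_years = CAPEX / gross_profit

print(f"Gross profit: ${gross_profit:,.0f}/yr")    # $934,254,000/yr
print(f"Payback:      {payback_years:.1f} years")  # 2.1 years
```

Under those assumptions the buildout pays for itself in roughly two years, which is what "printing money at scale once the CapEx is absorbed" looks like in a spreadsheet.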
And it’s not just about cheap electricity in Serbia (though that certainly helps). The real moat is their thermal engineering.
Their custom-designed, liquid-cooled racks—developed by a team poached from a major European physics lab—let them run GPUs denser and cooler than anyone else. No throttling. No wasted space. Just pure, relentless compute per square foot. Every engineering decision compounds the cost advantages across the stack.
To be clear, not everything is champagne and cash flow. Their approach comes with long payback periods and enough geopolitical risk to make certain LPs sweat.
Their reliance on gray-market GPU sourcing also raises eyebrows in an era of chip nationalism and tightening export controls. But these risks feel more like calculated gambles than reckless moves. Nebius knows exactly what tradeoffs it’s making, and why.
Which brings us to what might be their boldest move yet: sovereign AI. While AWS and Google (GOOG) fight over Fortune 500s, Nebius is carving out entire countries.
Their in-country deployment model and containerized LLM stack—complete with post-quantum encryption and localization for Cyrillic and Balkan languages—is winning government-adjacent clients at an impressive clip. In regions where digital sovereignty is non-negotiable, Nebius is delivering tailor-made infrastructure with just enough red-teaming to get through procurement.
It’s a classic land grab: go where the big guys won’t, lock in first-mover advantage, and scale the margins later.
This regional playbook couldn’t come at a better time. The GPU-as-a-service market is expected to hit $100 billion in the coming years, and while investors chase the usual suspects, Nebius is quietly building the rails underneath it all.
Their trajectory feels uncannily similar to AWS in the mid-2000s, back when the cloud felt like a niche bet, not the juggernaut driving 60% of Amazon’s (AMZN) operating income.
Having watched multiple infrastructure cycles unfold over the decades, I’ve learned one thing: the people who build the rails win. And while most headlines are chasing the trains, Nebius has been laying steel in the dark.