Our Take
Huang is right to focus on Vera, but supply constraints and the fact that his biggest customers are already winning with their own chips make this a defensive move, not a growth story.
Why it matters
Inference is where Nvidia's GPU monopoly is weakest. Google, Amazon, and Microsoft are collectively expected to spend more than $700 billion on AI infrastructure this year, and they are pouring part of that into custom silicon. Vera is Nvidia's answer to a threat that is already materializing.
Do this week
Infrastructure teams: audit your inference workload costs on current GPU stacks and model Vera pricing against in-house custom silicon business cases before committing to next-gen GPU contracts.
Nvidia positions Vera as a second-front business
During its Q1 earnings call, CEO Jensen Huang told analysts that Nvidia's new Vera central processors unlock access to a $200 billion market outside the $1 trillion the company has already forecast from its Blackwell and Rubin GPU lineup through 2027. Huang expects Vera chip revenue to reach $20 billion by the end of this fiscal year, positioning it as Nvidia's second-largest sales contributor (per company statements on the call).
The timing is strategic. Nvidia reported Q1 revenue of $81.62 billion, beating analyst estimates of $78.86 billion, and guided Q2 at $91 billion, well above Wall Street's $86.84 billion forecast (per earnings release). Despite the beats, Nvidia shares fell 1.6% in extended trading. Investors are focused on a different question: whether AI spending will hold up through 2027 and 2028, especially as the narrative shifts toward inference workloads.
Inference is where Nvidia's dominance is most exposed
Training large models remains firmly Nvidia territory. Inference, generating answers at scale in real time, is where custom chips are making their case. Google, Amazon, and Microsoft are collectively expected to pour more than $700 billion into AI infrastructure this year, up from around $400 billion in 2025. Simultaneously, they are funding custom silicon: Google's TPU line, Amazon's Trainium, and others are designed to serve models cheaply and at scale.
Vera targets exactly this workload. It was developed in part using technology licensed from Groq, an inference-focused startup, in a deal reportedly worth around $17 billion. The full Vera Rubin platform, combining the Vera CPU with Rubin GPUs, is set to launch later this year.
But Huang was candid about a structural problem: supply. "My sense is that we'll be supply-constrained through the entire life of Vera Rubin," he said on the call. Nvidia disclosed that its supply commitments rose to $119 billion in Q1, up from $95.2 billion the previous quarter, a 25% jump that reflects both confidence in demand and anxiety about a global memory chip crunch. The company also announced an $80 billion share repurchase and raised its quarterly dividend to 25 cents from 1 cent, signaling financial confidence even as supply tightens.
Treat Vera as a defensive hedge, not a lock
Huang pointed to a growing sub-segment of AI-specific cloud customers whose spend is now roughly equal to the hyperscalers' but growing faster quarter-over-quarter. His argument is that this segment will let Nvidia grow faster than hyperscale capex. But that case rests on Vera winning against entrenched custom silicon from vendors who control both the infrastructure and the workloads.
Supply constraints are a real risk. If Nvidia cannot ship Vera volumes at competitive cost and speed, the window for winning inference customers closes faster than the company can pivot. Customers evaluating Vera should demand multi-year supply guarantees and pricing locks, and should baseline Vera's end-to-end cost against their own custom silicon or existing GPU amortization before committing.
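For teams that want to start on that baseline, the comparison can be roughed out in a few lines of Python. This is only a sketch under stated assumptions: the `InferenceOption` class, its fields, and every figure in the example are hypothetical placeholders, not Nvidia, Vera, or hyperscaler pricing. It folds amortized hardware cost and energy into a cost per million tokens served, then ranks the options. Real models would also need networking, staffing, software, and supply-risk terms.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass
class InferenceOption:
    """One candidate inference stack. All fields are illustrative inputs."""
    name: str
    capex_per_unit: float        # purchase price per accelerator, USD
    amortization_years: float    # straight-line depreciation period
    power_kw: float              # average draw per accelerator, kW
    energy_cost_per_kwh: float   # USD per kWh
    tokens_per_second: float     # sustained throughput per accelerator
    utilization: float           # fraction of time serving traffic, 0 to 1

    def hourly_cost(self) -> float:
        # Amortized hardware cost per hour plus energy cost per hour.
        capex_per_hour = self.capex_per_unit / (self.amortization_years * HOURS_PER_YEAR)
        energy_per_hour = self.power_kw * self.energy_cost_per_kwh
        return capex_per_hour + energy_per_hour

    def cost_per_million_tokens(self) -> float:
        # Tokens actually served per hour, after accounting for idle time.
        tokens_per_hour = self.tokens_per_second * 3600 * self.utilization
        return self.hourly_cost() / tokens_per_hour * 1_000_000


def rank_options(options):
    """Return the options sorted from cheapest to most expensive per token."""
    return sorted(options, key=lambda o: o.cost_per_million_tokens())


if __name__ == "__main__":
    # Placeholder numbers only; substitute measured values from your own audit.
    candidates = [
        InferenceOption("existing GPU fleet", 30_000, 4, 0.7, 0.08, 2_500, 0.55),
        InferenceOption("next-gen vendor quote", 45_000, 4, 1.0, 0.08, 5_000, 0.55),
        InferenceOption("in-house custom silicon", 20_000, 4, 0.5, 0.08, 2_000, 0.55),
    ]
    for option in rank_options(candidates):
        print(f"{option.name}: ${option.cost_per_million_tokens():.3f} per 1M tokens")
```

The ranking is highly sensitive to the utilization and throughput inputs, so it is worth rerunning with pessimistic values, and with a longer amortization period for hardware that is already paid down, before treating any ordering as a basis for a contract decision.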