AI's true cost emerges as vendors end subsidies, banks hunt cheaper options

The subsidy era ends, real prices arrive

Tech vendors including Anthropic, OpenAI, and Microsoft are abandoning flat per-seat licensing and switching to per-token pricing. This shift exposes what was always true but hidden: running AI at scale is expensive, and vendors were absorbing losses to drive adoption.

Under the old model, companies paid a fixed rate regardless of usage. This encouraged what insiders call "tokenmaxxing": consuming as many tokens as possible, because additional usage added nothing to the bill. Under the new model, cost scales with consumption. The effect has been immediate: some enterprise customers have exhausted a full year's AI budget in months. PNC, for one, is now building its own AI infrastructure in-house to reduce its per-token exposure.

The pricing opacity is beginning to clear. A startup called Ornn is publishing token price indices derived from executed GPU transactions. A second firm, IFX, is building derivatives products on top of that data, allowing traders to hedge compute costs like any commodity.

The math is tightening on vendors and creditors alike

The AI buildout is on track to consume roughly $2 trillion in capital, according to analyst Azeem Azhar. Early spending came from the hyperscalers' own cash, but the money is increasingly borrowed from banks and private credit. That debt carries a demand: profitability.

Azhar estimates that to earn a 25% return on the $8 billion in annual operating costs for 1 gigawatt of AI capacity, vendors need to charge between $1.05 and $2.10 per GPU-hour. Ornn's recent data puts H100 pricing at $2.45 per GPU-hour, not far above the top of that range, which leaves vendors with thin or negative margins once overhead is included.
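The arithmetic behind a break-even figure like this can be sketched in a few lines. This is an illustration under stated assumptions, not Azhar's model: it assumes capacity is billed by the GPU-hour, reads "25% return" as revenue equal to 1.25 times operating cost, and uses a hypothetical fleet size and utilization rates that do not come from the source.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def required_price_per_gpu_hour(annual_cost, target_return, gpu_count, utilization):
    """Price per sold GPU-hour at which revenue equals cost * (1 + target_return).

    Assumption: "return" is a simple markup on annual operating cost.
    """
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    required_revenue = annual_cost * (1 + target_return)
    sold_gpu_hours = gpu_count * HOURS_PER_YEAR * utilization
    return required_revenue / sold_gpu_hours


def margin_over_required(market_price, required_price):
    """Fractional headroom of the market price over the required price."""
    return (market_price - required_price) / required_price


if __name__ == "__main__":
    # Hypothetical inputs: only the $8B cost, 25% return, and $2.45 price
    # appear in the article. Fleet size and utilization are made up.
    for utilization in (1.0, 0.5):
        price = required_price_per_gpu_hour(
            annual_cost=8e9,
            target_return=0.25,
            gpu_count=1_000_000,
            utilization=utilization,
        )
        headroom = margin_over_required(2.45, price)
        print(f"utilization={utilization:.0%} required=${price:.2f} headroom={headroom:+.1%}")
```

The takeaway is structural rather than numeric: the required price rises as utilization falls, so idle capacity quickly erodes whatever headroom the market price provides.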

This creates a two-way squeeze. Vendors cannot subsidize indefinitely without disappointing lenders. Customers paying real prices now have a rational incentive to defect to cheaper alternatives: open-source models, older but functional models, or internal infrastructure. Each defection shrinks the customer base vendors need to justify their capital raise.

Audit usage, plan alternatives, lock costs now

Banks and enterprises should treat token pricing as a variable cost that demands monthly governance, not a fixed IT line item. Reporting by Nathan Place in American Banker documents three cost-reduction strategies companies are already pursuing: migrating to open-source models, reverting to older models that still deliver value, and building proprietary inference layers.

If your organization has not yet migrated to per-token billing, the transition is coming. If you have, you need a monthly consumption audit against budget and a written decision on whether to invest in in-house inference, negotiate multi-year fixed rates, or shift workloads to cheaper model architectures. The subsidy window is closed. The only variable left is how quickly you adapt.
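A monthly consumption audit of the kind described above can start as a simple run-rate check. The sketch below is one possible shape, not a prescribed method: the function name, the result fields, and the straight-line projection from average monthly spend are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class AuditResult:
    spent_to_date: float
    budget_used_fraction: float
    projected_annual_spend: float
    months_until_exhausted: Optional[float]  # None when nothing has been spent


def audit_token_budget(annual_budget: float, monthly_spend: Sequence[float]) -> AuditResult:
    """Compare per-token spend so far against the annual budget.

    Assumption: future months are projected at the average of the months
    observed so far, a deliberately naive run-rate estimate.
    """
    if annual_budget <= 0:
        raise ValueError("annual_budget must be positive")
    if not monthly_spend:
        raise ValueError("monthly_spend must contain at least one month")

    spent = sum(monthly_spend)
    average = spent / len(monthly_spend)
    remaining = max(annual_budget - spent, 0.0)

    return AuditResult(
        spent_to_date=spent,
        budget_used_fraction=spent / annual_budget,
        projected_annual_spend=average * 12,
        months_until_exhausted=(remaining / average) if average > 0 else None,
    )


if __name__ == "__main__":
    # Hypothetical figures: a $1.2M annual budget and three months of invoices.
    result = audit_token_budget(1_200_000, [250_000, 310_000, 340_000])
    print(result)
```

If the projected annual spend exceeds the budget, that is the trigger for the written decision: invest in in-house inference, negotiate fixed rates, or shift workloads to cheaper models.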
