Our Take
Google is splitting its premium AI Ultra tier into a cheaper developer-focused plan and a full-featured plan, but the company's move from daily limits to compute-based quotas matters more than the pricing.
Why it matters
The shift from daily prompt caps to compute-metered limits directly affects how developers budget for usage. Agent and video tasks now count against actual compute rather than a flat daily counter, so pricing signals are clearer but spending is harder to predict.
Do this week
Developers: audit your current Gemini app and Google Antigravity usage patterns against the new five-hour compute-reset window before moving to the $100 Ultra plan, so you know whether a workload that fit under the old daily caps would exhaust the new compute quota.
Google introduces two AI Ultra tiers and shifts to compute-based metering
Google announced a $100/month AI Ultra plan alongside a price cut on its existing $250/month tier, now $200/month (company-reported). The $100 tier targets developers and technical leads with a 5X usage limit on Gemini and Google Antigravity; the $200 tier provides 20X the usage limit of the Pro plan, plus access to experimental tools like Gemini Spark (a personal agent for Google products) and Project Genie (a generative world-building prototype).
Both tiers include Gemini Omni (a multimodal model handling video, image and text inputs), Gemini 3.5 Flash (optimized for agents and coding), and 20TB cloud storage. YouTube Premium is bundled into the $200 plan and YouTube Premium Lite is rolling out to AI Pro subscribers in select countries.
The billing change is substantive: Google is moving away from daily prompt limits to a compute-based model that refreshes every five hours. Complex video or coding prompts consume more of a user's weekly quota than simple text queries. If a subscriber exhausts limits on flagship models, Google shifts them to faster, smaller models automatically. Pay-as-you-go credits are available for overage on Google Antigravity, Google Flow, and the Gemini app.
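The metering behavior described above can be sketched in a few lines. This is a minimal illustration, not Google's implementation: the class name `ComputeQuota`, the unit budget, the per-request costs, and the choice of a fixed window that restarts on the first request after five hours are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# The five-hour refresh comes from the announcement; everything else here
# (unit budget, per-request costs, fixed-window behavior) is hypothetical.
WINDOW = timedelta(hours=5)


@dataclass
class ComputeQuota:
    budget: float            # hypothetical compute units available per window
    window_start: datetime   # when the current window began
    used: float = 0.0        # units consumed in the current window

    def _refresh(self, now: datetime) -> None:
        # Start a fresh window once five hours have elapsed.
        if now - self.window_start >= WINDOW:
            self.window_start = now
            self.used = 0.0

    def route(self, cost: float, now: datetime) -> str:
        """Return which model tier would serve a request costing `cost` units."""
        self._refresh(now)
        if self.used + cost <= self.budget:
            self.used += cost
            return "flagship"
        # Quota exhausted: fall back to a smaller, faster model.
        # This sketch does not charge fallback requests against the budget.
        return "fallback"
```

With a hypothetical budget of 100 units, a 60-unit video prompt followed by a second 60-unit prompt in the same window would send the second one to the fallback model, while a 30-unit prompt would still fit.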
AI Inbox in Gmail (triage and draft replies) is rolling out to AI Plus and Pro subscribers after an Ultra-only launch. Daily Brief in the Gemini app (a morning digest pulling from Gmail, Calendar and chat) launches this week for all three tiers in the U.S.
Compute metering removes pricing opacity but creates new friction
The prior daily-limit model was easy to understand and impossible to overrun: you hit a ceiling and stopped. Compute metering is fairer (a five-minute video analysis shouldn't cost the same as a three-word query), but it is also less predictable. A developer can no longer simply count prompts; they must estimate compute cost by feature type and input length.
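To make that estimation concrete, here is a rough sketch of a per-request cost estimator. Google has not published unit costs in this announcement, so the weights in `HYPOTHETICAL_WEIGHTS`, the feature names, and the input units are placeholders to be replaced with figures measured from your own usage.

```python
# Hypothetical weights: compute units charged per unit of input.
# None of these numbers come from Google; calibrate them from real usage.
HYPOTHETICAL_WEIGHTS = {
    "text": 1.0,     # per 1,000 input characters
    "coding": 4.0,   # per 1,000 input characters
    "video": 50.0,   # per minute of video
}


def estimate_cost(feature: str, input_size: float) -> float:
    """Estimate compute units for one request of the given feature type."""
    if feature not in HYPOTHETICAL_WEIGHTS:
        raise ValueError(f"unknown feature type: {feature}")
    return HYPOTHETICAL_WEIGHTS[feature] * input_size


def estimate_workload(requests: list[tuple[str, float]]) -> float:
    """Sum estimated compute units over (feature, input_size) pairs."""
    return sum(estimate_cost(feature, size) for feature, size in requests)
```

Under these placeholder weights, a five-minute video analysis would cost 250 units while a short text query would cost a small fraction of one unit, which is the asymmetry a plain prompt count hides.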
The $100 entry point is significant: it's $1,200/year for developers who want Gemini 3.5 Flash and five-hour quota resets without committing to $200/month. However, the automatic fallback to smaller models when a quota runs out may surprise users mid-session and disrupt workflows that depend on consistent model behavior.
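The annual figures are simple to check. The snippet below uses only the monthly prices reported above; the dictionary keys are labels invented for the example.

```python
# Monthly prices as reported in the announcement (USD).
MONTHLY_PRICE = {
    "ultra_100": 100,
    "ultra_200": 200,
    "ultra_250_previous": 250,
}


def annual_cost(monthly: int) -> int:
    """Twelve months at the given monthly price."""
    return monthly * 12


annual = {plan: annual_cost(price) for plan, price in MONTHLY_PRICE.items()}
# Yearly saving from choosing the $100 plan over the $200 plan.
gap_between_tiers = annual["ultra_200"] - annual["ultra_100"]
# Yearly saving from the price cut on the top tier.
saving_from_price_cut = annual["ultra_250_previous"] - annual["ultra_200"]
```

The $100 plan comes to $1,200 a year against $2,400 for the $200 plan, and the price cut on the top tier is worth $600 a year to existing subscribers.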
Adding Gemini Spark and Project Genie to the $200 tier signals that Google wants agentic features (personal agents that take actions on behalf of users) in subscribers' hands before competitors ship production agents. Both remain experimental, and availability is staged: Spark goes to trusted testers this week and to U.S. Ultra subscribers in beta next week, while Genie is rolling out to eligible $200 subscribers globally.
Audit quota reset timing and model fallback behavior
If you run production workflows on Gemini or Google Antigravity, map your current token/prompt counts to the new compute-based quotas before the five-hour window takes effect. Test the automatic downgrade to smaller models in non-critical environments to confirm it does not break latency or quality expectations for your use case.
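One way to run that audit is to replay a usage log against five-hour windows and flag the windows that would have exceeded a budget. The sketch below assumes you already have timestamped events with an estimated compute cost each; the function name, the fixed windows aligned to an `origin` timestamp, and the budget value are all illustrative assumptions, since the announcement does not say how windows are aligned or how large the quota is.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(hours=5)  # refresh interval from the announcement


def windows_over_budget(
    events: list[tuple[datetime, float]],
    budget: float,
    origin: datetime,
) -> dict[datetime, float]:
    """Map window start -> total estimated cost, for windows above `budget`.

    `events` holds (timestamp, estimated_cost) pairs; costs and budget are
    in whatever hypothetical compute unit you chose for your estimates.
    """
    totals: dict[datetime, float] = defaultdict(float)
    for timestamp, cost in events:
        # Integer index of the five-hour window containing this event.
        index = (timestamp - origin) // WINDOW
        totals[origin + index * WINDOW] += cost
    return {
        start: total
        for start, total in sorted(totals.items())
        if total > budget
    }
```

Any window this returns is a period in which requests would likely have been shifted to a smaller model, so those are the sessions worth replaying in a non-critical environment first.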
For teams considering the $100 plan, confirm that a smaller model such as Gemini 3.5 Flash meets your performance requirements; you will land on a smaller model once your quota is exhausted, regardless of preference. If your workflows require consistent large-model access, the $200 tier, with its higher usage limit, is the more stable option.