Google processes 3.2 quadrillion tokens monthly as Gemini 3.5 Flash ships

Google ships Gemini 3.5 Flash with 4x faster output than rivals

Google released Gemini 3.5 Flash today, positioning it as a frontier-capable model that runs four times faster than competing models while costing less than half the price of other frontier options (per Google). The company claims the model outperforms Gemini 3.1 Pro across most benchmarks, with substantial gains in coding tasks and on GDPVal, a metric capturing real-world economically valuable workflows.

Token processing inside Google has accelerated sharply. In March, Google processed 500 billion tokens daily across internal AI developer tools. That figure has since doubled multiple times: the company now processes over 3 trillion tokens per day internally, at least six times the March volume. Across all surfaces (products, APIs, enterprise customers), Google reports processing 3.2 quadrillion tokens monthly, a nearly sevenfold increase from the prior year's 480 trillion.
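The growth multiples above can be checked with a few lines of arithmetic. This is a minimal sketch using only the figures reported in this article; the 30-day month, and the idea that internal tokens count toward the all-surfaces total, are assumptions made for illustration.

```python
# Figures as reported in the article. "Over 3 trillion" is treated as a lower bound.
internal_daily_march = 500e9    # tokens/day, internal AI developer tools (March)
internal_daily_now = 3e12       # tokens/day, internal (current)
monthly_now = 3.2e15            # tokens/month, all surfaces (current)
monthly_prior_year = 480e12     # tokens/month, all surfaces (prior year)

DAYS_PER_MONTH = 30             # assumption, used only for the share estimate

internal_growth = internal_daily_now / internal_daily_march
overall_growth = monthly_now / monthly_prior_year
# Assumes internal usage is included in the all-surfaces total (not confirmed).
internal_share = internal_daily_now * DAYS_PER_MONTH / monthly_now

print(f"internal growth since March: {internal_growth:.1f}x")
print(f"year-over-year growth, all surfaces: {overall_growth:.2f}x")
print(f"internal share of monthly total: {internal_share:.1%}")
```

On these inputs, internal volume is up about 6x since March, the all-surfaces total is up about 6.7x year over year, and internal use would account for under 3% of the monthly total.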

The Gemini app has grown to 900 million monthly active users, more than double the 400 million reported a year ago. Daily requests within the app have grown over seven times in that period. Search-based features including AI Overviews (2.5 billion monthly active users) and AI Mode (1 billion monthly active users) continue to expand user engagement.

Google also announced new conversational features rolling out across products: Ask YouTube (in U.S. testing this summer), voice-powered Docs Live for Google Docs subscribers (summer), and expanded voice capabilities coming to Gmail and Keep. The company detailed infrastructure investments including its eighth-generation TPU chips (TPU 8t for training, 8i for inference) and distributed training across more than 1 million TPUs globally. Google's annual capex is expected to reach $180 to $190 billion this year, up from $31 billion in 2022.

On AI safety, Google announced that OpenAI, Kakao, and ElevenLabs have adopted SynthID, Google's invisible watermark for AI-generated media. SynthID has watermarked over 100 billion images and videos to date. Content Credentials verification, which shows whether content originated from a camera or from AI, will roll out to Search and Chrome.

Token volume growth does not prove product value

The 3.2-quadrillion-token headline is a supply-side metric: it measures what Google processes, not what customers extract from it. High token volume can signal strong adoption or inefficient consumption. Without customer-reported cost savings or latency improvements, the figure remains a throughput vanity metric.

Gemini 3.5 Flash's speed advantage is at least verifiable: four times faster output than frontier competitors is measurable and matters for latency-sensitive applications. The price-to-capability ratio, if confirmed in independent benchmarks, could reduce total cost of ownership for high-volume users. Yet Google's own internal token consumption spike (500 billion to 3+ trillion daily in a matter of months) suggests that cheaper, faster inference may simply increase token burn rather than reduce spend. Enterprises already blowing through annual budgets by May need to know whether Flash solves budget constraints or enables overspending.
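The budget question comes down to simple arithmetic: a lower per-token price only cuts spend if token volume grows by less than the inverse of the price ratio. The sketch below illustrates that break-even. The function names are hypothetical, the 0.5 price ratio is a stand-in for "less than half the price", and the 6x volume multiplier borrows Google's internal growth figure purely as an illustration, not as evidence that pricing caused that growth.

```python
def breakeven_volume_multiplier(price_ratio: float) -> float:
    """Volume growth factor at which total spend is unchanged.

    price_ratio is new price / old price per token (0.5 means half price).
    """
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    return 1.0 / price_ratio


def spend_ratio(price_ratio: float, volume_multiplier: float) -> float:
    """New total spend divided by old total spend."""
    return price_ratio * volume_multiplier


# Illustrative inputs only (see the assumptions stated above).
print(breakeven_volume_multiplier(0.5))  # 2.0: volume may double before spend rises
print(spend_ratio(0.5, 6.0))             # 3.0: half price at 6x volume triples spend
```

In other words, a team that halves its per-token price but lets volume grow sixfold ends up spending three times as much as before.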

The new conversational products (Ask YouTube, voice Docs) address real friction: video navigation and hands-free document creation are defensible use cases. But no adoption metrics for these early-stage features are yet public. User growth in the Gemini app and Search features shows engagement, not necessarily the economic value of agentic workflows.

Verify Flash costs before committing infrastructure

Test Gemini 3.5 Flash on your highest-volume inference workloads in a staging environment, and measure tokens-per-task and end-to-end latency before shifting production traffic. Google's claim is a price less than half that of other frontier options, not a 50% cut to your bill, so compare it against your actual token consumption under realistic load; per-token savings can be offset if the model uses more tokens per task or if usage grows. If you are currently processing 1 trillion tokens daily, model the $1 billion annual savings claim with your own pricing contract (list rates vary by tier) and confirm it with a cost-per-output-token calculation before budget approval.
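One way to structure that comparison is a small per-task cost model fed with staging measurements. This is a minimal sketch, not a pricing tool: the class and function names are hypothetical, and every price and token count below is a made-up placeholder to be replaced with your contract rates and measured averages.

```python
from dataclasses import dataclass


@dataclass
class ModelProfile:
    """Per-model inputs: prices from your contract, token counts from staging."""
    name: str
    input_price_per_mtok: float    # USD per 1M input tokens
    output_price_per_mtok: float   # USD per 1M output tokens
    avg_input_tokens: float        # measured mean input tokens per task
    avg_output_tokens: float       # measured mean output tokens per task

    def cost_per_task(self) -> float:
        """USD cost of one task at the measured token counts."""
        return (
            self.avg_input_tokens * self.input_price_per_mtok
            + self.avg_output_tokens * self.output_price_per_mtok
        ) / 1_000_000


def annual_cost(profile: ModelProfile, tasks_per_day: float, days: int = 365) -> float:
    """Projected yearly USD spend at a constant daily task volume."""
    return profile.cost_per_task() * tasks_per_day * days


def savings_fraction(current: ModelProfile, candidate: ModelProfile) -> float:
    """Fraction of per-task spend saved by switching (negative means it costs more)."""
    return 1.0 - candidate.cost_per_task() / current.cost_per_task()


# Hypothetical numbers for illustration only; none are real list prices.
# The candidate is half the price per token but emits more output tokens per task.
current = ModelProfile("current-model", 2.00, 8.00, 1500, 500)
candidate = ModelProfile("candidate-model", 1.00, 4.00, 1500, 650)

print(f"{current.name}: ${current.cost_per_task():.4f} per task")
print(f"{candidate.name}: ${candidate.cost_per_task():.4f} per task")
print(f"per-task savings: {savings_fraction(current, candidate):.1%}")
saved = annual_cost(current, 1_000_000) - annual_cost(candidate, 1_000_000)
print(f"annual savings at 1M tasks/day: ${saved:,.0f}")
```

With these placeholder inputs, halving the per-token price yields roughly 41% savings per task rather than 50%, because the candidate produces more output tokens for the same work. That gap is what the staging measurement is meant to expose.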
