Our Take
GPT-5-class reasoning in voice could matter, but OpenAI provides no benchmarks comparing conversation quality or latency to existing solutions.
Why it matters
Customer service and education platforms need voice AI that can reason through complex requests, not just respond to simple queries. The 70-language translation capability addresses a clear enterprise gap.
Do this week
API teams: Test GPT-Realtime-2 against your current voice solution to measure actual reasoning improvements before committing to token-based pricing.
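One way to structure that comparison is a small side-by-side harness that runs the same prompts through both backends and records reply plus wall-clock latency per turn. The sketch below is illustrative only: the `respond` callables are stubs standing in for your actual integrations (the Realtime API itself is streaming and WebSocket-based, so a real harness would wrap that), and none of the names come from OpenAI's SDK.

```python
import time
from dataclasses import dataclass
from statistics import mean

@dataclass
class TurnResult:
    reply: str
    latency_s: float

def run_eval(respond, prompts):
    """Send each prompt to a backend's respond() callable,
    recording the reply text and wall-clock latency per turn."""
    results = []
    for prompt in prompts:
        start = time.perf_counter()
        reply = respond(prompt)
        results.append(TurnResult(reply, time.perf_counter() - start))
    return results

def summarize(name, results):
    return {
        "backend": name,
        "turns": len(results),
        "mean_latency_s": round(mean(r.latency_s for r in results), 3),
    }

# Stub backends standing in for your current solution and the candidate model.
current = lambda p: f"[current] {p}"
candidate = lambda p: f"[candidate] {p}"

# Use multi-step, contextual prompts -- the reasoning claim is exactly
# what simple call-and-response tests will not exercise.
prompts = [
    "Cancel my order and re-ship it to my new address",
    "Explain the refund policy, then apply it to my last order",
]

report = [
    summarize("current", run_eval(current, prompts)),
    summarize("candidate", run_eval(candidate, prompts)),
]
print(report)
```

Latency is easy to score automatically; reply quality on multi-step prompts will still need human or rubric-based review alongside these numbers.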
OpenAI ships three voice models with GPT-5 reasoning
OpenAI released GPT-Realtime-2, a voice model that includes GPT-5-class reasoning for handling complex conversational requests. The company positions this as an upgrade from GPT-Realtime-1.5, though it provided no specific performance comparisons.
Two additional models launched alongside: GPT-Realtime-Translate offers real-time translation across 70+ input languages and 13 output languages, while GPT-Realtime-Whisper provides live speech-to-text transcription during ongoing conversations.
All three models integrate into OpenAI's Realtime API. Translation and transcription services bill by the minute, while GPT-Realtime-2 uses token-based pricing (per company announcement).
Voice AI moves beyond call-and-response patterns
Current voice systems typically handle simple queries but struggle with multi-step reasoning or contextual follow-ups. OpenAI claims its new models can "listen, reason, translate, transcribe, and take action as a conversation unfolds" rather than just responding to individual prompts.
Support for 70+ input languages addresses a significant enterprise need. Most existing real-time translation services cover fewer languages or require separate transcription steps, creating latency issues for live conversations.
Customer service represents the obvious application, but educational platforms and creator tools could benefit from voice interfaces that maintain context across longer interactions.
Evaluate reasoning claims against your use cases
OpenAI built guardrails against spam and fraud applications, automatically halting conversations that violate its guidelines. However, the company shared no specifics about false positive rates or appeal processes.
The lack of independent benchmarks makes it difficult to assess actual improvements over GPT-Realtime-1.5 or competing voice models. Teams should test reasoning capabilities directly against their specific conversation patterns rather than assuming GPT-5-class performance translates to voice interactions.
Token-based billing for the reasoning model could create cost unpredictability compared to minute-based alternatives. Plan pilot tests that track both token consumption and conversation quality before broader deployment.
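The cost-unpredictability point can be made concrete with a toy comparison. All rates below are invented for illustration (OpenAI's announcement names the billing models but this sketch does not use its actual prices); substitute published rates before drawing any conclusions.

```python
# Hypothetical rates for illustration only -- not OpenAI's published prices.
TOKEN_RATE_PER_1K = 0.02   # $ per 1K tokens (assumed)
MINUTE_RATE = 0.06         # $ per minute (assumed)

def token_cost(tokens: int) -> float:
    return tokens / 1000 * TOKEN_RATE_PER_1K

def minute_cost(minutes: float) -> float:
    return minutes * MINUTE_RATE

def compare(conversations):
    """Each conversation is (tokens_used, duration_minutes).
    Returns total cost under each billing model."""
    token_billed = sum(token_cost(t) for t, _ in conversations)
    minute_billed = sum(minute_cost(m) for _, m in conversations)
    return {
        "token_billed": round(token_billed, 2),
        "minute_billed": round(minute_billed, 2),
    }

# Two 5-minute calls: one chatty (12K tokens), one quiet (3K tokens).
# Same duration, 4x token spread -- the variance that per-minute
# billing hides and per-token billing exposes.
pilot = [(12_000, 5), (3_000, 5)]
print(compare(pilot))  # → {'token_billed': 0.3, 'minute_billed': 0.6}
```

Logging both numbers per conversation during the pilot makes the break-even point, and the variance around it, visible before broader deployment.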