OpenAI has introduced three new real-time audio models: GPT-Realtime-2 for conversations between humans and AI, GPT-Realtime Translate for live translation of human conversations, and GPT-Realtime Whisper for low-latency transcription. GPT-Realtime-2 offers real-time reasoning with GPT-5, a larger 128,000-token context window, adjustable reasoning levels, and parallel tool calls.
According to OpenAI, the models sound more natural, handle interruptions better, and score higher on benchmarks. GPT-Realtime Translate supports more than 70 input languages and 13 output languages. GPT-Realtime Whisper is designed for use cases such as meetings, streaming, customer support, healthcare, and retail. Prices remain unchanged, and data is stored in the EU under EU Data Residency. The one exception is tracing, meaning the tracking of API calls, which is not yet compliant.