OpenAI has new voice models that reason, translate, and transcribe as you speak
OpenAI has released three new realtime voice models that it says will “unlock a new class of voice apps for developers.” Each model specializes in a different task.
Developers can build new app experiences with OpenAI’s 3 new voice models
The three new OpenAI voice models cover reasoning, translation, and transcription.
Here’s what the company announced today:
- GPT‑Realtime‑2, our first voice model with GPT‑5‑class reasoning that can handle harder requests and carry the conversation forward naturally.
- GPT‑Realtime‑Translate, a new live translation model that translates speech from 70+ input languages into 13 output languages while keeping pace with the speaker.
- GPT‑Realtime‑Whisper, a new streaming speech-to-text model that transcribes speech live as the speaker talks.
OpenAI explains in more detail what’s new with GPT-Realtime-2, its GPT-5-class voice model with reasoning:
GPT‑Realtime‑2 is built for live voice interactions where the model keeps the conversation moving while it reasons through a request, calls tools, handles corrections or interruptions, and responds in a way that fits the moment.
Meanwhile, the new translation voice model supports “70 input languages and 13 output languages,” the company says.
Lastly, there’s the realtime transcription model:
GPT‑Realtime‑Whisper is a new streaming transcription model built for low-latency speech-to-text. It transcribes audio as people speak, so live products can feel faster, more responsive, and more natural—from captions that appear in the moment, to meeting notes that keep up with the conversation.
All three new voice models are included in OpenAI’s Realtime API, the company says, with this pricing:
- GPT‑Realtime‑2 is priced at $32 / 1M audio input tokens ($0.40 for cached input tokens) and $64 / 1M audio output tokens.
- GPT‑Realtime‑Translate is priced at $0.034 per minute.
- GPT‑Realtime‑Whisper is priced at $0.017 per minute.
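To put those rates in context, here is a rough cost sketch in Python. The rates are the ones quoted above; the token counts and session length are hypothetical values chosen for illustration, and the code assumes the $0.40 cached-input rate is also per 1M tokens, which the pricing line implies but does not state outright.

```python
# Rates quoted in the article, in USD.
REALTIME_2_INPUT_PER_M = 32.00          # per 1M audio input tokens
REALTIME_2_CACHED_INPUT_PER_M = 0.40    # assumed to be per 1M cached input tokens
REALTIME_2_OUTPUT_PER_M = 64.00         # per 1M audio output tokens
TRANSLATE_PER_MINUTE = 0.034            # GPT-Realtime-Translate
WHISPER_PER_MINUTE = 0.017              # GPT-Realtime-Whisper


def realtime_2_cost(input_tokens: int, cached_input_tokens: int, output_tokens: int) -> float:
    """Estimate GPT-Realtime-2 cost from token counts (token-based pricing)."""
    total = (
        input_tokens * REALTIME_2_INPUT_PER_M
        + cached_input_tokens * REALTIME_2_CACHED_INPUT_PER_M
        + output_tokens * REALTIME_2_OUTPUT_PER_M
    )
    return total / 1_000_000


def per_minute_cost(minutes: float, rate_per_minute: float) -> float:
    """Estimate cost for the models that are billed per minute of audio."""
    return minutes * rate_per_minute


if __name__ == "__main__":
    # Hypothetical session: the token counts below are illustrative only.
    session = realtime_2_cost(
        input_tokens=100_000, cached_input_tokens=50_000, output_tokens=40_000
    )
    print(f"GPT-Realtime-2 session: ${session:.2f}")

    # Hypothetical one-hour meeting, translated and transcribed live.
    print(f"Translate, 60 min: ${per_minute_cost(60, TRANSLATE_PER_MINUTE):.2f}")
    print(f"Whisper, 60 min: ${per_minute_cost(60, WHISPER_PER_MINUTE):.2f}")
```

Under these assumptions, an hour of live translation costs about $2.04 and an hour of live transcription about $1.02, while the GPT-Realtime-2 figure depends entirely on how many audio tokens a conversation actually consumes.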
You can test the new realtime voice models in the Playground. OpenAI also provides a Codex prompt for adding GPT‑Realtime‑2 to an existing app or creating a new app with it.
You can learn more about OpenAI’s latest voice models and how companies are already using the new technology in the company’s announcement.