At its DevDay 2024 developer event, OpenAI launched its new Realtime API, which supports natural speech-to-speech conversations with six preset voices.
During the developer event in San Francisco, the AI company revealed four major updates to its application programming interface (API) offering. The most important is the Realtime API, which supports natural speech-to-speech conversations using six preset voices and is now available to developers in public beta.
Realtime API
Of the four new API features OpenAI unveiled at DevDay 2024, the Realtime API is the headliner. It supports natural speech-to-speech conversations based on six predefined voices, letting developers build features similar to ChatGPT's Advanced Voice Mode into their own applications. The API is available in public beta.
According to OpenAI, the Realtime API simplifies the process of building voice assistants. Previously, developers had to chain separate models for speech recognition, text processing, and text-to-speech; with the new API, a single model handles the entire pipeline in one call.
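For illustration, the sketch below shows what a minimal Realtime API session could look like over a WebSocket connection in Python. It assumes the beta endpoint, headers, and event names OpenAI documented at launch (the gpt-4o-realtime-preview-2024-10-01 model, the OpenAI-Beta: realtime=v1 header, and the response.audio.delta and response.done server events); check the current documentation before relying on these details.

# Minimal sketch of a Realtime API session over WebSocket.
# Endpoint, headers, and event names reflect the public beta as documented
# at launch and may have changed since.
import asyncio
import json
import os

import websockets  # on websockets >= 14, use additional_headers instead of extra_headers

URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2024-10-01"
HEADERS = {
    "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
    "OpenAI-Beta": "realtime=v1",
}

async def main():
    async with websockets.connect(URL, extra_headers=HEADERS) as ws:
        # Pick one of the six preset voices and request an audio response.
        await ws.send(json.dumps({
            "type": "session.update",
            "session": {"voice": "alloy", "modalities": ["audio", "text"]},
        }))
        await ws.send(json.dumps({
            "type": "response.create",
            "response": {"instructions": "Greet the user out loud."},
        }))
        async for message in ws:
            event = json.loads(message)
            if event["type"] == "response.audio.delta":
                pass  # base64-encoded audio chunk in event["delta"]; stream it to a player
            elif event["type"] == "response.done":
                break

asyncio.run(main())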
OpenAI also introduced two features that help developers balance performance and cost when building AI applications. Model Distillation lets developers fine-tune smaller, cheaper models on the outputs of more capable ones, while Prompt Caching speeds up inference and lowers costs by reusing recently processed prompts. Finally, Vision Fine-tuning lets developers customize GPT-4o with images as well as text.
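As a hedged sketch of what Vision Fine-tuning training data could look like: at launch, the fine-tuning endpoint accepted chat-formatted JSONL in which user messages may contain image parts alongside text. The URL, prompts, and labels below are placeholders for illustration, not taken from OpenAI's documentation.

# One training example for vision fine-tuning of GPT-4o, written as a line
# of chat-format JSONL. Image URL and labels are hypothetical placeholders.
import json

example = {
    "messages": [
        {"role": "system", "content": "You classify traffic signs."},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What sign is shown?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/sign.jpg"}},
            ],
        },
        {"role": "assistant", "content": "A stop sign."},
    ]
}

# A full training file would contain one such JSON object per line.
with open("train.jsonl", "w") as f:
    f.write(json.dumps(example) + "\n")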
Developer event
The annual OpenAI developer event was held in San Francisco on Monday and is invitation-only. OpenAI CEO Sam Altman chose a global approach this year: instead of a single conference, the one-day event travels to multiple locations, with London (October 30) and Singapore (November 21) up next.