OpenAI unveiled GPT-4o, a new flagship model it claimed was much faster than the previous version, with improved capabilities across text, video and audio.
In a livestream, CTO Mira Murati stated GPT-4o brings GPT-4-level intelligence to the company’s free users for the first time, but noted paying customers would still have up to five times more capacity.
CEO Sam Altman stated the model is “natively multimodal”, capable of generating content or understanding commands in voice, text and images.
The o in GPT-4o stands for omni.
Murati said the latest version will also have the memory capability to learn from previous conversations, which makes it similar to an AI assistant.
“This is the first time that we are really making a huge step forward when it comes to the ease of use,” she explained. “This interaction becomes much more natural and far, far easier”.
She noted GPT-4o is twice as fast as GPT-4 Turbo at half the cost. It can understand 50 different languages while providing improved speed and quality.
It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which the company stated is similar to human response time in a conversation.
The GPT-4o features will be rolling out over the coming weeks.