On May 13, 2024, OpenAI introduced its latest flagship model, ChatGPT, where the "o" stands for "omni," highlighting its revolutionary native multimodal capabilities. This model can process and generate content by combining text, audio, and images, marking a significant step towards more natural and intuitive human-computer interaction.
GPT-4o delivers GPT-4 level intelligence but is significantly faster and more cost-effective. In the API, the model is 50% cheaper than GPT-4 Turbo and offers increased rate limits. More importantly, OpenAI has begun rolling out access to GPT-4o, including its text and vision capabilities, to users of the free ChatGPT tier, while ChatGPT Plus subscribers receive significantly higher message limits.
A key feature of GPT-4o is its ability to seamlessly process audio inputs and outputs. In the new voice mode, users can converse with ChatGPT as naturally as with a human: the model responds to audio inputs almost instantaneously (averaging 320 milliseconds, comparable to human reaction time), can perceive emotional nuances in the user's voice and generate voice in various emotional styles, and even laugh or sing. The model can also translate languages in real-time and understand when it's interrupted.
The visual capabilities of GPT-4o are also impressive. Users can upload images, screenshots, documents with text and graphs, and the model can analyze them, answer questions about the content, or even assist with tasks depicted in the picture. For example, it can help solve a math problem from a photo or explain code in a screenshot.
OpenAI also announced a new desktop ChatGPT application for macOS, allowing for easy integration of AI into the computer workflow, including the ability to ask questions via voice or screenshots. A Windows version is also planned.
Safety remains a priority for OpenAI. GPT-4o was developed using the latest techniques to mitigate risks and has undergone thorough testing. The audio capabilities of the new voice mode will be rolled out gradually, starting with alpha testing for a limited number of ChatGPT Plus users in the coming weeks.
The launch of ChatGPT opens new horizons for developers and users, making advanced AI technologies more accessible and interactive than ever before.