WEB DESK – On Monday, OpenAI announced GPT-4o, a new AI model capable of realistic voice conversations and of interacting across text and images. The release is aimed at keeping OpenAI ahead in the race to dominate emerging AI technology.
Advanced Voice Capabilities
GPT-4o’s new audio features enable users to hold real-time, natural conversations with ChatGPT, including interrupting it while it is speaking. OpenAI researchers demonstrated these capabilities at a livestreamed event, showing the model reading bedtime stories in a range of voices, emotional styles, and tones.
Multimodal Interactions
In another demonstration, GPT-4o used its vision capabilities to solve a math equation written on a sheet of paper. The model also performed real-time language translation.
Enhanced User Experience
Paid users of GPT-4o will have greater capacity limits than the company’s free users, according to OpenAI Chief Technology Officer Mira Murati. The higher limits are intended to deliver a smoother, more responsive experience.
Rapid Growth and Future Plans
ChatGPT, launched in late 2022, set what was then a record as the fastest application to reach 100 million monthly active users. After fluctuating for months, its web traffic is now returning to its May 2023 peak, according to analytics firm Similarweb. OpenAI plans to enhance ChatGPT’s capabilities further, including giving it search engine-like access to up-to-date web information, which remains a challenge for the current iteration.
Market Reaction
OpenAI’s announcement came a day before Alphabet’s (GOOGL.O) annual Google developer conference, where new AI features are expected. Following the news, Alphabet shares pared an earlier decline of nearly 3% to trade down about 0.3%, while Microsoft shares were little changed.