In Short:
In testing ChatGPT’s new Advanced Voice Mode, the experience was fun but inconsistent. Sometimes it entertained, even switching to Spanish unexpectedly; other times it produced unhelpful responses. OpenAI initially restricted access due to safety concerns but later resumed a limited rollout. Key features, like screen sharing and AI singing, are not yet available, with a wider release planned for fall.
In an exploration of OpenAI’s Advanced Voice Mode, I engaged with the feature as an ambient AI companion while writing this article. Midway through the experience, ChatGPT surprised me by switching to Spanish, a spontaneous interaction that underscored the feature’s playful nature.
My experience with ChatGPT’s new audio capabilities during the alpha testing phase revealed a mix of engaging and frustrating interactions. It is important to note that the functionality available in the alpha test represents only a portion of what OpenAI showcased when it launched the GPT-4o model earlier this year. Notably, the vision capability promised in the initial demonstration has been pushed to a future release, and the revamped Sky voice, which drew comparisons to actress Scarlett Johansson, has been removed from this mode.
The current iteration of the Advanced Voice Mode appears to echo the release of the original text-based ChatGPT in late 2022. While some interactions lead to uninspired responses or clichéd AI statements, others exhibit a fluidity that distinguishes it from conventional virtual assistants like Siri or Alexa. This dynamic creates a conversational experience that feels enjoyable enough to share with family during holiday gatherings.
Access to the Advanced Voice Mode was briefly granted to a selection of reporters following its initial announcement, but it was suspended the next day due to safety concerns. After a two-month hiatus, OpenAI soft-launched the feature to a limited audience and released the system card for GPT-4o, detailing the company’s red teaming efforts, identified safety risks, and mitigation strategies.
Plans for Full Rollout
OpenAI has made Advanced Voice Mode available to select ChatGPT Plus users since late July, but the alpha group remains limited in size. The company aims to extend access to all subscribers later this fall. A representative from OpenAI, Niko Felix, did not provide further specifics regarding the timeline for broader rollout.
While screen and video sharing features were part of the original demo, they are not currently included in the alpha version. OpenAI plans to add these capabilities, although a timeframe for their release has not been established.
Upon approval for the Advanced Voice Mode, ChatGPT Plus subscribers will receive an email notification. Users can toggle between Standard and Advanced modes in the app once the voice feature is enabled. I conducted my testing on both an iPhone and a Galaxy Fold device.
Initial Impressions of the Advanced Voice Mode
During my initial hour with the Advanced Voice Mode, I discovered a newfound tendency to interrupt ChatGPT mid-response. The ability to cut the AI off at any point, while hardly polite by the standards of human conversation, introduces a level of interactivity that significantly enhances the user experience.
Some early adopters may find the current version of Advanced Voice Mode disappointing due to its increased restrictions compared to the original demonstrations. Although generative AI singing featured prominently in the launch presentations, complete with serenading lullabies and vocal harmonizations, those capabilities are absent from the current alpha version.