OpenAI says it needs more time to prepare the “Voice Mode” for its latest AI model.

“We had planned to start rolling this out in alpha to a small group of ChatGPT Plus users in late June, but need one more month to reach our bar to launch,” the artificial intelligence (AI) company said in a post on Twitter Tuesday (June 25).

“For example, we’re improving the model’s ability to detect and refuse certain content. We’re also working on improving the user experience and preparing our infrastructure to scale to millions while maintaining real-time responses.”

The company plans to test the feature with a small group of users before a wider rollout this fall, pending safety and reliability checks. OpenAI announced GPT-4o, a model that can carry out realistic voice conversations, in May.

As noted here at the time, research by PYMNTS Intelligence has shown that the use of voice assistants has been steadily rising, with millions of people around the world depending on the technology for various tasks, such as setting reminders or controlling smart home devices.

That growth has led tech giants such as Amazon, Apple and Google to invest heavily in the creation of more advanced voice assistant technologies.

OpenAI aims to capitalize on this trend with GPT-4o by offering greater ease of use and faster processing. The company claims the new model can hold real-time conversations and respond to user queries and requests with no noticeable delay.

Until now, the ChatGPT experience has primarily been a typical chat driven by typed prompts, Muddu Sudhakar, co-founder and CEO of generative AI company Aisera, said in an interview with PYMNTS.

“Then OpenAI added more capabilities, like images and voices,” he said. “But they were not seamless. Well, now they are. It looks like there has been a complete revamping of the ChatGPT form, where multimodal capabilities are native. This represents a major leap. It makes the user experience more natural and effective — like interacting with a human.”

Last year’s PYMNTS Intelligence report “How Consumers Want to Live in the Voice Economy” found that consumers said they prefer voice technology to typing or using touchscreens because of its speed, ease of use and convenience, with some also viewing it as more secure.