VoiceTech is finally starting to come into its own.
And it’s all thanks to the increasingly multi-modal capabilities of generative artificial intelligence (AI).
This, as a drumbeat of recent AI-powered VoiceTech announcements underscores that voice, the oldest form of engagement, is now becoming more feasible.
Amazon announced updates to its Alexa smart speaker last week (Sept. 20) to make the conversational platform, which is linked to almost a billion devices, less clunky, less robotic and more capable of intuitive comprehension when responding to and acting on user queries and requests.
Also last week (Sept. 18), Apple released its latest iOS, which features generative AI integrations meant to improve the conversational flow with its own voice assistant, Siri, as well as support more languages for mixed-language conversations.
And on Monday (Sept. 25), the generative AI pioneer OpenAI introduced a suite of new capabilities for its ChatGPT that will roll out over the next few weeks and include the ability to interact with the AI model using conversational voice commands, a shift from prior text-based engagements.
Also on Monday, Spotify released a new AI feature that can translate podcasts into different languages using the host’s own voice. While not directly conversational, the innovative capability highlights the rapid advances being made within the realm of voice, thanks to generative AI.
But does this all mean that conversational commerce is finally on the cusp of becoming a reality?
It just might.
From enhanced customer experiences to increased efficiency and accessibility, the ability to interact with devices and applications through natural language could revolutionize the way users shop, seek information and manage their daily tasks.
Voice AI is not limited to a single device or platform. It can be integrated into various gadgets, such as smartphones, smart speakers and even cars. This interoperability allows consumers to enjoy a consistent experience across different devices, making their lives more convenient.
According to recent PYMNTS Intelligence in the report “How Consumers Want to Live in the Voice Economy,” 54% of consumers said they would prefer conversationally capable voice technology to typing or using a touchscreen because it is faster and more convenient.
Consumers can use voice commands to control smart home devices, play music, or even order groceries, all from a single platform. This versatility enhances user satisfaction and fosters brand loyalty — which could be why Amazon announced on Monday an investment of up to $4 billion into AI startup Anthropic.
As PYMNTS wrote earlier this month, most voice assistants today still struggle to move beyond a core set of applications like playing music, turning lights on and off, telling their owners the weather or stock prices, and relaying other information directly from a website. Today’s platforms have barely scratched the surface of voice-activated connected commerce.
Voice AI offers a strategic advantage for tech companies. As users regularly turn to voice-driven smart assistants for everyday tasks, companies can collect more user data, allowing them to drive more conversions and sales through relevant, real-time personalization.
The attraction is a two-way street. PYMNTS Intelligence found that 63% of consumers say they would use voice if it were as capable as a person, 58% would use voice because it is easier and more convenient than doing tasks manually and 54% would also use it because it is faster than typing or using a touchscreen.
PYMNTS Intelligence also found that fewer than one in 10 consumers (7.8%) see voice assistants as capable enough to make a meaningful difference in their lives today.
“[When voice AI first started,] consumers wanted to have those sci-fi-style, open-ended conversations [with robots], and many were disappointed because the tools at that time could only play music, set timers, tell you the weather,” Keyvan Mohajer, CEO and co-founder of conversational intelligence platform SoundHound, told PYMNTS, adding that these limited “utility occasions” had the unfortunate side effect of causing consumers to lower their expectations around voice AI applications.
Fortunately, a majority of consumers (60%) believe that voice assistants will one day be as conversationally reliable as another human.
And that, after all, is the whole value proposition.