Now, businesses are looking to transform the AI economy.
Keyvan Mohajer, CEO and co-founder of conversational intelligence platform SoundHound, told PYMNTS CEO Karen Webster in a recent discussion that advances in voice AI, particularly improved speech recognition and integrations with large language models (LLMs) like OpenAI’s GPT-4, are already unlocking increasingly valuable use cases with the potential to change, well, everything.
“[When voice AI first started] consumers wanted to have those sci-fi-style, open-ended conversations [with robots], and many were disappointed because the tools at that time could only play music, set timers, tell you the weather,” Mohajer said, adding that these limited “utility occasions” had the unfortunate side effect of causing consumers to lower their expectations around voice AI applications.
Now, as digital tools continue their exponential advance, voice AI solutions are finally able to provide consumers with the first promising glimpses of a truly open-ended conversational experience.
By drawing on data sets from multiple information domains through a speech recognition interface, modern voice AI platforms can surface personalized information and respond in real time to complex queries, potentially ushering in a new era of conversation-driven commerce for both consumers and businesses.
“We are trying to combine the best of both the generative AI large language models, which can answer a lot of questions and handle real-time information, with more utility-driven features, and then arbitrate [that combination],” Mohajer said. “The utility domains are good for [direct] assistance, while the LLMs can answer complex questions in real time, so when they are combined there’s a real value proposition.”
He added that conversational AI is a “never-ending project” that “will always get better” as speech recognition and analysis continue to improve.
PYMNTS research in “12 Months Of The ConnectedEconomy: 33,000 Consumers On Digital’s Role In Their Everyday Lives” finds that half of U.S. consumers have integrated at least one smart technology, including voice assistants, into their daily lives.
Mohajer emphasized that as behavioral norms and expectations evolve to the point where consumers commonly rely on connected voice assistants, the ability to handle complex conversations effectively and efficiently will be revolutionary.
He noted that while SoundHound’s voice AI platform could handle multifaceted queries as far back as 2015, it was limited to “power users.”
“Other voice assistants in effect trained consumers to lower their expectations and talk in short questions,” Mohajer said. “We always had to fight that wrong education to tell users they can open up and ask more complex things.”
Now, with the rise of LLMs, he added, users are becoming more comfortable posing longer, open-ended questions to voice AI tools. “It’s a force in the right direction.”
PYMNTS research has found that 79 million U.S. consumers already use voice assistants to help manage daily chores and their connected homes.
As smart homes get smarter, consumers are growing increasingly accustomed to speaking commands they would have typed not long ago.
“Something I’ve been bringing up recently is that there is a clear intersection between the demand for adoption from consumers and a technology’s readiness,” Mohajer said. “These intersections are [often] rare … but with conversational AI, we have that intersection now.”
He added that as digital tools continue to improve and natural-language-processing capabilities increase, voice AI is set to become the next “disruptive” computational breakthrough.
“What deep neural networks did to machine learning, these large language models will do to conversational AI — you can build experiences faster and with higher quality, [all while] using fewer engineers and resources. It will surpass people’s expectations of what can be done,” Mohajer said.
What’s most exciting, he noted, is the idea of a “collective AI interface” where users can interact across domains that feed into each other.
“Our plan is to go after everything,” Mohajer said, adding that when it comes to voice AI and commerce, the plan is to continue building a conversational AI architecture that is similar, at least strategically, to Amazon’s approach: the eCommerce giant started off with just books before eventually selling consumers everything under the sun, from A to Z, all in one place.