Voice-activated AI is certainly having its moment in the sun. What was a concept mostly in the realm of science fiction a decade ago is now widely predicted to be the “next big thing” in life-changing technology. Amazon’s Alexa, Google’s Allo, Apple’s Siri, Microsoft’s Cortana and Samsung’s Bixby (should it ever learn to speak English) all have different skill sets and passionate adherents, and collectively they generate a lot of buzz.
And some of that buzz, says Chirp CEO and Executive Director Moran Lerner, makes some sense, since technologists, innovators and investors see voice control as an important commerce and data enabler. That’s exciting, Lerner said, particularly when wedded to a consumer interface that is easy to use.
“If you look at voice, it is [a] beautiful technology, because it really touches humans on [a certain] level. It is something they are familiar with in all of their day-to-day interactions.”
That is the good news, Lerner says, and there is certainly the potential for the use and expansion of this technology.
The bad news? There is too much buzz for something that isn’t going to have as big a payout as everyone is hoping for.
“It is not going to be the be-all and end-all. It will have a place in our lives; it just won’t be as big as everyone is making [it] out to be.”
There is a place for voice AI — Lerner told Karen Webster in their Topic TBD chat — but it might end up being about 30 percent as big as the current predictions.
“A few really solid niche cases, like connected cars or connected homes, but it just won’t be a constant companion for consumers in their day-to-day experiences.”
And before it can even get that far, Lerner notes, Voice AI has a long, long way to go.
“This is a very fledgling field dominated by a handful of companies. We’ve probably seen over the last 20 or 30 years lots of massive efforts where millions have been spent building infrastructure for what is meant to be the holy grail of commerce … only for them to fade away to quickly be replaced by something else.”
Lerner is not a complete voice AI cynic. He doesn’t think its future is a slow fade-away. Rather, he thinks it won’t be the full-tilt technological revolution it is currently billed as, and that it will have a lot of trouble getting ignited.
It’s Not Interoperable
Mobile payments, Lerner noted, offer a pretty good road map of what can happen with the next “big things”: “it was tipped to be the biggest thing in commerce, and it is anything but at the moment. A lot of time and resource and effort and media attention will go to one or two things because the usual suspects [step] up — look at Apple and Apple Pay — and suddenly mobile was the revolution of our times.”
And with voice, he noted, once again the big players (the Apples, Amazons, Googles and Samsungs) are all lined up with slightly different variations on the theme.
Individually, they all have their strengths and weaknesses. Across the board, they need to close security loopholes and make the voice recognition work better so that Alexa isn’t accidentally taking orders from Domino’s commercials. But their overriding weakness is that they don’t work together: they are all little walled gardens.
“You [are] never going to get these tech giants to become interoperable with one another. They want to win their market share.”
They also want the data — the petabytes and petabytes of it that the era of the internet of things is about to supply.
And while that is a sensible goal for the brands, it does nothing for consumers, who are making not one but a host of product decisions when they start interacting with one system or another.
“Think about a customer who goes [to] a restaurant of the future and is served by a robot powered by Siri. I would hope that they will be compatible, because if they had an Android phone, they couldn’t eat. But today when we come to the IoT, the devices that talk to devices — that’s actually how they are being designed. Something that is compatible with one platform — they are being [designed] not to be compatible with other things that are competitive technologies.”
The big players are recruiting heavily and spending extraordinary sums of money to do it, trying to make sure as many devices and developers as possible are building for their platform. But the lack of interoperability is a limit, one that pushes them into the next big problem with Voice AI tech: it is not relevant enough.
The Human Center
The question that often gets missed is the one about how technologies affect our day-to-day lives, says Lerner — because no one loves the answer.
“Personally, at the moment I see them more as a gimmick, and we have a long history of gimmicks that are supposed to take the world by storm and never do.”
The tech, he notes, varies and is always going to. The real center of the interaction is the human being, who needs places to use the technology and a reason to use it.
“I’ve had iPhones for years, [but] I can’t remember the last time I asked Siri to do anything. I am used to my behavior to get an answer with Google, because typing it out isn’t hard,” Lerner said. “If you [are] trying to change people’s behavior, it isn’t just about giving them different technology. It is about giving them a reason to choose the behavior.”
This, he notes, is still an early design problem with voice AI: it is designed for some of the population but not yet all of it, because these are early days. The future will probably find places where voice works, such as connected cars or connected houses. But even then, he said, it is less about finding a silver-bullet technology that changes the world of consumer interaction than about adding another item to the portfolio of services that make customers more able to do the things they really want to do.
Some of that stuff is cutting edge, and thus very exciting, like Voice AI. Some of it is a bit older, like the Bluetooth and Wi-Fi that let all the devices interact.
“We created technologies that we [think simplified] people’s lives, but in many cases, they’ve overcomplicated what wasn’t hard before.”
Getting down to actual use cases, and the actual experience consumers want, will be a turning point for Voice AI. It will also be a tough one, because the final big hurdle is the one that many, many would-be world-changers have wrecked on in the past: scale.
The thing people often forget about payments and commerce, Lerner noted, is that they didn’t just start happening in society 25 years ago along with the internet, and that most of what makes the world go around rests on a bed of “really old legacy systems.”
Those systems have been the downfall of many an up-and-comer, Lerner noted, because what they build, no matter how innovative, is not compatible with the old tech that “[makes] up half of banking and financial services infrastructure.”
And voice, he said, faces a big challenge in that regard, since by his estimate only about one third of all payments infrastructure is able to support voice at all. That is a big hurdle to overcome.
“If you look at payments and mobile payments, there were a lot of new technologies that wanted to compete with Mastercard and Visa. They didn’t get very far.”
But it is the hurdle that must be overcome, because clearing it is the only way to develop the technology past these early phases, spread it everywhere and find where it works best. That won’t happen overnight. It will be a very iterative process, and that means firms are going to have to stop working at winning and start working at building interoperable systems that can get to scale so they can realistically meet consumer needs.
“You get a lot of tech firms coming out today that make a lot of bold claims that they are going to replace something else. What they need to be doing is developing technologies that will complement each other — they try to create an ecosystem that allows the users and customers to be able to use different things quite seamlessly.”
Lerner doesn’t believe it will happen overnight. And he thinks a lot of VCs are going to lose some money, because Voice isn’t going to be quite the mainstream game changer that the hype machine is hyperventilating about.
But Voice will have a future, as long as it consistently answers two questions, as far as Lerner is concerned.
“Who are we actually building these technologies for — and what is the easiest way for them to use it?”