Semafone’s Dos And Don’ts Of The Voice Challenge With Amazon Alexa


Hello there. Hey. Hi. How ya doin’? If you’re a human, these greetings likely mean the same thing to you — but if you’re a voice-activated assistant, things may not be so simple.

Yet that is exactly what developers should be aiming to teach these machines, according to Ben Rafferty, Global Solutions Director at Semafone, and it should be top of mind for any organization participating in the PYMNTS 2018 Voice Challenge with Amazon Alexa.

When different customers ask an artificial intelligence (AI)-based virtual assistant the same question, they’re likely to do so with different words, said Rafferty. Therefore, the technology must be able to make sense of different ways of phrasing the same request.
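In Alexa Skills Kit terms, that usually means mapping many sample utterances to a single intent in the skill’s interaction model. Below is a minimal sketch of that idea, written as a Python dictionary mirroring the interaction-model schema; the invocation name, intent name and phrasings are illustrative, not drawn from any actual Voice Challenge entry.

```python
# Illustrative sketch: one intent, many phrasings.
# The invocation name, intent name and sample utterances are hypothetical.
interaction_model = {
    "interactionModel": {
        "languageModel": {
            "invocationName": "acme coffee",
            "intents": [
                {
                    "name": "GetOpeningHoursIntent",
                    "slots": [],
                    "samples": [
                        # Different ways of phrasing the same request,
                        # all resolving to the same customer goal.
                        "what time do you open",
                        "when are you open",
                        "are you open right now",
                        "tell me your opening hours",
                        "opening hours please",
                    ],
                }
            ],
        }
    }
}
```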

Furthermore, the assistant must be intuitive enough to figure out what customers need or want from it — sometimes even before customers themselves are certain. This is especially important if it is a customer’s first time interacting with the company or even their first conversation with a voice-activated assistant.

Rafferty said the challenge boils down to this: “How can you serve the largest amount of people on their first attempt at interacting with that voice assistant?” That’s the question he explored during a recent interview with PYMNTS, with a focus on best practices for Voice Challenge participants.


Writing on the Blank Slate

When creating a voice skill for Amazon Alexa, Rafferty said developers must assume users know nothing about the skill, the concept behind it or how to use it. It’s even smart, he said, to assume they know nothing about Alexa itself: users arrive as a blank slate, coming to the conversation with nothing but the goal they wish to accomplish through the interaction.

A flexible AI system is key, Rafferty said, but flexibility must start with a company understanding its customers and what they most likely want or need when engaging with the voice-activated assistant. The most common questions should have the easiest-to-find answers. That is equally true for generalized AIs serving groups or specific customer segments and for those serving individuals, he said.

Whether the customer wants to find out the weather or make a payment, the path to completing that task should be as easy as possible, with a support mechanism kicking in to bump the user back on track if they deviate, said Rafferty — similar to bumpers keeping even the most errant bowling balls out of the gutter.

The trick is to do so without either coddling or overwhelming the user, while still informing them of the correct way to interact next time, Rafferty said. The user interface must offer informative, useful explanations of what is expected from the consumer when they are needed. These explanations should not be necessary often, he said, as the interface should be simple and intuitive enough to serve the majority of customers without correction.


A Mature Voice Skill

Intuitive AI uses data collected over time to serve more people on the first try. But in the early days, this data has not yet been generated, so how can the majority of customers best be served?

Rafferty said the solution should be intricately tuned from day one. Developers should not depend on incoming data to make the system function, but rather look forward to using that data in the future to make the system function even better.

Helmuth von Moltke the Elder, the 19th-century chief of the Prussian General Staff, famously observed, “No battle plan survives first contact with the enemy.” Rafferty said it’s the same for any business plan, especially one involving AI: No voice skill survives first contact with the customer.

In other words, no tech solution ever functions exactly the same in real life as it did during development.

The solution? “Test and test and test again,” Rafferty said. Test internally among employees. Test with beta groups drawn from the general public. This is how to fine-tune a machine before it reaches mass rollout.

If users stray from the intended path (and they will), Rafferty said the skill should ask them to clarify, “Did you mean A or B?” — where A is the most common customer goal and B is the second most common. He said this can help get customers back on the “happy path” without presenting so many options that they feel overwhelmed and give up.
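A minimal sketch of that clarification prompt, assuming the skill is built with the ASK SDK for Python; the two goals offered and the wording are illustrative, not taken from Semafone’s entry.

```python
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.utils import is_intent_name


class FallbackIntentHandler(AbstractRequestHandler):
    """Nudges off-path users back toward the two most common goals."""

    def can_handle(self, handler_input):
        # AMAZON.FallbackIntent fires when the utterance matches no other
        # intent in the skill's interaction model, i.e. the user has strayed.
        return is_intent_name("AMAZON.FallbackIntent")(handler_input)

    def handle(self, handler_input):
        # Offer A (the most common goal) and B (the second most common),
        # and nothing more, so the user isn't overwhelmed.
        prompt = ("Sorry, I didn't catch that. "
                  "Did you mean check your balance, or make a payment?")
        return (handler_input.response_builder
                .speak(prompt)
                .ask(prompt)  # the reprompt keeps the session open for a reply
                .response)
```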


The Role of Machine Learning

When it comes to situational commerce, Rafferty drew a distinction between AI and machine learning.

Say a customer has an Alexa system in her car. Every morning, she uses the system to order her favorite Starbucks drink and pick it up on the way to work.

AI, said Rafferty, is responsible for placing the order and completing the transaction in the moment — but it’s only thanks to machine learning that the system can make sense of the user’s location and habits to anticipate her next move and give her the option to order that drink without initiating the conversation herself.


Pro Tip

Without giving too much away about Semafone’s Voice Challenge entry, Rafferty said AI and machine learning don’t naturally apply to the skill, which involves the generation and verification of unique voice multifactor authentication keys.

However, knowing that others will be working on projects that have more to do with AI and machine learning, Rafferty volunteered the following advice: Don’t give users too many options.

He said voice demands a different skill set from users than the web or mobile does. On the web or on mobile, users can look away from the screen and then look back to see all the options still displayed in front of them. With voice, he said, most people can’t hold more than three to five options in their heads.

That’s why it’s important to put the most common answers up front, he said. It’s always possible to lead users further down the happy path by giving them the choice to say they want “something else”; but if they stray too far from the path, it may be too late to get them back.
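One way to read that advice in code: cap the spoken menu at a handful of options, listed in order of popularity, and always close with the “something else” escape hatch. The goals and wording below are illustrative, not taken from Semafone’s entry.

```python
# Illustrative sketch: list the most common goals first, cap the spoken menu
# at three items, and always leave a "something else" escape hatch.
TOP_GOALS = ["check your balance", "make a payment", "hear recent transactions"]


def build_menu_prompt(goals, max_spoken=3):
    """Build a voice prompt offering only the most common goals."""
    spoken = goals[:max_spoken]
    if len(spoken) == 1:
        menu = spoken[0]
    else:
        menu = ", ".join(spoken[:-1]) + ", or " + spoken[-1]
    return "You can " + menu + ". Or just say, something else."


print(build_menu_prompt(TOP_GOALS))
# You can check your balance, make a payment, or hear recent transactions.
# Or just say, something else.
```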