Making #VoiceFirst More Than A Hashtag

For the past decade, brand experiences have moved from a PC-first mindset to a mobile-first mindset.

While this certainly took some time as people mentally transitioned into a new and much smaller website format, there’s another shift that’s not just on the horizon, but already here.

With over 14 million Amazon Echo devices already in circulation and VoiceLabs’ state of the voice device industry report estimating an additional 24.5 million voice-first devices will be shipped in 2017, it may be safe to say that consumers are easily adapting to this new experience.

One company that’s helping to bridge the transition into voice-first strategies is Sayspring, whose platform allows for interactive prototype creation to determine whether or not there’s enough “there, there” to move to the coding phase.

In this week’s The Matchmaker Is In series, Karen Webster and Matchmakers author David Evans spoke with Sayspring’s Founder Mark Webster and CTO Scott Werner about how thinking voice-first will be as transformative to business and commerce as the graphical user interface was to the web.

Here’s an excerpt from their conversation:

KW: What is the friction your team is working to solve, and who is having the trouble?

MW: The world of voice is so new; it’s a brand-new interface that’s hitting the mainstream. Voice interface has been primarily relegated to IVR systems and … calling American Airlines. Since it’s so new, product design people (UX designers) need the tools to build great product experiences. There’s a lot to figure out in these interfaces, and until those teams can access tools to manage those workflows and design those experiences, we’re not going to really take full advantage of the potential that voice promises.

KW: What is the gap that you’re observing in the skills needed for voice-enabled devices? Who’s having the problem?

SW: I think we’re really early in the world of voice. I don’t even think we’re in the first inning; I think we’re still walking up to the park on this. A lot of effort has been put into making it as easy as possible to build a skill. There’s a parallel to Microsoft’s FrontPage in the mid-’90s and trying to remove as much friction as possible from the building process. Our belief is that great product experiences come from teams. Teams include designers and developers. How do you bring designers to the table and give them a voice in what’s possible here?

DE: So the idea that I come to you with my great idea and you tell me whether it’s stupid or not — assuming it’s not stupid, you work with me to perfect it?

MW: We are a set of tools to empower designers to go through that process. We remove all the technical barriers that are involved in working with the medium of voice to empower designers, UX professionals, product people, small business owners and marketers to be able to focus on what the experience is going to be and get to a point where you’re interacting with it the same way an end user ultimately will — at which point, you can make the decision to invest the time and resources to develop it.

KW: Is it sort of a tryout platform where you expose a set of tools and helpful interfaces that these designers can play around with to determine whether or not the experience is worth investing further — is that the goal?

MW: I think the best way to think about it is to make a parallel to the world of web or mobile. If we were to build a website, mobile app or feature for either, we would go through an ideation process and wireframing, design it in Sketch or Photoshop, and put it in prototyping software like InVision or Adobe XD. We would do a design team review and put it in front of other stakeholders in the company. If we were in client services, we would put it in front of the client and users and get feedback. We would iterate on that design until we were happy with what that overall experience … would be, and only then would we begin the development process.

DE: Have you developed any learning at this point from interacting with entities that are trying to develop apps/skills on what works and what doesn’t? Any principles that are emerging at this point?

MW: In the early days of voice, the most success came from having a well-defined user goal. What has become more challenging in voice is a discovery process. A lot of user expectation can go into the voice interface even though what’s powering that is the kind of stuff that we still struggle with on mobile in general, when it comes to figuring out what somebody wants. That’s one of the biggest [lessons] we’ve [learned]: the most successful learning experiences come from helping somebody get something done.

KW: How do you make decisions about the kinds of capabilities your software platform needs to have in order to be responsive to the needs of those who are seeking it out for this kind of help? What did you start with, and how are you evolving that?

SW: There are a few parts to that. One is [a] traditional good product process of getting something out in the world that you think provides value to early users, and just listening. One of the more challenging parts of our business, in something that’s still so early, is building functionality that users don’t yet know they need. One example of that is that in the world of voice, there’s a markup language called SSML (Speech Synthesis Markup Language) that adds pauses and changes the tone, inflection or pronunciation of different words. It lets you style a voice assistant’s voice to sound more natural or … more representative of your brand, and that’s a new world to a lot of people. Lots of people getting involved in voice design don’t necessarily know they need SSML or the power it can bring to the experience that they’re creating.
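
For readers who have not seen SSML, the kind of styling Werner describes might look roughly like the sketch below. It is an illustration only, not Sayspring's tooling: the brand name, its spoken alias and the function name `build_greeting` are hypothetical, and which SSML tags a given voice platform honors varies. The tags used (`speak`, `sub`, `break`, `prosody`) come from the W3C SSML specification.

```python
import xml.etree.ElementTree as ET


def build_greeting(brand: str, brand_spoken: str) -> str:
    """Return an SSML string greeting the user on behalf of a brand.

    `brand` and `brand_spoken` are illustrative inputs, not part of any
    real platform's API.
    """
    speak = ET.Element("speak")
    speak.text = "Welcome to "

    # <sub> swaps in a spoken alias so the brand name is pronounced as intended.
    sub = ET.SubElement(speak, "sub", alias=brand_spoken)
    sub.text = brand
    sub.tail = "."

    # <break> inserts a half-second pause before the next sentence.
    ET.SubElement(speak, "break", time="500ms")

    # <prosody> adjusts speaking rate and pitch for the text it wraps.
    prosody = ET.SubElement(speak, "prosody", rate="slow", pitch="low")
    prosody.text = "How can I help you today?"

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_greeting("ACME Co.", "Acme Company"))
```

A designer would typically hand a string like this to a voice assistant in place of plain text, then listen to the result and adjust the pause length or pitch until the voice suits the brand.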

DE: What do you think getting into the first inning will look like, and when will it start?

SW: I think voice is hard to wrap your arms around as far as how disruptive and transformative it will be in our daily lives. I don’t think voice is even on the measure of mobile. I think voice is on the measure of the introduction of the graphical user interface as far as how it’s going to change how we’re going to do things. Think of the experience of sitting in your car: cars are full of switches and buttons, and that goes away. The concept of having remotes goes away.

The way we talk to devices and what we expect from devices will all change in a big way. Look at the footprint of devices already: 20 million Amazon Alexas in U.S. households, and it’s only been around for two years (it took PCs 10 years to get to that point). From a user adoption standpoint, it’s sweeping through our daily lives. The challenge as product people, designers and developers is creating experiences that people want to engage with and that voice interfaces can make a big difference in.

DE: Is adding the visual part of voice-enabled devices going to be a big development?

SW: Before mobile devices and websites, a lot of things got done by just talking to other people. Business primarily is talking to other people. The screen is going to be interesting for a lot of different experiences. It’s going to complement a lot of experiences, but I don’t think a lack of screen has held them back in any way. Taking voice as both a voice-only interaction and a voice-powered interaction, those two modalities are going to push a lot forward.

KW: Is there a particular use case that you think is broad and deep enough to become the killer app that will get voice to the first inning and then unleash other things from there?

SW: One of the things we like to talk about is when you think about the transition from radio to TV, and how a lot of the first TV shows were people standing in front of a microphone — basically taking the experience of radio and putting a screen on it. For us to get to that killer app, we need to change our thinking. We need to think voice-first to get that killer app, and we’re not sure if that’s happened yet.

MW: When you look at mobile, Uber is a killer app for mobile. The first killer app we’re going to see for voice is just convenience. It’s going to change the way we interact with everything, and once that becomes part of our daily lives, we’ll see completely different use cases pop up.

KW: What did you learn from that experience and being part of Groupon that you brought into Sayspring as you’ve thought about launching your platform?

MW: One of the biggest lessons/influences … at Groupon [was when] we had the opportunity to work on their mobile app. They’re very rigorous about what the design process is for the mobile app. When you introduce new features, there’s a great design review process before development begins. Coming out of Groupon was very inspiring to think about best practices and how people will build voice applications. Pulling up a bit higher, Sidetour was an amazing business to be part of, but it was very different in a lot of ways. We were dealing with a lot of offline interactions with people.

KW: When you’re talking about voice-first and thinking mobile-first from your Groupon experience, what was the hardest shift for consumers to make in thinking mobile-first?

MW: The hardest part about Groupon was thinking about discovery. Groupon is ultimately based on buying things you didn’t know you wanted, and discovery is a lot different on mobile than it is on web. The challenge of creating great discovery experiences on the small screen of mobile is one of the biggest struggles for Groupon and every eCommerce marketer in general. The same struggles with mobile are happening with voice.

SW: The idea behind voice-first and mobile-first — the challenge is more on the product designer/manager and the people building these things. On the consumer side, when the people building the mobile and voice apps are thinking mobile- and voice-first, the consumer is having a better experience. They don’t have any challenges whatsoever in interacting [with] these things — that’s the goal when thinking mobile- or voice-first.