Designing Voice Experience: The future of UX

Voice Experience: What you should know

Who wouldn’t welcome “Jarvis” from “Iron Man” or “Samantha” from “Her” into their lives to help them get tasks done better and faster? Wouldn’t you?

I see you grinning. Oh, stop it!

Ever since Apple changed the way we interact with devices by popularizing the keyboard, mouse, and touch screen, technology has taken us on a ride we never imagined. Today, everyday tasks such as booking a flight, scheduling an Uber, or ordering food are hard to imagine without these technologies and devices. But wait, will it stop here? Of course not.

There is a new kid on the block: voice assistants such as Siri, Cortana, and Alexa. These assistants are going to change the way we experience technology by becoming an inevitable part of our lives. To truly assist a human, a voice assistant has to learn how to deliver information that helps a user get things done.

Amazon has democratized the process. They have opened Alexa up to the masses. Brands, agencies, students, and individuals all have an opportunity to break the box open, go under the hood, and use the tools Amazon has provided to create their own voice experiences.

While Alexa has found a home in Amazon Echo and Fire TV, Amazon is making a much larger play here. Developers can incorporate the brains behind the beauty into any OS, and even into apps, by using the Alexa Skills Kit. All of these applications help to strengthen the brain. This open sharing informs the system, helps the knowledge base grow, and expands Alexa’s reach and importance in our daily lives.

So how does a UX designer design a better voice experience that helps a user get precise information fast? Wait, what? How can you design a voice? Are you nuts? Oh well, you can.

The ABCs of Voice Experience

“A personalized voice interface helps users to initiate an automated service or process with natural language in conversations, delivering precise and lightning-fast information to help users achieve their goals”

That being said, designing for “tap to talk” is different from designing for “hands-free”. A seamless conversational experience across devices can be achieved only by understanding how users interact in multiple scenarios. To create a tangible experience, follow these basic guidelines whenever you design a voice experience.

1. Personalize the conversation.


This is one of the exciting things about voice. Where designers previously helped determine the tone of copy, color schemes, or image guidelines, they can now help shape a voice persona. A great persona in voice is not just about having a pretty voice. It’s also about connecting with the user on the other end. When we hear a voice, we unconsciously make a lot of assumptions about that person. These assumptions include how intelligent that person might be or which region or country they’re from.


Don’t:

User: “Clark, put sugar, flour, and salt on the shopping list.”

Clark: “Added.”

Do:

User: “Clark, put sugar, flour, and salt on the shopping list.”

Clark: “I’ve added sugar, flour, and salt to your shopping list.”

User: “Thanks, Clark!”
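
The difference between the two replies can be sketched in a few lines of Python. This is only an illustration, not how Alexa or any real assistant builds its responses; the function names `join_items` and `confirm_added` are made up for this example.

```python
def join_items(items):
    """Join items the way a person would say them aloud."""
    if not items:
        return ""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return f"{items[0]} and {items[1]}"
    return ", ".join(items[:-1]) + f", and {items[-1]}"


def confirm_added(items, list_name="shopping list"):
    """Build an explicit confirmation instead of a bare 'Added.'"""
    if not items:
        return f"I didn't catch anything to add to your {list_name}."
    return f"I've added {join_items(items)} to your {list_name}."


print(confirm_added(["sugar", "flour", "salt"]))
```

Repeating the items back costs a second or two, but it lets the user catch a misheard item without having to open the list.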

2. Take care of the cognitive load.


When we find a new product difficult to use, our brains default to recalling previous experiences and try to use the product the same way. This can lead to unexpected roadblocks. Because a voice product has no physical or visual guidance, it is important to help users at every point. Introduce the voice in scenarios such as helping, preventing, and warning, so that a user can engage quickly without hitting a roadblock.


Don’t:

User: “Hey Clark, where’s a good place to go for sushi?”

Clark: “Emperor Sushi is nearby.”

Do:

User: “Hey Clark, where’s a good place to go for sushi?”

Clark: “There are several sushi restaurants in the area – would you like to walk, or drive?”

User: “It’s a nice day, I’m down to walk.”

Clark: “OK, Emperor Sushi is a 2-minute walk from here.”

User: “Good to know. Thanks, Clark.”

3. Create an error strategy


Design for the cases where the assistant doesn’t hear anything or doesn’t understand what it heard. Identify wrong turns and misinterpretations so that the error strategy can be improved.

As Margaret Urban says, “Don’t ask a question if you won’t be able to understand the answer.”

Sometimes the voice user interface (VUI) just won’t understand the answer. What should it do then? Plan an appropriate response, such as asking the user to repeat their answer or speak more slowly, or even asking a new set of questions if the last answer was incomprehensible.


Don’t:

Clark: “Tell me your date of birth.”

User: “Ummm.”

Clark: “Error.”

Do:

Clark: “Tell me your date of birth.”

User: “Ummm.”

Clark: “Just tell me your date of birth using 2 digits for the month, 2 digits for the day, and 4 for the year.”
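
An error strategy like this one can be sketched as a list of prompts that get more detailed with each failed attempt. The Python below is a minimal sketch under a few assumptions: `listen` and `speak` stand in for whatever speech-to-text and text-to-speech calls a real platform provides, the fallback wording is invented, and the date check is deliberately simplistic.

```python
import re

# Hypothetical prompts: each retry gives the user more guidance.
PROMPTS = [
    "Tell me your date of birth.",
    "Just tell me your date of birth using 2 digits for the month, "
    "2 digits for the day, and 4 for the year.",
]
FALLBACK = "Sorry, I'm still not getting it. Let's try a different way."


def parse_date_of_birth(utterance):
    """Return (month, day, year) if the utterance holds 8 digits, else None."""
    digits = re.sub(r"\D", "", utterance)
    if len(digits) != 8:
        return None
    month, day, year = int(digits[:2]), int(digits[2:4]), int(digits[4:])
    if not (1 <= month <= 12 and 1 <= day <= 31):
        return None
    return month, day, year


def ask_date_of_birth(listen, speak):
    """Ask, then re-prompt with more detail after each failed attempt."""
    for prompt in PROMPTS:
        speak(prompt)
        result = parse_date_of_birth(listen())
        if result is not None:
            return result
    speak(FALLBACK)  # never end on a bare "Error."
    return None


# Simulated conversation: the user hesitates, then answers.
replies = iter(["Ummm.", "04 23 1990"])
spoken = []
print(ask_date_of_birth(lambda: next(replies), spoken.append))
```

Keeping the prompts in a list makes the escalation easy to review and test with the rest of the dialog copy.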

4. Show the status


Unlike a graphical interface, a voice interface has no display to show the status of the system. Like a story, a user interaction should have a beginning, a middle, and an end. The user can’t see where they are in the UI, so the voice assistant needs to tell them what functionality they are using. In a music app, for example, once a user is within a playlist, the assistant should offer up relevant information.


Don’t:

User: “Book a ride from my home to the office.”

Clark: “Booked and sent the details to your phone.”

User: “—–”

Do:

User: “Book a ride from my home to the office.”

Clark: “Hold on, I’m looking for nearby drivers. In the meantime, would you like to share your ride?”

User: “Yes, I’ll share my ride. How long will it take?”

Clark: “Approximately 5 mins. Booked and details will be sent to your phone shortly. Have a good day.”

5. Security and access

Voice recognition is tackling complex problems around identity and security. Researchers are looking at “converting voice to a barcode to identify every human.” Alibaba’s Tmall Genie comes with built-in “voice-based user identification” and is described as able to “learn from past interactions, helping it improve its ability to serve as your digital assistant.” Such voice recognition features should also make voice assistants usable in crowded, noisy places.


Don’t:

User: “Clark, order 200 dozen roses and deliver them to my home.”

Clark: “Ordering right away. Ordered and sent the details to your phone.”

Do:

User: “Clark, order 200 dozen roses and deliver them to my home.”

Clark: “I’ve noticed some unusual activity that doesn’t match your normal orders. For security purposes, please confirm by repeating the PIN sent to your phone.”

User: “—-”

Clark: “For security purposes, I’m canceling the order.”
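
The second dialog boils down to two steps: flag an order that looks out of character, then cancel it unless the user confirms. Here is a rough Python sketch of that flow. Everything in it is an assumption made for illustration: the function names, the factor-of-10 threshold, and the PIN check are invented, and a real assistant would rely on proper fraud detection and speaker verification.

```python
def is_unusual(quantity, past_quantities, factor=10):
    """Flag an order far larger than anything the user ordered before.

    The factor-of-10 threshold is an arbitrary assumption for illustration.
    """
    if not past_quantities:
        return True  # no history yet: be cautious and ask for confirmation
    return quantity > factor * max(past_quantities)


def handle_order(quantity, past_quantities, expected_pin, listen, speak):
    """Place the order, or cancel it if a flagged order is not confirmed."""
    if is_unusual(quantity, past_quantities):
        speak("I've noticed some unusual activity. For security purposes, "
              "please repeat the PIN sent to your phone.")
        if listen().strip() != expected_pin:
            speak("For security purposes, I'm canceling the order.")
            return False
    speak("Ordered. The details are on their way to your phone.")
    return True


# 200 dozen roses against a history of 1 or 2 dozen, and the user stays silent.
placed = handle_order(200, [1, 2], "4821", lambda: "", print)
print(placed)
```

Silence is treated the same as a wrong PIN here, which matches the dialog: when in doubt, the assistant cancels rather than spends the user’s money.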

Final Thoughts: It’s usually not the limitations of speech technology that are responsible for a horrible voice experience. More often, it’s designers not knowing how to design a voice experience that results in a less-than-desirable voice interface. Hopefully, this article will help you design for mere mortals.

Some principles apply to both bot design and voice experience; some don’t. I’ve chosen to focus on voice, although I hope this is useful whatever you’re working on. To help you learn more about the process and guidelines involved in designing a voice experience (VX), I have shared a list of resources. I can’t wait to see how designers will use them to start building “natural voice experiences.”

References: UX Guidelines:


References: Books:

  1. “Designing Voice User Interfaces: Principles of Conversational Experiences,” Cathy Pearl, O’Reilly Media

Prototyping and Testing:

  1. SaySpring, “Free prototyping software for voice”
  2. “Alexa Skill Testing Tool”
  3. Web Simulator, “Actions on Google”

Phrases and Dialects:

  1. “How Y’all, Youse and You Guys Talk,” New York Times
  2. “Defining the Voice Interface,” Amazon
  3. “Defining Utterances for the Alexa Skills Kit” (including the tool), Maker Musings