Everything started when we first got our Amazon Alexa. What can I say, my 6-year-old daughter was over the moon when she found out that there is someone always present to listen to her and answer her endless questions! She gravitated toward Alexa faster than toward any other technology, asking the same things a million times: “Alexa, tell me a joke!” “Sing a song.” “What is your name?”

What amazed me was how intuitive, fun, and convenient it was to have a conversation with Alexa. I became curious about conversation design and how to create a service without any visual elements.


Amazon’s Voice Design Guide shaped my design process. I started interviewing people around me to explore what they would like to accomplish at home using only their voice. The interviews revealed a wide range of needs, but one thing stood out: the majority of participants wanted an option to control their car by voice.

Based on the collected data, we drafted possible scenarios describing the purpose of the voice interaction and the ways users could interact with the Alexa skill.




30 minutes of Sarah’s life

It is a regular hectic morning for Sarah. She needs to fix breakfast, wake up her 5-year-old son, and help him get ready for school, all in about 30 minutes. While she puts the waffles into the toaster and sets the table, she asks Alexa for the temperature in the car. Alexa responds that it is 43°F inside. “It’s freezing in the car,” Sarah thinks, and walks into her son’s room to wake him up. Over breakfast, before leaving the house, she asks Alexa to set the car to 72°F. She wants to make sure the car is warm enough to drive her still-sleepy kid to school.


What will Sarah get from the skill that she cannot get another way?

  • Hands-free interaction with the car.
  • Emotional connection with the car through Alexa.
  • Remote voice control of the car.

Project Goal

Determine an optimal voice interaction model that eases the user’s communication with the car (currently Tesla only) by:

  • Facilitating a dialogue between the user and the car that follows a sequentially organized conversation model;
  • Providing immediate feedback on the user’s request while educating the user about the skill’s capabilities.


A user-centered, collaborative, and iterative design process guided this project. Collaboration with the developer was key to idea generation and decision making throughout. User research, including interviews, surveys, and usability tests, provided valuable input at every stage. In addition, we took a comparative analysis of competitive and analogous experiences and existing technologies into consideration while crafting our design.

Comparative analysis of current competitive and analogous experiences

While doing our desk research, we discovered that a number of car companies have developed Alexa skills as communication tools for their cars. For example, NissanConnect Services® with Amazon Alexa allows the user to remotely start the Nissan, unlock the doors, or flash the lights. Another example is MyFord Mobile, an Alexa skill developed by Ford: users with a Ford plug-in vehicle can use Alexa to get a range of vehicle information, such as the battery’s state of charge and the electric range.

The research also found that more than one Alexa skill is available to control a Tesla. However, only two of them seem to be officially recognized by Amazon: EV Car Alexa Skill and Mosaic workflow for Tesla. The interviews uncovered that users find communication with these skills impersonal. They also found it inconvenient to create yet another account in order to use the Mosaic workflow.

 All the data gathered through comparative analysis was compiled to reflect on during ideation


In-depth interviews followed by contextual inquiry

In addition to the comparative desk research, we found it important to return to our users, observe them conversing with Alexa, and then conduct follow-up interviews to understand what each person was doing and why. This approach gave us a better sense of how and why users do what they do; it uncovered their attitudes, needs, and concerns while conversing with a voice assistant, and what works and what doesn’t.


Research outcomes and conversations with the developer about building the skill led to the answer to the major business question: what would the first version of the skill be?

For the MVP we decided to extend the functionality of the Tesla mobile application. This decision helped us frame the timeline and make design decisions based on the available development resources.
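To make the idea concrete, here is a minimal sketch of the kind of glue such a skill needs: turning a vehicle-state payload into a spoken reply. The field names (`inside_temp` in Celsius, `is_climate_on`) follow the community-documented, unofficial Tesla Owner API and are assumptions for illustration, not an official contract; the helper itself only formats a payload into speech.

```python
def climate_reply(climate_state):
    """Turn a Tesla climate_state payload into a spoken Alexa reply.

    Field names mirror the community-documented, unofficial Tesla
    Owner API (assumed here, not guaranteed by Tesla).
    """
    # Tesla reports temperatures in Celsius; convert for US users.
    inside_f = round(climate_state["inside_temp"] * 9 / 5 + 32)
    if climate_state.get("is_climate_on"):
        return f"Climate control is on. It is {inside_f} degrees inside the car."
    return f"Climate control is off. It is {inside_f} degrees inside the car."

# Example payload shaped like the assumed API response (6.1 °C ≈ 43 °F,
# the temperature from Sarah's scenario):
sample = {"inside_temp": 6.1, "is_climate_on": False}
print(climate_reply(sample))
```

Keeping the formatting separate from the network call makes the spoken copy easy to test on its own, independent of the car's availability.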

Rapid prototyping and iterating the conversational flows

With the Tesla API capabilities and the user stories crafted from the interviews in mind, we created simple linear scripts and then user flows based on the user intents. The user flow charts allowed us to see the design from a high level and envision the possible directions the user might take.
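What such a flow chart encodes can be sketched as a tiny state table: each state maps the intents it accepts to the next state. The state and intent names below are hypothetical simplifications of our scripts, not the shipped model.

```python
# Hypothetical, simplified flow for the "warm up the car" scenario.
# Each state maps a recognized intent to the next state.
FLOW = {
    "start": {"CheckTempIntent": "report_temp"},
    "report_temp": {"SetTempIntent": "confirm_set", "StopIntent": "end"},
    "confirm_set": {"YesIntent": "end", "NoIntent": "report_temp"},
}

def next_state(state, intent):
    # An unrecognized intent keeps the user in place so Alexa can reprompt.
    return FLOW.get(state, {}).get(intent, state)

print(next_state("start", "CheckTempIntent"))  # the flow advances to report_temp
```

Seeing the flow as a table made it obvious where dead ends were: any state with only one outgoing intent is a place the user can get stuck.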

 User flow endless iterations


Role play or a usability test?

What did we do next? We took the scripts to the users to test. The goal of the usability test was to explore possible conversation flows and phrasings in a lo-fi way. Here is how we did it: I played the voice assistant, using a script to respond to user commands in the moment. This approach allowed us to adjust the responses on the fly. I should admit that these interactions would be different in a real context, but the usability test let us explore design directions and take a step toward understanding the problems of real users.

 A brief dialog between the user and Amazon Alexa



Key takeaways

Conducting lo-fi usability tests early in the process helped us get a sense of the conversational flows users might follow and how they might phrase their thoughts. We also realized that our design was lacking the navigation actions that are recommended to help users navigate the conversation.
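Alexa's built-in intents (AMAZON.HelpIntent, AMAZON.RepeatIntent, AMAZON.StopIntent, AMAZON.CancelIntent) are the standard way to supply those navigation actions. The dispatch below is a minimal sketch; the intent names are real Alexa built-ins, but the reply wording is invented for illustration.

```python
def handle_navigation(intent, last_reply=None):
    """Answer Alexa's built-in navigation intents.

    The intent names are real Alexa built-ins; the reply copy here
    is a hypothetical sketch, not our shipped wording.
    """
    if intent == "AMAZON.HelpIntent":
        return "You can ask for the temperature in the car, or set a new one."
    if intent == "AMAZON.RepeatIntent" and last_reply:
        return last_reply
    if intent in ("AMAZON.StopIntent", "AMAZON.CancelIntent"):
        return "Goodbye."
    return None  # not a navigation intent; fall through to the skill logic
```

Handling these in one place, before the skill's own logic, means every state of the conversation gets help, repeat, and stop for free.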

Finding the right voice prototyping tool that fits the project

Considering the usability test outcomes, we adjusted the scripts and iterated on the user flow diagrams. We still needed a way to map out the different ways people phrase their intent, so we decided to create a phrase map to identify the utterances (phrasings) the Amazon Alexa skill would recognize.

 Phrase map for the battery state intent to visualize the utterances

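A phrase map translates almost directly into the sample utterances of an Alexa interaction model. The slice below is a hypothetical reconstruction for the battery state intent, written as a Python dict mirroring the JSON structure (an intent name plus its samples) that the Alexa developer console expects; the exact utterances are illustrative.

```python
# Hypothetical slice of the interaction model for the battery intent;
# in the Alexa developer console this would be JSON with the same shape.
battery_state_intent = {
    "name": "BatteryStateIntent",
    "samples": [
        "what is the battery level",
        "how much charge is left",
        "is the car charged",
        "check the battery",
    ],
}

def matches(utterance):
    # Naive exact-match check; Alexa's real NLU generalizes far beyond
    # the listed samples, so this is only a sanity test of the map.
    return utterance.lower() in battery_state_intent["samples"]

print(matches("Check the battery"))  # True
```

The more distinct phrasings the map captures, the better Alexa can generalize to wordings nobody wrote down.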

Now that all the necessary documents, including the scripts, utterances, and user flows, were updated, it was time to take our lo-fi prototype to the next level. We explored the available voice prototyping tools to test the design in a naturalistic setting (real users’ homes).

Here are the options we explored most closely.

  • WOZ (Wizard of Oz) approach. Write a script that includes many possible options, select the appropriate option in the moment, and use the Mac’s Text to Speech to read it aloud.
  • SaySpring. Software for creating voice-driven experience prototypes with a drag-and-drop design tool.
  • Story Speaker plugin for Chrome. Software for creating talking, interactive stories without coding. Write a story in a Google Doc, push a button, and every Google Home device linked to the account can play it, or you can play it on your Mac.
  • Botsociety. Software for previewing and prototyping chatbots and voice interfaces. Write the script, play it to preview the prototype, share it with a group, export if needed, invite testers to try out the design (one user flow at a time), and start building (use the Botsociety API to get a JSON representation of your mockups).

We used both Botsociety and SaySpring. Botsociety was useful for visualizing possible scenarios in a flowchart and playing them back to check what the scripts sounded like. For usability tests, we decided to use SaySpring, as we found it more intuitive for this purpose. It also generates a script for each user flow, allowing you to follow along during testing. In addition, each time the prototype is used, a transcript of the interaction is created for review (there is a three-transcript limit on free plans). Here is an example of one usability test participant’s script presented on Botsociety:


I decided to share my process before the project is finished in order to get feedback from the community. In my research I found that designers approach the design of a conversational experience in various ways, perhaps because of differences between projects, or perhaps because of technical specifications and development resources.