August 01, 2018 — Blog Post
The Future of Voice User Interface Design: An interview with Ilana Shalowitz
Ilana Shalowitz is the manager of Voice User Interface Design for Wolters Kluwer’s Emmi® patient engagement programs. On Tuesday, Aug. 7, she will give the closing keynote at the Voice of Healthcare Summit at Harvard Medical School in Boston. The conference is intended to provide thought leadership and sharing of best practices around voice-first technology in healthcare.
We talked to Ms. Shalowitz ahead of the conference to glean some insight into the themes she will address.
Why is the human voice important in the healthcare arena?
Standard text-to-speech interfaces are good enough for some uses but not for others. Text-to-speech is usually the robo-voice. In many cases it is used when you are asking about the weather or asking for directions.
When you get to something like health, which has so many layers, so many meanings for people (culturally, socially, emotionally), it’s important to have that human voice speaking to you. You need to be able to imbue that voice with layers of meaning that are going to connect with patients on a topic as complex as healthcare.
Is something lost when the human voice is not present?
What’s lost is all those layers of relational meaning. Word choice can only take you so far. If you layer on the cadence of your speech, the tone, the inflection, you can have a much richer experience and a stronger connection with your conversation partner.
How do you begin to design a voice user script, or a voice experience?
With Emmi programs, we always start with research. We seek to understand the needs of our clients and the needs of patients, and bring them together into this experience over voice. Only by identifying the needs of both groups are we able to identify the right content structure and tone to make the programs engaging for patients and successful for our clients.
By needs identification, I mean: What are they looking to get out of this outreach? For our clients it might be medication adherence, or reduced readmission rates, or an increase in appointment attendance. But for the patients we reach out to, it may be more subtle — especially when managing chronic conditions. There are many lifestyle components that go into condition management that are often unsupported.
How do you choose the type of voice that you put on an outgoing recording?
The voice of Emmi was chosen about 15 years ago. Back then, we had to think about the persona. Who is it that our customers will be interacting with? What are the promises that our brand is going to make? How are we going to keep them? What are going to be the guideposts along the way telling us we’re building a strong relationship?
With voice experience, you need to build the persona and think through carefully what that represents to your customers, and whether you’re going to be able to deliver on the promise that the persona makes.
When coming up with the Emmi voice, we thought about who is going to be the most effective voice to interact with. Who are the patients going to trust? Who are they going to continue to pick up the phone for? Who are they going to continue to engage with?
After some testing, we decided on the persona of a trusted advisor. From testing we know that her persona, which is a combination of the tone that she takes as well as the words she uses and the way we construct conversations, has been very effective across patient populations in motivating people to take action for their health.
What was your background that drew you into this line of work?
I have been with Emmi for three years. After my master’s in marketing, I was really interested in working in some sort of product design.
When I entered this position, I brought with me an ear that was tuned to music from my singing group and experience writing patient engagement material for a large health system. Beyond that I had trained myself in design thinking and in the research process, which were grounded in my bachelor’s degrees in anthropology and psychology.
Like most voice designers, I have a mixed background that just came together for designing these aural experiences.
What are the technical challenges that have to be overcome to reach the next level of VUI?
In the near future we can think about biculturalism, multiculturalism. People are multidimensional. They may talk about health topics differently than they speak about finding the right motivational strategies to get their kids to do their homework. All of us do this code switching all the time. As the technology expands, humans are being trained to interact with systems in a unidimensional way.
But I imagine a future where AI can tell more about the context of the conversation and its valence (how positive or negative we are on entering the interaction), and will be able to more deftly navigate the subtext of the interaction. Then the person can more quickly and comfortably achieve their goal, whether that is transactional, like ordering something from the internet, or relational, like playing a game with their device.
What are the topical themes or big issues right now in the world of VUI?
In the keynote I’m going to talk about lessons that we’ve already learned from various waves of new technology, and how we take those forward with us as we approach designing voice technology as it intersects with AI. It’s really tempting to get distracted and chase after cool projects with new technology, but we shouldn’t forget all the lessons we’ve learned so far.
For example, starting with user research. Going back to identifying the needs of all your stakeholders, so you know how to design your content, and what content will be most valuable. There are also lessons from marketing about building a persona.
We have a whole discipline of linguistics experts who play in the realm of words. When you combine that with anthropologists and sociologists, who play in the realm of interactions … We just have such a rich history of working with things that are like voice, like conversation, and all the micro-interactions that happen there. We need to remember those lessons as we get into incorporating this new tool.