There’s no denying that Google Assistant is useful for simple, everyday needs that keep users from having to reach for a phone. But what if it could provide more value-added experiences, becoming more intuitive and human-like in the process? Are we that far away from the kind of assistant depicted in Spike Jonze’s Her?
One of the biggest inhibitors of voice adoption is that natural language interaction isn’t yet ubiquitous, and functionality is typically limited to quick shortcuts. According to Forrester, 46% of adults currently use smart speakers to control their home, and 52% use them to stream audio. Neither of these use cases is a necessity, nor is either particularly unique. Looking beyond shortcuts and entertainment value, we sought to experiment with Google Assistant to highlight a real-world utility that offers more human-like interactions. Think less in terms of “Hey Google, turn on the kitchen lights,” and instead something more like “Hey Google, fix my marriage.”
That’s not a joke; by providing a shoulder to cry on or a mediator who can resolve conflicts while keeping a level head, our internal R&D team MediaMonks Labs wanted to push the limits of Google Assistant to see what kind of experiences it could provide to better users’ lives and interpersonal relationships.
Who would have thought that a better quality of conversation with a machine might help you speak better to other humans? “Most of the stuff on the Assistant is very functional,” says Sander van der Vegte. “It’s almost like an audible button, or something for entertainment. The marriage counselor is neither, but could be implemented as a step before you look for an actual counselor.”
Why Google Assistant?
Google Assistant is an exciting platform for voice thanks to its ability to be called up anytime, anywhere through its close integration with mobile. “Google Assistant is very much an assistant, available to help at any moment of time,” says Joe Mango, Creative Technologist at MediaMonks.
But still, the team felt the platform could go even further in providing experiences that are unique to the voice interface. “Right now, considering the briefings we get, most of the stuff on the assistant is very functional,” says Sander van der Vegte, Head of MediaMonks Labs. “It’s designed to be a shortcut to do something on your phone, like an audible button. This marriage counselor has a completely different function to it.”
The Labs team took note when Amazon challenged developers to design Alexa skills that could hold meaningful conversations with users for 20 minutes, through a program called the Alexa Prize. It offered an excellent opportunity to turn the tables and challenge the Google Assistant platform to see how well it could sustain a social conversation with users, resulting in a unique action that requires the assistant to use active listening and an empathetic approach to help two users see eye to eye.
Breaking the Conversation Mold
As you might imagine, offering this kind of experience required a bit of hacking. To listen and respond to two different people in a conversation, the assistant had to free itself from the typical, transactional exchange that voice assistant dialogue models are designed for. “We had to break all the rules,” says Mango—but all’s fair in love and war, at least for a virtual assistant.
A big example of this is a novel use of the fallback intent. By design, the fallback intent is the response the assistant gives when a user makes a query that isn’t mapped to a programmed response: usually something as simple as asking the user to state their request in another way.
But the marriage counselor uses this step to pass the query along to sentiment analysis via the Google Cloud Natural Language API, where the statement is scored on how positive or negative it is. By tying this score to a scan of the conversation history for applicable details, the assistant can pull a personalized response. This lets both users speak freely in an open-ended discussion without being interrupted by errors.
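As a rough illustration, the routing logic might look something like the sketch below. The real action calls the Google Cloud Natural Language API for its sentiment score; here that call is stubbed with a toy word list, and the thresholds, word lists, and canned responses are our own illustrative assumptions, not the production code.

```python
# Minimal sketch of the fallback-intent flow: instead of erroring out on an
# unmatched query, score its sentiment and pick an empathetic follow-up.
# In the real action the score comes from the Google Cloud Natural Language
# API (analyze_sentiment returns a score in [-1.0, 1.0]); the stub below
# fakes that with a toy vocabulary so the routing logic is clear.

def analyze_sentiment_stub(text: str) -> float:
    """Stand-in for the Cloud Natural Language sentiment call."""
    negative = {"never", "hate", "ignores", "distant"}
    positive = {"love", "appreciate", "close", "thank"}
    words = text.lower().split()
    raw = sum(w in positive for w in words) - sum(w in negative for w in words)
    return max(-1.0, min(1.0, raw / 3))

def fallback_response(utterance: str, history: list) -> str:
    """Route an unmatched query by sentiment instead of failing."""
    score = analyze_sentiment_stub(utterance)
    history.append(utterance)  # keep context for later personalization
    if score <= -0.3:
        return "That sounds difficult. Can you tell me more about how that makes you feel?"
    if score >= 0.3:
        return "It sounds like that means a lot to you. What makes it work so well?"
    return "I hear you. What would you like your partner to understand about that?"
```

Because every unmatched utterance lands in this handler, the couple can talk past the scripted questions without ever hitting a “Sorry, I didn’t get that” dead end.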
What does such an interaction look like? When a couple tested the marriage counselor action, one user mentioned his relationship with his brothers: some of them were close, but the user felt that he was becoming distant from one of them. In response, the assistant chimed in to remind the user that it was good that he had a series of close relationships to confide in. Its ability to provide a healthy perspective in response to a one-off comment—a comment not even about the user’s romantic relationship, but still relevant to his emotional well-being—was surprising.
The inventive use of the platform allows the assistant to better respond to a user’s perceived emotional state. Google is particularly interesting to experiment with thanks to its advanced voice recognition models; it built the sentiment analysis framework used within the marriage counseling action, and Google’s announcement of Project Euphonia earlier this year, which makes voice recognition easier for those with speech impairments, was a welcome sight for those seeking to make digital experiences more inclusive. “At MediaMonks, we’re finding ways to creatively execute on these frameworks and push them forward,” said Mango.
Giving Digital Assistants the Human Touch
But the marriage counselor action focuses more on listening than speaking, allowing two users to hash it out and doling out advice or prompts when needed. A big part of this process is emotional intelligence. Humans know that the same sentence can have multiple meanings depending on tone; sarcasm is the obvious example. Another is the statement “Only you would think of that,” which could read as patronizing or as a compliment depending on tone and context.
While the assistant currently can’t understand tone of voice, a stopgap solution was to enable it to parse meaning through vocabulary and conversational context, helping the assistant understand that it’s not just what you say, but how you say it. This is something humans pick up on naturally, but Mango drew on linguistics to provide the illusion of emotional intelligence.
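One way to fake that intuition, sketched as a hypothetical: score an ambiguous line against small lists of vocabulary cues drawn from the utterance and the turn before it. The cue lists, function name, and labels below are illustrative assumptions, not the linguistics model the team actually built.

```python
# Hypothetical sketch of vocabulary-and-context tone inference. Since the
# assistant can't hear intonation, it leans on word choice and on what was
# said in the previous turn; the cue lists below are illustrative only.

SOFTENERS = {"really", "honestly", "genuinely", "thanks", "thank"}
BARBS = {"typical", "again", "always", "obviously", "whatever"}

def infer_tone(utterance: str, previous_turn: str) -> str:
    """Guess whether an ambiguous line reads as warm or pointed."""
    words = [w.strip(",.!?").lower()
             for w in (utterance + " " + previous_turn).split()]
    warmth = sum(w in SOFTENERS for w in words)
    edge = sum(w in BARBS for w in words)
    if edge > warmth:
        return "pointed"    # e.g. sarcasm or a dig
    if warmth > edge:
        return "warm"       # likely a genuine compliment
    return "ambiguous"      # not enough signal; ask a follow-up question
```

With context included, “Only you would think of that” preceded by “You always do this, obviously” reads as pointed, while the same words after “Thanks, that genuinely helped” read as warm.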
“If the assistant moves in this direction, you’ll get a far more valuable user experience,” says van der Vegte. One example of how emotional intelligence can better support the user outside of a counseling context would be if the user asks for directions somewhere in a way that indicates they’re stressed. Realizing that a stressed user who’s in a hurry probably doesn’t want to spend time wrangling with route options, the action could make the choice to provide the fastest route.
Next Stop: More Proactive, Responsive Assistants
“There’s always improvements to be made,” says Mango, who sees two ways Google Assistant could provide even more lifelike and dynamic social conversations. First, he would like to see the assistant detect emotion in more ways than examining vocabulary alone. Second, he’d like to make the conversation flow even more responsive and dynamic.
“Right now the conversation is very linear in its series of questions,” he says. But in a best-case scenario, the assistant could offer alternative paths based on each user response, customizing the conversation to the different underlying issues the marriage counselor identifies as affecting the relationship.
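Sketched as data, that branching flow might look something like the graph below: each answer’s sentiment picks the next question instead of advancing through a fixed script. The node names, prompts, and threshold are hypothetical, included only to show the shape of the idea.

```python
# Illustrative dialogue graph: each answer's sentiment chooses the next
# node rather than advancing through a fixed list of questions.
FLOW = {
    "start": {
        "prompt": "What's been on your mind lately?",
        "negative": "conflict",
        "positive": "strengths",
    },
    "conflict": {
        "prompt": "When did you last feel heard by your partner?",
        "negative": "communication",
        "positive": "strengths",
    },
    "communication": {
        "prompt": "What would you like your partner to understand better?",
        "negative": "communication",  # stay on the topic until it softens
        "positive": "strengths",
    },
    "strengths": {
        "prompt": "What first drew you to each other?",
        "negative": "conflict",
        "positive": "strengths",
    },
}

def next_node(current: str, sentiment_score: float) -> str:
    """Pick the follow-up question based on how the last answer scored."""
    branch = "negative" if sentiment_score < 0 else "positive"
    return FLOW[current][branch]
```

A negative answer to the opening question steers the session toward conflict and communication nodes, while positive answers keep it on the couple’s strengths, so two sessions with different underlying issues naturally diverge.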
But for now, the team is excited to tinker and push the envelope on what platforms can achieve, inspired by a sense of technical curiosity and the types of experiences they’d like to see in the world. “It speaks a lot to the mission of what we do at Labs,” said Mango. “We always want to push the limitation of the frameworks out there to provide for new experiences with added value.”
As assistants become better equipped to listen and respond with emotional intelligence, their capabilities will expand to provide better and more engaging user experiences. In a best-case scenario, an assistant might identify user sentiment and use that knowledge to recommend a relevant service, like prompting a tired-sounding user to take a rest. Such an advancement would allow brands to forge a deeper connection with users by providing the right service at the right time. While Westworld-level AI is still far off in the distance, we’ll continue chatting and tinkering away at teaching our own bots the fine art of conversation—and we can’t wait to see what they’ll say next.