An issue that has been raised recently is the risk of a voice-driven assistant like Apple's Siri, Amazon's Alexa, Google's Assistant or Microsoft's Cortana being triggered inadvertently and becoming a nuisance.
This was discovered with Amazon's Echo devices, where you could say "Alexa, laugh" and Alexa would laugh in response. But if this phrase was said in conversation, or came through audio or video content playing in the background, the unprompted laughter could come across as very creepy. A similar situation was discovered in 2014 with Microsoft's Xbox, which had voice-search functionality built in and woke when you said "Xbox on!". This was aggravated if, for example, a TV commercial from a consumer electronics outlet announced a special deal on one of these consoles with something like "Xbox on special" or "Xbox on sale", which contains that key phrase.
Similarly, we are starting to see "voice-driven search" become a part of consumer electronics, and this could become an annoyance whenever dialogue in a movie or TV show, or an adman's pitch in a TV commercial, instigates a search routine during your TV viewing.
But there are some implementations of these voice assistants that don't start listening automatically when they hear the "wake phrase" associated with them, like "Alexa" or "Hey Siri". In these cases, you press a "call" button to make the device ready to listen to you. This approach is typical of smartphones, tablets, computers and smart-TV remote controls.
On the other hand, some smart speakers like Google Home use a microphone-mute button, which you would activate if there is a risk of nuisance triggering. In this mode, the device's microphone stays off until you manually unmute it.
Personally, I would still like to see some form of manual control offered as the norm for these devices, preferably in the form of a "call" button with a distinct tactile feel when pressed. A different light glow or other visual cue would then show when the device is ready to listen. This gives the user some control over when the device can listen to them, thus assuring their privacy.
The article also underscored the role of speech as one of many interaction types, like touch or vision, integrated into a single user interface. This provides different comfort zones that users can benefit from when using the device, letting them rely on whichever interaction method is most comfortable for them.