How Amazon Keeps Your TV From Accidentally Triggering Your Echo
Amazon's use of acoustic fingerprinting helps to tell your Echo that it should ignore whatever's being said in the background -- including 'Alexa.'
By David Murphy
This story originally appeared on PCMag
If you haven't seen it by now, Amazon's Super Bowl LII commercial is fairly clever. Kudos to the company for coming up with a fun way to mash together celebrities and Alexa without it feeling overly cheesy or like it's trying too hard. More importantly, a big thanks to Amazon's engineers, who came up with an ingenious way to broadcast the "A word" without it triggering everyone's Echos -- whatever version of the device you own.
The most annoying thing about watching any YouTube video or televised commercial that mentions Alexa is that it typically triggers your Echo device to get ready to respond to a query. Or worse, the person you're watching who says "Alexa" just keeps on babbling, which then makes your Echo do something you didn't want it to do -- or just apologize for being unable to carry out whatever command it tried to interpret.
And since Amazon definitely wanted its Super Bowl LII commercial to mention she-who-shall-not-be-named, and refer to her frequently, the company had to come up with a different way to do so in order to avoid hacking off everyone who already owns an Echo device.
The solution? Acoustic fingerprinting.
"The trick is to suppress the unintentional waking of a device while not incorrectly rejecting the millions of people engaging with Alexa every day," said Shiv Vitaladevuni, a senior manager on the Alexa Machine Learning team, in an Amazon blog post.
Though Amazon isn't detailing the specific techniques it's using to keep your Echo from being triggered by its Super Bowl advertising, Bloomberg notes that a Reddit user, Asphyhackr, might have figured out Amazon's secret.
"I did a little research tonight and found that the Echo, while it's processing the wake word, searches the Audio Spectrum and if is significantly quieter in the area of 4000hz to 5000hz, she will not wake for the word," Asphyhackr writes.
"I found that when I analyzed the spectrum of them saying her name, the spectrums were significantly quieter in the range of 3000hz to 6000hz. In some of those recordings, those frequencies appeared to be non-existent. In others it appeared like the boosted the surrounding frequencies to make the Echo see a gap in the spectrum."
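To make that description concrete, here's a minimal Python sketch of a check for an unusually quiet frequency band. It only illustrates the idea in Asphyhackr's post and is not Amazon's implementation, which the company hasn't published. The function names, the 1,000Hz to 3,000Hz reference band, the 1 percent threshold and the synthetic test tones are all assumptions made for the example; the 4,000Hz to 5,000Hz gap simply follows the Reddit post.

```python
import math


def band_energy(samples, sample_rate, lo_hz, hi_hz):
    """Sum the squared DFT magnitudes of every bin between lo_hz and hi_hz."""
    n = len(samples)
    total = 0.0
    for k in range(n // 2 + 1):
        if lo_hz <= k * sample_rate / n <= hi_hz:
            angle = 2 * math.pi * k / n
            re = sum(s * math.cos(angle * i) for i, s in enumerate(samples))
            im = sum(s * math.sin(angle * i) for i, s in enumerate(samples))
            total += re * re + im * im
    return total


def has_spectral_gap(samples, sample_rate,
                     gap=(4000.0, 5000.0),        # band named in the Reddit post
                     reference=(1000.0, 3000.0),  # assumed comparison band
                     max_ratio=0.01):             # assumed threshold
    """Return True if the gap band is far quieter, per Hz, than the reference band."""
    gap_density = band_energy(samples, sample_rate, *gap) / (gap[1] - gap[0])
    ref_density = band_energy(samples, sample_rate, *reference) / (reference[1] - reference[0])
    if ref_density == 0.0:
        return False  # silence: nothing to compare the gap against
    return gap_density / ref_density < max_ratio


def tone_mix(freqs_hz, sample_rate=16000, n=320):
    """Hypothetical test signal: a 20 ms sum of equal-amplitude sine tones."""
    return [sum(math.sin(2 * math.pi * f * i / sample_rate) for f in freqs_hz)
            for i in range(n)]


living_room = tone_mix([1500, 2500, 4500])  # energy present in the 4-5 kHz band
commercial = tone_mix([1500, 2500])         # same audio with that band removed

print(has_spectral_gap(living_room, 16000))
print(has_spectral_gap(commercial, 16000))
```

The check compares energy per hertz rather than raw energy so that the wider reference band doesn't skew the ratio. A real detector would have to cope with noise, room acoustics and speaker quality, none of which this sketch attempts.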
In other words, if your Echo notices something strange happening in the audio spectrum, it realizes that it should ignore whatever is being said -- like "Alexa." And while this works well when Amazon has a planned announcement to make, like an advertisement, the company has to get a bit more creative when it can't anticipate the large-scale broadcast of its digital helper's name.
"When multiple devices start waking up simultaneously from a broadcast event, similar audio is streaming to Alexa's cloud services. An algorithm within Amazon's cloud detects matching audio from distinct devices and prevents additional devices from responding. The dynamic fingerprinting isn't perfect, but as many as 80 to 90 percent of devices won't respond to these broadcasts thanks to the dynamic creation of the fingerprints," reads Amazon's blog.
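The cloud-side matching Amazon describes could look something like the Python sketch below. Amazon hasn't said how its fingerprints are built or compared, so everything here is an assumption for illustration: the `fingerprint` function (a coarse rise-and-fall pattern of frame energies), the `BroadcastDetector` class, the two-device threshold and the five-second window are all hypothetical.

```python
from collections import defaultdict


def fingerprint(samples, frame_size=4):
    """Hypothetical fingerprint: does each frame's energy rise (1) or fall (0)?"""
    energies = [sum(s * s for s in samples[i:i + frame_size])
                for i in range(0, len(samples) - frame_size + 1, frame_size)]
    return tuple(1 if later > earlier else 0
                 for earlier, later in zip(energies, energies[1:]))


class BroadcastDetector:
    """Suppress responses once the same audio arrives from several devices."""

    def __init__(self, min_devices=2, window_seconds=5.0):
        self.min_devices = min_devices        # assumed threshold
        self.window_seconds = window_seconds  # assumed matching window
        self._seen = defaultdict(dict)        # fingerprint -> {device_id: timestamp}

    def should_respond(self, device_id, samples, timestamp):
        recent = self._seen[fingerprint(samples)]
        # Forget devices whose matching audio arrived too long ago.
        for other_id, seen_at in list(recent.items()):
            if timestamp - seen_at > self.window_seconds:
                del recent[other_id]
        recent[device_id] = timestamp
        # Respond only while fewer than min_devices have sent this audio.
        return len(recent) < self.min_devices


clip = [0.1, 0.2, 0.1, 0.3, 0.9, 0.8, 0.7, 0.9,
        0.2, 0.1, 0.2, 0.1, 0.5, 0.6, 0.5, 0.4]
quieter_copy = [0.5 * s for s in clip]  # same broadcast, heard at lower volume

detector = BroadcastDetector()
print(detector.should_respond("echo-kitchen", clip, timestamp=0.0))
print(detector.should_respond("echo-bedroom", quieter_copy, timestamp=0.5))
```

In this sketch the first device to upload a given clip still responds, and only later devices sending matching audio are suppressed, which is consistent with the blog's note that the approach "isn't perfect." Using the rise-and-fall pattern rather than raw sample values means the same broadcast heard at different volumes yields the same fingerprint.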