{"id":12770,"date":"2019-02-05T14:59:36","date_gmt":"2019-02-05T10:59:36","guid":{"rendered":"https:\/\/me-en.kaspersky.com\/blog\/ultrasound-attacks-2\/12770\/"},"modified":"2019-11-15T15:22:33","modified_gmt":"2019-11-15T11:22:33","slug":"ultrasound-attacks","status":"publish","type":"post","link":"https:\/\/me-en.kaspersky.com\/blog\/ultrasound-attacks\/12770\/","title":{"rendered":"Voice assistants hear things we don&#8217;t"},"content":{"rendered":"<p>Our interaction with technology could soon be predominantly voice-based. To ask for something out loud and hear the answer is literally child\u2019s play: Just take a look at how effortlessly kids use voice assistants.<\/p>\n<p>But new technology always means new threats, and voice control is no exception. Cybersecurity researchers are tirelessly probing devices so that manufacturers can prevent potential threats from becoming real. Today, we\u2019re going to discuss a couple of finds that, although of little practical application right now, should be on today\u2019s security radar.<\/p>\n<h2>Smart devices listen and obey<\/h2>\n<p>More than a billion voice-activated devices are now used worldwide, says <a target=\"_blank\" href=\"https:\/\/voicebot.ai\/2018\/11\/13\/new-report-over-1-billion-devices-provide-voice-assistant-access-today-and-highest-usage-is-on-smartphones\/\" rel=\"noopener noreferrer nofollow\">a voicebot.ai report<\/a>. Most are smartphones, but other speech-recognition devices are fast gaining popularity. One in five American households, for example, has a smart speaker that responds to verbal commands.<\/p>\n<p>Voice commands can be used to control music playback, order goods online, control vehicle GPS, check the news and weather, set alarms, and so on. Manufacturers are riding the trend and adding voice-control support to a variety of devices. 
Amazon, for example, recently <a target=\"_blank\" href=\"https:\/\/www.theverge.com\/2018\/9\/20\/17882140\/amazon-basics-microwave-alexa-2018-smart-features-price-release-date\" rel=\"noopener noreferrer nofollow\">released a microwave<\/a> that links to an Echo smart speaker. On hearing the words \u201cHeat up coffee,\u201d the microwave calculates the time required and starts whirring. True, you still have to make the long trek to the kitchen to put the mug inside, so you could easily push a couple of buttons while you\u2019re at it, but why quibble with progress?<\/p>\n<p>Smart home systems also offer voice-controlled room lighting and air conditioning, as well as front-door locking. As you can see, voice assistants are already pretty skilled, and you probably wouldn\u2019t want outsiders to be able to harness these abilities, especially for malicious purposes.<\/p>\n<p>In 2017, characters in the animated sitcom <em>South Park<\/em> carried out a highly original mass attack in their own inimitable style. The victim was Alexa, the voice assistant that lives inside Amazon Echo smart speakers. Alexa was instructed to add some rather grotesque items to a shopping cart and set the alarm for 7 a.m. Despite the peculiar pronunciation of the cartoon characters, the Echo speakers of owners watching this episode of <em>South Park<\/em> <a target=\"_blank\" href=\"https:\/\/www.theverge.com\/2017\/9\/16\/16318694\/south-park-amazon-alexa-google-home\" rel=\"noopener noreferrer nofollow\">faithfully executed the commands<\/a> issued from the TV screen.<\/p>\n<h3>Ultrasound: Machines hear things people don\u2019t<\/h3>\n<p>We\u2019ve already written about some of the <a target=\"_blank\" href=\"https:\/\/www.kaspersky.com\/blog\/voice-recognition-threats\/14134\/\" rel=\"noopener noreferrer nofollow\">dangers posed by voice-activated gadgets<\/a>. 
Today, our focus is on \u201csilent\u201d attacks that force such devices to obey voices that you can\u2019t even hear.<\/p>\n<p>One way to carry out this type of attack is through ultrasound \u2014 a sound so high it is inaudible to the human ear. In an article published in 2017, researchers from Zhejiang University presented a <a target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/1708.09537\" rel=\"noopener noreferrer nofollow\">technique for taking covert control of voice assistants, named DolphinAttack<\/a> (so called because dolphins emit ultrasound). The research team converted voice commands into ultrasonic waves, with frequencies too high to be picked up by humans, but still recognizable by microphones in modern devices.<\/p>\n<p>The method works because microphone hardware is slightly nonlinear: when the ultrasound is converted into an electrical impulse in the receiving device (for example, a smartphone), that nonlinearity demodulates it, restoring the original signal containing the voice command. There is no special function in the device; the demodulation is simply a side effect of the conversion process.<\/p>\n<p>As a result, the targeted gadget hears and executes the voice command, opening up all kinds of opportunities for attackers. The researchers successfully reproduced the attack on the most popular voice assistants, including Amazon Alexa, Apple Siri, Google Now, Samsung S Voice, and Microsoft Cortana.<\/p>\n<h3>A choir of loudspeakers<\/h3>\n<p>One of the weaknesses of DolphinAttack (from the attacker\u2019s perspective) is the small radius of operation \u2014 just about 1 meter. However, <a target=\"_blank\" href=\"https:\/\/synrg.csl.illinois.edu\/papers\/lipread_nsdi18.pdf\" rel=\"noopener noreferrer nofollow\">researchers from the University of Illinois at Urbana-Champaign<\/a> managed to increase this distance. 
In their experiment, they divided a converted ultrasound command into several frequency bands, which were then played by different speakers (more than 60). The hidden voice commands issued by this \u201cchoir\u201d were picked up at a distance of 7 meters, regardless of any background noise. In such conditions, DolphinAttack\u2019s chances of success are considerably improved.<\/p>\n<h3>A voice from the deep<\/h3>\n<p><a target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/1801.01944\" rel=\"noopener noreferrer nofollow\">Experts from the University of California at Berkeley<\/a> utilized a different principle. They surreptitiously embedded voice commands in other audio snippets to deceive Deep Speech, Mozilla\u2019s speech recognition system. To the human ear, the modified recording barely differs from the original, but the software detects a hidden command in it.<\/p>\n<p>Have a <a target=\"_blank\" href=\"https:\/\/nicholas.carlini.com\/code\/audio_adversarial_examples\/\" rel=\"noopener noreferrer nofollow\">listen to the recordings<\/a> on the research team\u2019s website. In the first example, the phrase \u201cWithout the data set the article is useless\u201d contains a hidden command to open a website: \u201cOkay Google, browse to evil.com.\u201d In the second, the researchers added the phrase \u201cSpeech can be embedded in music\u201d to an excerpt of a Bach cello suite.<\/p>\n<h3>Guarding against inaudible attacks<\/h3>\n<p>Manufacturers are already looking at ways to protect voice-activated devices. For example, ultrasound attacks could be stymied by detecting frequency alterations in received signals. 
Training all smart devices to recognize their owner\u2019s voice would also help, but having already tested this on its own system, <a target=\"_blank\" href=\"https:\/\/support.google.com\/assistant\/answer\/7394306?co=GENIE.Platform%2525253DAndroid&amp;hl=en\" rel=\"noopener noreferrer nofollow\">Google warns<\/a> that such security can be fooled by a voice recording or <a target=\"_blank\" href=\"https:\/\/www.techrepublic.com\/article\/vocal-disguises-and-impersonations-may-fool-voice-recognition-authentication\/\" rel=\"noopener noreferrer nofollow\">a decent impersonation<\/a>.<\/p>\n<p>However, there is still time for researchers and manufacturers to come up with solutions. As we said, controlling voice assistants on the sly is currently doable only in lab conditions: Getting an ultrasonic loudspeaker (never mind 60 of them) within range of someone\u2019s smart speaker is a big task, and embedding commands in audio recordings is hardly worth the considerable time and effort involved.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We explain how ultrasound and audio recordings hidden in background noise can be used to control voice 
assistants.<\/p>\n","protected":false},"author":2049,"featured_media":12771,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1226,1486],"tags":[1621,2024,2025,1301,97,1302,2026,521,2027,1796,859],"class_list":{"0":"post-12770","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-technology","8":"category-threats","9":"tag-alexa","10":"tag-cortana","11":"tag-dolphinattack","12":"tag-echo","13":"tag-security-2","14":"tag-siri","15":"tag-speech-recognition","16":"tag-threats","17":"tag-ultrasound","18":"tag-voice-assistants","19":"tag-voice-recognition"},"hreflang":[{"hreflang":"en-ae","url":"https:\/\/me-en.kaspersky.com\/blog\/ultrasound-attacks\/12770\/"},{"hreflang":"en-in","url":"https:\/\/www.kaspersky.co.in\/blog\/ultrasound-attacks\/15196\/"},{"hreflang":"en-us","url":"https:\/\/usa.kaspersky.com\/blog\/ultrasound-attacks\/17137\/"},{"hreflang":"en-gb","url":"https:\/\/www.kaspersky.co.uk\/blog\/ultrasound-attacks\/15305\/"},{"hreflang":"es-mx","url":"https:\/\/latam.kaspersky.com\/blog\/ultrasound-attacks\/14000\/"},{"hreflang":"es","url":"https:\/\/www.kaspersky.es\/blog\/ultrasound-attacks\/17764\/"},{"hreflang":"it","url":"https:\/\/www.kaspersky.it\/blog\/ultrasound-attacks\/16843\/"},{"hreflang":"ru","url":"https:\/\/www.kaspersky.ru\/blog\/ultrasound-attacks\/22179\/"},{"hreflang":"tr","url":"https:\/\/www.kaspersky.com.tr\/blog\/ultrasound-attacks\/5675\/"},{"hreflang":"x-default","url":"https:\/\/www.kaspersky.com\/blog\/ultrasound-attacks\/25549\/"},{"hreflang":"fr","url":"https:\/\/www.kaspersky.fr\/blog\/ultrasound-attacks\/11398\/"},{"hreflang":"pt-br","url":"https:\/\/www.kaspersky.com.br\/blog\/ultrasound-attacks\/11409\/"},{"hreflang":"pl","url":"https:\/\/plblog.kaspersky.com\/ultrasound-attacks\/10325\/"},{"hreflang":"de","url":"https:\/\/www.kaspersky.de\/blog\/ultrasound-attacks\/18
484\/"},{"hreflang":"ja","url":"https:\/\/blog.kaspersky.co.jp\/ultrasound-attacks\/22339\/"},{"hreflang":"nl","url":"https:\/\/www.kaspersky.nl\/blog\/ultrasound-attacks\/23800\/"},{"hreflang":"ru-kz","url":"https:\/\/blog.kaspersky.kz\/ultrasound-attacks\/17873\/"},{"hreflang":"en-au","url":"https:\/\/www.kaspersky.com.au\/blog\/ultrasound-attacks\/22079\/"},{"hreflang":"en-za","url":"https:\/\/www.kaspersky.co.za\/blog\/ultrasound-attacks\/22012\/"}],"acf":[],"banners":"","maintag":{"url":"https:\/\/me-en.kaspersky.com\/blog\/tag\/voice-recognition\/","name":"voice recognition"},"_links":{"self":[{"href":"https:\/\/me-en.kaspersky.com\/blog\/wp-json\/wp\/v2\/posts\/12770","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/me-en.kaspersky.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/me-en.kaspersky.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/me-en.kaspersky.com\/blog\/wp-json\/wp\/v2\/users\/2049"}],"replies":[{"embeddable":true,"href":"https:\/\/me-en.kaspersky.com\/blog\/wp-json\/wp\/v2\/comments?post=12770"}],"version-history":[{"count":3,"href":"https:\/\/me-en.kaspersky.com\/blog\/wp-json\/wp\/v2\/posts\/12770\/revisions"}],"predecessor-version":[{"id":14518,"href":"https:\/\/me-en.kaspersky.com\/blog\/wp-json\/wp\/v2\/posts\/12770\/revisions\/14518"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/me-en.kaspersky.com\/blog\/wp-json\/wp\/v2\/media\/12771"}],"wp:attachment":[{"href":"https:\/\/me-en.kaspersky.com\/blog\/wp-json\/wp\/v2\/media?parent=12770"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/me-en.kaspersky.com\/blog\/wp-json\/wp\/v2\/categories?post=12770"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/me-en.kaspersky.com\/blog\/wp-json\/wp\/v2\/tags?post=12770"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}