Documents from the trove of Edward Snowden, the former NSA contractor, shine a light on the US government’s use of sophisticated computer programs to monitor, record, transcribe and analyse all sorts of speech.
The NSA has steadily improved the capabilities of its speech recognition software for more than a decade, according to a report published this week. For years it has relied on systems said to let investigators scan millions of recorded conversations and search them by keyword, including conversations in languages unfamiliar to analysts.
Snowden documents show that programs used by the NSA “automatically recognize the content within phone calls by creating rough transcripts and phonetic representations that can be easily searched and stored,” according to a security report. According to the documents, NSA intelligence analysts began using a program code-named RHINEHART in 2004. It was “designed to support both real-time searches, in which incoming data is automatically searched by a designated set of dictionaries, and retrospective searches, in which analysts can repeatedly search over months of past traffic.”
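The two search modes described in the documents can be illustrated with a minimal sketch. Nothing below comes from the Snowden files: the class name `TranscriptIndex`, the `watchlist` parameter and the word-level tokenizing are all hypothetical, chosen only to show how a real-time dictionary match differs from a retrospective search over stored transcripts.

```python
import re
from collections import defaultdict


class TranscriptIndex:
    """Toy illustration only; every name here is hypothetical."""

    def __init__(self, watchlist):
        # The "dictionary" of terms checked as each transcript arrives.
        self.watchlist = {term.lower() for term in watchlist}
        # Inverted index: word -> IDs of stored transcripts containing it.
        self.index = defaultdict(set)

    @staticmethod
    def _tokens(text):
        # Crude tokenizer: lowercase runs of letters, digits and apostrophes.
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def ingest(self, call_id, transcript):
        """Real-time mode: store the transcript and return watchlist hits."""
        words = self._tokens(transcript)
        for word in words:
            self.index[word].add(call_id)
        return words & self.watchlist

    def search(self, keyword):
        """Retrospective mode: look up a keyword across everything stored."""
        return sorted(self.index.get(keyword.lower(), set()))
```

In this sketch, `ingest` flags watchlist terms the moment a transcript arrives, while `search` can be run repeatedly, long after the fact, against everything already indexed. A real system would also have to cope with the error-prone transcripts and phonetic representations the documents mention, which this example ignores.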
Two years later, a new system “designed to index and tag 1 million cuts per day” had already been rolled out in Iraq, and the technology has advanced regularly since. An internal memo from that August shows that the NSA was already touting the capabilities of its own sophisticated “Google for Voice” to prioritize signal interception in real time.
By 2008, documents provided by Snowden show that the NSA had implemented a program called Enhanced Video Text and Audio Processing, or EViTAP, which provided a “fully-automated” service to analysts. The agency was using it at the time to track news broadcasts in real time in Arabic, Mandarin Chinese, Russian, Spanish, English and Farsi/Persian, transcribing those dispatches on the fly and exporting them as English-language text. Coupled with the telephone surveillance conducted by the NSA, the tool has reportedly been useful for gathering and analysing intelligence that might otherwise have been hard to spot because of language barriers.
One document published by The Intercept, dated May 2011 and written by a security intelligence analysis technical director for the agency’s Texas office, claims Human Language Technology, or HLT, has been used locally to find tunnels in Tijuana, spot bomb threats in Mexico City and provide details about the shooting of US Customs officials south of the border. The analyst adds he “had a rare life-changing” moment when he realized the full potential of HLT programs.
The technology has been deployed in target countries including Iraq and Afghanistan, but the full scope of the NSA’s efforts has yet to be realized. “Once you have this capability, then the question is: How will it be deployed? Can you temporarily cache all American phone calls, transcribe all the phone calls, and do text searching of the content of the calls?” said Jennifer Granick, a civil liberties director at the Stanford Center for Internet and Society. “It may not be what they are doing right now, but they’ll be able to do it.”
“We don’t have any idea how many innocent people are being affected, or how many of those innocent people are also Americans,” Granick said in a message. Vanee’ Vines, an NSA spokesperson, said that the agency “employs a variety of technologies in the course of its authorized foreign-intelligence mission” as part of its efforts to “help to deter threats from international terrorists, human traffickers, cyber criminals, and others who seek to harm our citizens and allies.” However, Vines refused to say whether any privacy and security protections are in place to prevent the conversations of Americans from being scooped up through telephone surveillance.