A data analytics contractor employed by the Republican National Committee (RNC) left databases containing information on nearly 200 million potential voters exposed to the internet without security, allowing anyone who knew where to look to download it without a password.
"We take full responsibility for this situation," said the contractor, Deep Root Analytics, in a statement. The databases were part of 25 terabytes of files contained in an Amazon cloud account that could be browsed without logging in. The account was discovered by researcher Chris Vickery of the security firm UpGuard. The files have since been secured.
Vickery is a prominent researcher in uncovering improperly secured files online. But, he said, this exposure is of a magnitude he has never seen before. "In terms of the disc space used, this is the biggest exposure I've found. In terms of the scope and depth, this is the biggest one I've found," said Vickery.
The accessible files, according to UpGuard, contain a main 198 million-entry database with voters' names and addresses and an "RNC ID" that can be used with other exposed files to research individuals. For example, a 50-gigabyte file of "Post Elect 2016" information, last updated in mid-January, contained modeled data about a voter's likely positions on 46 different issues ranging from "how likely it is the individual voted for Obama in 2012, whether they agree with the Trump foreign policy of 'America First' and how likely they are to be concerned with auto manufacturing as an issue, among others."
That file appears in a folder titled "target_point," an apparent reference to another firm contracted by the RNC to crunch data. UpGuard speculates that the folder may imply that the firm TargetPoint compiled and shared the data with Deep Root. Another folder appears to reference Data Trust, another contracted firm.
UpGuard analyst Dan O'Sullivan looked himself up in the database and writes in the official report that the calculated preferences were, at least for him, right on the money. "It is a testament both to their talents, and to the real danger of this exposure, that the results were astoundingly accurate," he said. The Deep Root Analytics cloud server had 25 terabytes of data exposed, including 1.1 terabytes available for download.
Over the 2016 election season, the RNC was a major client of Deep Root, one of a handful of firms it contracted for big data analysis. Firms like Deep Root Analytics use data from a variety of sources to extrapolate social and political preferences of voters to determine how best to market to them.
The RNC spent $983,000 between January 2015 and November 2016 for Deep Root's services and $4.2 million for TargetPoint's. "Deep Root Analytics builds voter models to help enhance advertiser understanding of TV viewership. The data accessed was not built for or used by any specific client. It is our proprietary analysis to help inform local television ad buying," said Deep Root Analytics in its statement.
Misconfigured cloud servers and online databases are a common way for data to be accidentally left exposed to the public. Vickery has found everything from military engineering plans to databases of believed terrorists in exactly this way.
What is uncommon in this case is the size and scope of this exposure. If its records are accurate, the Deep Root Analytics exposure contains information on more than half of the American population. It dwarfs the second-largest exposure of voter information — 93.4 million records of Mexican citizens — by more than 100 million voters and tops the largest data breach of voter information — 55 million records of Philippine voters — by more than 140 million.
Anyone who knew the files' web address could have accessed them. But without that knowledge, they are much harder to find. Even armed with a search for unsecured databases, finding exposures of any magnitude is tough work. Vickery sifts through a large number of unsecured databases to find ones that are interesting enough to publish research on.
Deep Root has contracted the security firm Stroz Friedberg to perform a thorough investigation of the exposure. The exposure, between June 1 and June 14, was sealed shut shortly after Vickery made the discovery during the night of June 12 and notified relevant regulatory bodies.