A little-known data firm was able to build 48 million personal profiles, combining data from sites and social networks including Facebook, LinkedIn, Twitter, and Zillow -- without the users' knowledge or consent.
Localblox, a Bellevue, Wash.-based firm, says it "automatically crawls, discovers, extracts, indexes, maps and augments data in a variety of formats from the web and from exchange networks."
Since its founding in 2010, the company has focused its collection on publicly accessible data sources, such as the social networks Facebook, Twitter, and LinkedIn and the real estate site Zillow, to produce profiles. But earlier this year, the company left a massive store of profile data in a public but unlisted Amazon S3 storage bucket without a password, allowing anyone to download its contents. The bucket, labeled "lbdumps," contained an archive that unpacked to a single file of more than 1.2 terabytes. The file listed 48 million individual records, scraped from public profiles, consolidated, then stitched together.
The data was subsequently found by Chris Vickery, director of cyber risk research at security firm UpGuard. Vickery, a well-known ethical data breach hunter, disclosed the leak to Localblox's chief technology officer Ashfaq Rahman in late February. The bucket was secured hours later.
The discovery is the latest twist among recent scandals involving tech companies and their data collection practices. Just last month, Facebook was embroiled in a privacy row after London-based data firm Cambridge Analytica obtained data on as many as 87 million users, a "conservative estimate" according to the social networking giant, through an academic app that collected data on its users and their friends. The data was used to build profiles on millions of Americans to predict how people would vote at the ballot box, including in the 2016 presidential election.
The controversy sparked uproar, triggered congressional and parliamentary inquiries and investigations across the world, and forced Facebook to introduce stronger privacy practices. But the data collection by Localblox can be just as invasive, and can include highly sensitive and personally identifiable information about a person -- without that person's consent.