Rob Schoon was born in Indiana, interned for On the Media, and is now a freelance writer on media and culture living in Brooklyn. His Twitter handle is @rkschoon.
Why Google Street View Stole Data
Monday, April 30, 2012 - 04:08 PM
About a year ago, OTM talked with Ars Technica’s Nate Anderson about the wifi-snooping tendencies of Google’s Street View. The project was not only taking panoramic pictures for the Google Maps application, but also picking up data from the unsecured wifi signals it came across.
Since the story became public, Google has given several explanations for its activities: it was meant to facilitate Android geo-location services (Anderson told us this is partly true); it was a mistake, a leftover bit of code that accidentally grabbed too much data; or it was the unofficial initiative of a lone “rogue” engineer.
On Saturday, Google released an un-redacted version of the F.C.C. report that contradicts its previous claims to innocence. The report confirms that Google’s Street View data collection code was documented and known to several employees and a senior manager. It was “intended to collect, store and review payload data for possible use in other Google projects.”
So what's “payload data”?
It’s anything being transferred between a user’s computer or smartphone and the wifi router. This includes a lot of stuff. According to the report, “for more than two years, Google’s Street View cars collected names, addresses, telephone numbers, URLs, passwords, e-mail, text messages, medical records, video and audio files, and other information from Internet users in the United States.”
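To see why unsecured wifi exposes all of that, consider that on an unencrypted network the application traffic travels as plaintext, so a passive listener can parse it directly. Here is a minimal sketch with an invented example payload (the hostname, username, and password below are made up, and the parser is illustrative, not Google’s code):

```python
# Hypothetical sketch: on an unencrypted wifi network, application payloads
# travel in plaintext, so a passive listener can read them directly.
# The captured bytes below are an invented example, not real data.

captured_payload = (
    b"POST /login HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"\r\n"
    b"user=alice&password=hunter2"
)

def parse_http_payload(raw: bytes) -> dict:
    """Split a plaintext HTTP request into request line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    method, path, _ = lines[0].split(" ")
    headers = dict(line.split(": ", 1) for line in lines[1:])
    return {"method": method, "path": path,
            "host": headers.get("Host"), "body": body.decode("ascii")}

print(parse_http_payload(captured_payload))
```

Everything the report lists — URLs, passwords, e-mail — rides in payloads just like this one, which is why a car driving past an open router could capture it.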
Most of the press has understandably focused on Google’s invasion of privacy, and its mishandling of the truth. But why did Google care about amassing a huge database of random snippets of data from unsecured wireless routers? Keep in mind, most of this data was collected by car, moving down the street at speed, snooping on any open wifi signal for a few seconds at a time.
What, exactly, was this “payload data” good for?
According to the LA Times, an engineer reviewed some of the data to identify frequently visited websites, in an attempt to figure out how many people were using Google search, but a member of the Google search team told him that was of little value.
Michael Turk, of DigitalSociety.org, speculated in 2011 that Google might have been trying to aggregate enough information about people’s connection speeds and internet service providers “to map out broadband service areas and capabilities.” Google was basically trying to create a broadband map – which “could give you the upper hand in competition.” But, as Turk acknowledges, Google isn’t interested in competing with ISPs. It’s not a broadband company.
Anthony Wing Kosner, at Forbes, took a stab at it yesterday, in his piece, “Street View Payload Data Irresistible Scientific Opportunity for Google.” He chalks it up to Google’s (otherworldly) scientific bent:
Imagine that you are a scientist in charge of a mission to another planet that is home to an advanced civilization. You are charged with mapping the surface of the planet and, while you are at it, you have the opportunity to collect electronic data about the inhabitants’ behavior. You don’t know what use future researchers might make of that data, but the question is, would you collect this “payload data”?
Clay Shirky expanded on this idea in an interview today. “There is no such thing as random data from Google’s point of view,” he said. With enough snippets, one could identify the person whose data it is, but it’s very unlikely that Google was interested in any individuals – Street View was a gigantic, sweeping program, after all.
Shirky says that there are certainly some possible near-term uses for the payload data, like geo-location or advertising algorithms that seek out correlations between the services and products identified in the payload data. “In aggregate, the data is useful if I could say people who use X email provider use Y router brands – you can tailor ads for that,” says Shirky.
But the main reason why Google would collect so much information is that it's just what the company does, almost to a compulsion. Google believes that “the answer to bad data is more data,” says Shirky. Google began as a company interested in finding new data correlations. Google Search’s revolutionary PageRank system was built on link data that others had previously ignored.
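PageRank itself is a good illustration of squeezing signal out of data no one else valued: it treats each link as a vote and iterates until the scores settle. A minimal sketch over a tiny invented link graph (this is the textbook power-iteration form, not Google’s production code):

```python
# A minimal PageRank sketch: power iteration over a tiny invented link graph.
# Illustrative only -- not Google's actual implementation.

links = {  # page -> pages it links to
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
}

def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            share = damping * rank[page] / len(outlinks)
            for target in outlinks:
                new[target] += share
        rank = new
    return rank

ranks = pagerank(links)
# "c" is linked to by both "a" and "b", so it ends up ranked highest.
print(max(ranks, key=ranks.get))
```

The insight was that the links themselves, raw data sitting in plain sight, encoded a ranking no one had bothered to compute.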
“Google is the first company to be born in an age of practically unlimited storage,” says Shirky. “The same way Facebook will always encourage people to over-share, Google will always try to collect as much data as possible.”
Even if Google really did collect the data with no idea of how it would use it, that still seems like an invasion of privacy. But Shirky feels the data collection was probably based on Google’s default approach to anything and everything: “We assume it's nefarious, when they’re just acting on their cultural DNA.” If Shirky’s right, Google may be more of the alien scientist than Big Brother, but it gives us little reason to expect a change in methods from Google in the future.
According to The New York Times, a Google spokeswoman has said the company now has much stricter privacy controls than it used to, partly due to the Street View controversy.