Google Is Watching

Google’s collection of information about Wi-Fi networks may not breach any laws but concerns loom over the company’s attitude to private data.

This article was originally published on 18 May 2010 on newmatilda.com.

Electronic Frontiers Australia and the Australian Privacy Foundation raised concerns last week about Google’s use of its Street View cars to collect identifying information about Wi-Fi networks for use in its geolocation service. While that identifying information is relatively harmless, Google has now admitted that it has accidentally collected data sent by users on unencrypted Wi-Fi networks too.

The first half of this story concerns the identifying information about Wi-Fi networks that Google was trying to collect. To explain the practice, we need to cover some basic Wi-Fi concepts.

Each Wi-Fi network is identified by a human-readable name called an SSID (like ‘My Wireless Network’) and a unique hexadecimal number which is usually assigned by the manufacturer of the Wi-Fi access point and called a BSSID or MAC address (like 00-17-9A-76-CB-A6).

Normally, a Wi-Fi access point will publicly broadcast its SSID and BSSID so that nearby computers can display the Wi-Fi network to users in a list of available networks, though most Wi-Fi access points allow you to disable this broadcast if you want.

In addition to that, the BSSID is always sent together with any data transmitted over the Wi-Fi network. Since multiple Wi-Fi networks can operate in the same space, devices connected to a Wi-Fi network need to be able to distinguish between data meant for their network and data meant for other nearby Wi-Fi networks. The devices do this by tagging transmitted data with the BSSID of the Wi-Fi network to which they are connected.
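
As a rough illustration of that tagging, the Python sketch below models a frame as a small record carrying a BSSID, and shows a device keeping only the frames meant for its own network. The `Frame` class, the `frames_for_network` function and the second BSSID are hypothetical, invented for this example; real Wi-Fi hardware does this filtering on 802.11 frame headers rather than on Python objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """A simplified stand-in for a Wi-Fi data frame (hypothetical)."""
    bssid: str      # identifier of the network the frame belongs to
    payload: bytes  # the data being carried


def frames_for_network(frames, my_bssid):
    """Keep only the frames tagged with the BSSID of our own network."""
    wanted = my_bssid.upper()
    return [f for f in frames if f.bssid.upper() == wanted]


# Two networks sharing the same airspace (second BSSID is made up).
air = [
    Frame("00-17-9A-76-CB-A6", b"for my network"),
    Frame("00-1B-2C-3D-4E-5F", b"for the neighbour"),
    Frame("00-17-9a-76-cb-a6", b"also for my network"),
]

mine = frames_for_network(air, "00-17-9A-76-CB-A6")
print([f.payload for f in mine])
```

The comparison ignores letter case because the same hexadecimal address can be written in either upper or lower case.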

Finally, it is important to note that the SSID and BSSID are used in the way described above irrespective of whether the Wi-Fi network is secured with a password (WEP, WPA, or WPA2) or not.

It was Google’s collection of the SSIDs and BSSIDs of Wi-Fi networks around Australia that initially gave rise to privacy concerns last week. What Google did was mount Wi-Fi antennas to the roofs of the cars that drive around Australia taking photographs of the roadside for Google Maps Street View. As these cars mapped each city, they collected packets of data sent over nearby Wi-Fi networks. The idea was to take the SSIDs and BSSIDs from the collected packets of data, and to store them in a database together with the information about the location where the SSIDs and BSSIDs were seen.
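
The collection step described above might be sketched as follows. This is an illustration only, not Google’s code: the `Sighting` record, the `record_sighting` function and the coordinates are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sighting:
    """Where a network was last seen (hypothetical record)."""
    ssid: str
    latitude: float
    longitude: float


def record_sighting(database, ssid, bssid, latitude, longitude):
    """Store the network's name and position, keyed by its BSSID.

    A later sighting of the same BSSID replaces the earlier one, so the
    database holds the location where the network was last seen.
    """
    database[bssid.upper()] = Sighting(ssid, latitude, longitude)


database = {}
# Made-up coordinates; the second sighting overwrites the first.
record_sighting(database, "My Wireless Network", "00-17-9A-76-CB-A6", -33.8675, 151.2070)
record_sighting(database, "My Wireless Network", "00-17-9a-76-cb-a6", -33.8676, 151.2071)

print(len(database))
print(database["00-17-9A-76-CB-A6"])
```

Keying the database on the BSSID rather than the SSID reflects the fact that many networks can share a human-readable name, while the BSSID is meant to be unique.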

Google could then use the collected information to provide a geolocation service to its users. The next time a user wanted to know his or her approximate location, he or she could send the SSIDs and BSSIDs of Wi-Fi networks that were nearby to Google. Google could then look up the SSIDs and BSSIDs in its database, retrieve the location where its Street View cars last saw those SSIDs and BSSIDs, and send that approximate location to the user.
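
A minimal sketch of that lookup, in Python, is given below. It assumes the simplest possible approach: average the last-seen positions of whichever nearby networks are in the database. That averaging is an assumption for illustration and not a description of how Google’s service actually computes a location; the `estimate_location` function, the second and third BSSIDs and the coordinates are likewise made up.

```python
def estimate_location(database, nearby_bssids):
    """Estimate a position from the networks a device can currently see.

    `database` maps a BSSID to the (latitude, longitude) where that
    network was last seen. Unknown BSSIDs are ignored. Returns None when
    none of the nearby networks are known.
    """
    known = [database[b.upper()] for b in nearby_bssids if b.upper() in database]
    if not known:
        return None
    latitude = sum(lat for lat, _ in known) / len(known)
    longitude = sum(lon for _, lon in known) / len(known)
    return (latitude, longitude)


# Made-up database of previously sighted networks.
database = {
    "00-17-9A-76-CB-A6": (-33.8670, 151.2070),
    "00-1B-2C-3D-4E-5F": (-33.8680, 151.2080),
}

# A device that can see both known networks gets a point between them.
print(estimate_location(database, ["00-17-9a-76-cb-a6", "00-1B-2C-3D-4E-5F"]))
# A device that sees only unknown networks gets no answer.
print(estimate_location(database, ["AA-BB-CC-DD-EE-FF"]))
```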

In other words, Google’s geolocation service has the same function as GPS: it gives the user his or her location. However, whereas GPS uses the user’s distance from GPS satellites of known location to estimate the user’s location, Google’s geolocation service uses the distance from Wi-Fi networks of known location.

And there is nothing unique about Google’s geolocation service. There are many other geolocation providers that use Wi-Fi networks this way, such as Skyhook Wireless and Geomena.

Whether the practice poses privacy problems is a bit more complicated. In Australia, the principal privacy legislation is the Privacy Act 1988 (Cth), which regulates the collection, use, and disclosure of ‘personal information’. Personal information is defined as information about an individual whose identity is apparent or can be reasonably ascertained from that information.

Ordinarily, information about the location of a Wi-Fi network with a particular SSID or BSSID would not fall within this definition of personal information because it cannot readily be linked to an individual, although the position may be different with respect to Wi-Fi networks that use a surname or phone number as the SSID. It is because this information does not ordinarily identify an individual that its collection probably does not breach privacy laws, and does not pose a privacy problem for most people.

And it is for that reason that the common concern that you could be located using the information that Google collected about your Wi-Fi network is unfounded. Google does not store your details; it stores the SSID and BSSID of your Wi-Fi network. To get the location of your Wi-Fi network back from Google’s geolocation service, a person would have to supply, at the very least, your Wi-Fi network’s SSID and BSSID. It may be conceivable that such a person would guess the human-readable name or SSID that you have assigned to your Wi-Fi network, but he or she would not be able to guess the corresponding unique hexadecimal number or BSSID. The only way that the person could get that information would be to be within range of your Wi-Fi network, and at that point, the person would already know your approximate location.

Another concern, one with more merit, is that websites that you visit might know what Wi-Fi network you are connected to, or what Wi-Fi networks you are near, and then query Google’s geolocation service to find out your approximate location. The important thing here is that your browser does not send information about what Wi-Fi network you are connected to, or what Wi-Fi networks you are near, to the websites that you visit. Sites that you visit simply do not have access to it. The qualification here is that some browsers now have the ability to send information about your location to geolocation services. However, such functionality works on an opt-in basis.

So that is the first half of the story. Things took a turn on Friday, however, when Google admitted that its Street View cars had collected not only SSIDs and BSSIDs as intended, but also some of the data that users sent over nearby unencrypted Wi-Fi networks. As its cars received packets of Wi-Fi data, rather than stripping the SSIDs and BSSIDs out of each packet and discarding the rest, Google’s software saved the entire packet, which was later stored on Google’s servers.
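
The difference between what Google says it intended and what actually happened can be illustrated with a short Python sketch. It is not Google’s software and makes no claim about how that software was written; the `Packet` class, both functions and the sample payload are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """A simplified captured Wi-Fi packet (hypothetical)."""
    ssid: str
    bssid: str
    payload: bytes  # user data; readable if the network is unencrypted


def store_identifiers_only(packet, storage):
    """Intended behaviour: keep the network identifiers, drop the rest."""
    storage.append({"ssid": packet.ssid, "bssid": packet.bssid})


def store_whole_packet(packet, storage):
    """Behaviour as described: the entire packet is kept, payload included."""
    storage.append({"ssid": packet.ssid, "bssid": packet.bssid,
                    "payload": packet.payload})


captured = Packet("My Wireless Network", "00-17-9A-76-CB-A6", b"Dear Alice, ...")

intended, actual = [], []
store_identifiers_only(captured, intended)
store_whole_packet(captured, actual)

print("payload" in intended[0])
print("payload" in actual[0])
```

In the first case the user’s data never reaches storage; in the second it is retained alongside the network identifiers.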

That means that if you were using an unencrypted Wi-Fi network as a Google Street View car drove past your house, a copy of whatever you were doing could have been collected and stored on Google’s servers together with your approximate location. Whether the data can identify you personally would depend on what you were doing at the time it was collected. If Google happened to come by your house as you were sending an email, then it may have collected personally identifiable information about you (the email together with the sender and recipient).

Collection of such data could very well breach the Privacy Act 1988 (Cth) or the Telecommunications (Interception and Access) Act 1979 (Cth), which prohibits the interception of communication, including email, passing over certain networks, including Wi-Fi networks. And quite irrespective of whether any law is breached, the practice is a cause for concern.

Google has explained that the collection of this additional data was a programming error. It maintains that it intended to collect and store only the SSIDs and BSSIDs of the Wi-Fi networks that its cars passed. And I have no doubt that that is true. The additional data is of minimal use to Google, and its deliberate collection would be an order of magnitude more irresponsible than anything I would expect of Google.

However, that this additional data was collected in error does not make what happened here any more acceptable. This is the second time this year that Google has taken a cavalier attitude towards privacy.

In February, Google released Google Buzz, a Gmail-based social-networking tool. It quickly came to light that Buzz publicly disclosed the email addresses of people whom Buzz users emailed most frequently, among other information, without seeking users’ specific consent first. Many users were caught off guard when their data was unintentionally disclosed to other parties, like abusive ex-husbands.

Google has since corrected its problems with Buzz, but you cannot help but get a feeling of déjà vu as you read Google’s explanation of how it snared unencrypted Wi-Fi data. Google has now vowed to delete the collected data, and to submit itself to a third-party audit to verify that deletion — which was the right thing to do. And it has gone as far as to stop using Street View cars to collect Wi-Fi networking information altogether.

But in light of Google’s recent track record in safeguarding privacy, it would be wise for people to begin questioning what data they disclose to Google. Where people disclose data, whether by entering a search term in Google Search, sending email via Gmail, or broadcasting something as an SSID to the public, it is important that they understand how that data could be used, so that they can question how it is used.