Earlier this year, Google fought a request from the Department of Justice for search records that the DOJ wanted to use as evidence to defend the Child Online Protection Act. At the time, many applauded Google's resistance and its stand (or, at least, its perceived stand) for users' privacy.
However, there also were many in the media and technology communities who said that the DOJ's request was reasonable and that it didn't affect user privacy at all. After all, the information wouldn't include any individual identifier information; it would just be collections of search terms that were entered by users.
Many said, “What's the big privacy issue here? It's not like you can tell who someone is by what they search for, right?”
Well, that question has been answered clearly by the recent America Online debacle: AOL temporarily posted on the Web detailed information on thousands of user searches. And although AOL eventually took down this information, it was too late to keep it from spreading across the Internet.
And what do you know? It actually isn't that hard to identify someone just by their search information. Several national news outlets have been able to successfully identify individuals based solely on groupings of search terms. One of the main reasons this works is that people like to search for information on themselves or on people they know, not realizing that these “ego searches” are often clear markers for their entire search history.
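To make the mechanics concrete, here is a minimal Python sketch. It assumes the released log tags each query with a number that stays the same for one person; the IDs, names, queries and the function name are all invented for illustration and are not taken from any real log.

```python
from collections import defaultdict

# Hypothetical released log: (pseudonymous_id, query). All values are made up.
RELEASED_LOG = [
    (1001, "jane example"),
    (1001, "plumbers in springfield"),
    (1001, "how to rob a bank"),
    (2002, "cheap flights"),
]


def histories_containing(log, name):
    """Return the full query history of every ID that searched for `name`."""
    by_id = defaultdict(list)
    for pseudo_id, query in log:
        by_id[pseudo_id].append(query)
    # One query containing the name is enough to surface the whole history.
    return {
        pseudo_id: queries
        for pseudo_id, queries in by_id.items()
        if any(name in query for query in queries)
    }


matches = histories_containing(RELEASED_LOG, "jane example")
```

The point of the sketch is that nothing clever is required: a single ego search ties a name to an ID, and the ID ties that name to everything else.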
Doesn't that make you feel all warm and fuzzy inside, knowing that out there in some database, just waiting to be released to the public, is a list of everything you've looked for on search engines?
For me, all this was like a cold splash of water in the face. For one thing, given my profession, there are lots of readers, vendors and public relations workers out there who do searches on my name, and it's probably safe to say that some of these people then move on to searches of, well, let's just say less-than-business-related topics.
But there are also searches that I've done that might not look too good out of context.
Like most people who write for a living, I harbor the desire to write the great American novel. The main characters in one book I've been working on are bank robbers. Never having robbed a bank myself, I've done lots of Web searches for books and resources about actual bank robbers and their methods.
This means that there is quite possibly some search record sequence that goes something like, “Jim Rapoza”/“How to rob a bank”/“Techniques of actual bank robbers”/“eWeek Labs”/“True stories of bank robbers”/“Books on real bank robberies.”
In the eyes of a law enforcement person, that sequence of searches could look pretty damning. Now that I think about it, that truck has been parked outside my house for an awful long time. And why does my phone keep making funny clicking noises?
But, really, this is nothing to joke about. Some people might feel smug because they don't use AOL Search, but every search vendor, including Google, keeps similar records of queries. And most of the big search players, including Yahoo, Microsoft and AOL, did give the DOJ the search record data it requested, but didn't get, from Google.
Ideally, the major search vendors would all agree to stop saving query data in this form. They should either keep it only long enough to perform analysis, or they should save it in such a way that searches are all jumbled up, so no two queries from the same person are together. But something like this probably won't happen unless there's a big user outcry.
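Neither option is exotic. As a rough sketch in Python, under the assumption that a log row is simply a user ID, a timestamp and a query (the row layout, the sample data, the 30-day window and both function names are hypothetical, not any vendor's actual practice):

```python
import random
from datetime import datetime, timedelta

# Hypothetical log rows: (user_id, timestamp, query). All values are made up.
RAW_LOG = [
    ("user-17", datetime(2006, 8, 1, 9, 0), "jim rapoza"),
    ("user-17", datetime(2006, 8, 1, 9, 2), "books on real bank robberies"),
    ("user-42", datetime(2006, 8, 1, 9, 5), "weather boston"),
    ("user-17", datetime(2006, 5, 1, 9, 0), "eweek labs"),
]


def drop_expired(log, now, max_age):
    """Option 1: keep a row only while it is no older than max_age."""
    return [row for row in log if now - row[1] <= max_age]


def unlink(log, rng):
    """Option 2: keep the queries but discard what ties them together."""
    # Drop the user ID and coarsen the timestamp to a date, then shuffle so
    # storage order no longer hints at which queries came from one person.
    rows = [(timestamp.date(), query) for _user, timestamp, query in log]
    rng.shuffle(rows)
    return rows


now = datetime(2006, 8, 2)
recent = drop_expired(RAW_LOG, now, timedelta(days=30))  # assumed 30-day window
stored = unlink(recent, random.Random())
```

Even then, jumbling only removes the links between queries. A query that contains a name still identifies someone all by itself.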
Me? I'm going to try to be a lot more savvy about where and how I search for things. If I decide to write a book on serial killers, for example, I won't do searches on, say, Jeffrey Dahmer and Ed Gein at the same time as other, more personally identifying searches. I'm also going to make more regular use of anonymity-protecting tools such as Tor.
Because, when I do a search, I'd rather it be a one-way transaction: me looking for data on the search engine rather than the search engine looking for data on me.
Labs Director Jim Rapoza can be reached at [email protected].