AOL Screw-up Causes Search Data Spill

Facing a firestorm of privacy-related controversy, AOL blames an internal "screw-up" for the release of keyword search data for 658,000 users.

AOL on Aug. 7 blamed an internal "screw-up" for the embarrassing release of detailed keyword search data for roughly 658,000 anonymized users.

The mea culpa from Dulles, Va.-based AOL comes amid a firestorm of criticism from privacy advocates, who warn that the information, which amounts to about 20 million search queries, could be traced back to AOL users.

For the Internet division of Time Warner, already struggling to repair its image with users, the data spill is another black eye.

"This was a screw-up, and we're angry and upset about it," AOL spokesperson Andrew Weinstein said in a statement sent to eWEEK.

Weinstein said the release of the data was an innocent attempt to reach out to the academic community with new research tools.

"It was obviously not appropriately vetted, and if it had been, it would have been stopped in an instant," he added.

The data, which has been mirrored on multiple Web sites, represents a random selection of searches conducted over a three-month period (March to May 2006). Each record includes a numbered User ID, the actual query, the time of the search and the destination domain visited.

The data included only U.S. searches conducted within the AOL client software, Weinstein said.

The AOL usernames were replaced with random ID numbers, but every search a user conducted remains tied to that user's number.

In some cases, there is a theoretical possibility that the search queries could be used to personally identify an AOL user.
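To see why a shared ID number matters, consider a minimal sketch in Python. It assumes records shaped as the article describes (numbered user ID, query, time of search, destination domain); every value, along with the function name queries_by_user, is invented for illustration and is not drawn from the released file or from any AOL tool.

```python
from collections import defaultdict

# Hypothetical records in the shape the article describes:
# (numbered user ID, query, time of search, destination domain).
# Every value below is invented for illustration.
records = [
    (1001, "jane example", "2006-03-04 10:15:00", "example.com"),
    (1001, "plumbers in springfield", "2006-03-05 18:02:00", "example.org"),
    (2002, "cheap flights", "2006-04-11 09:30:00", "example.net"),
    (1001, "jane example springfield", "2006-05-20 21:47:00", ""),
]


def queries_by_user(rows):
    """Group query strings under the pseudonymous ID they share."""
    grouped = defaultdict(list)
    for user_id, query, _timestamp, _domain in rows:
        grouped[user_id].append(query)
    return dict(grouped)


profiles = queries_by_user(records)
for user_id, queries in sorted(profiles.items()):
    print(user_id, queries)
```

Because the ID stays constant across searches, a vanity search and a location-specific query can end up side by side in one profile, even though no username appears anywhere in the data.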

"Although there was no personally identifiable data linked to these accounts, we're absolutely not defending this. It was a mistake, and we apologize. We've launched an internal investigation into what happened, and we are taking steps to ensure that this type of thing never happens again," Weinstein said.

Weinstein acknowledged that vanity searches, where users enter their own names into search engines, can sometimes lead to a privacy risk.

He said the total data set released covered roughly 1.4 percent of search users in May 2006, or about one-third of 1 percent of the total searches conducted through the AOL network over that period.

"AOL is in damage control mode—the fact that they took the data down shows that someone there had the sense to realize how destructive this was, but it is also an admission of wrongdoing of sorts," said TechCrunch blogger Mike Arrington.

"The utter stupidity of this is staggering. AOL has released very private data about its users without their permission. While the AOL username has been changed to a random ID number, the ability to analyze all searches by a single user will often lead people to easily determine who the user is, and what they are up to," he added.

Arrington said that in some cases the data included personal names, addresses and Social Security numbers.

