Microsoft: Zero Data Retention Not Possible to Keep Search Engines Viable

By Clint Boulton  |  Posted 2008-12-18

Yahoo's reduction of its retention period for user log data has some industry watchers calling for Google and Microsoft to do the same and predicting that pressure from government regulators will lead to zero retention policies in search next year.

Brendon Lynch, director of privacy strategy at Microsoft, told eWEEK zero retention policies are just not possible for Microsoft without reducing the quality of its Live Search offering, among other issues.

The issue flared up Dec. 17, when Yahoo pledged to reduce the period it saves the user log data its search engine gathers -- user queries, IP addresses and cookies that create digital trails -- from 13 months to 3 months. Yahoo, Google and Microsoft argue that data about users is necessary to provide quality search and to protect users from malicious users and scam artists.

The move by Yahoo, the No. 2 search engine provider, is easily the most aggressive to date. Search leader Google pared its data retention period from 18 months to 9 in September. No. 3 player Microsoft has been stuck at 18 months since July 2007, though it has said it would be willing to go down to six months if Google and Yahoo agreed to do the same.

Yet Yahoo's move was received with cautious praise by some privacy advocates who believe Yahoo, Google and Microsoft can do better. Peter Eckersley, staff technologist with consumer rights group Electronic Frontier Foundation, told eWEEK:

This looks like an attempt by Yahoo to keep a lot of information that they can use for their own internal research and engineering purposes, while being able to say "it would be extremely hard for us to find your search history file in this huge stack of search history files that we keep". That's a big step in the right direction.

However, Eckersley noted that Yahoo still retains 24 of the 32 digits of users' IP addresses, which means that if Yahoo had someone's IP address, and wanted to find their search history, it could dig out fifty or a hundred files and say that one of them belongs to that person. A human, or more likely a statistical analysis program, could then read them and match a file to that person.
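As a rough illustration of Eckersley's point, the sketch below keeps the first 24 bits of a 32-bit IPv4 address and counts how many addresses collapse onto the same truncated value. It assumes the discarded portion is the low-order 8 bits; the article does not describe Yahoo's actual method, and the function names here are hypothetical.

```python
import ipaddress


def truncate_ipv4(address: str, kept_bits: int = 24) -> str:
    """Zero the low-order bits of an IPv4 address, keeping the top kept_bits.

    Assumption: truncation drops the trailing bits. This is an illustration,
    not a description of any search engine's real anonymization process.
    """
    # strict=False lets ip_network mask off the host bits instead of raising.
    network = ipaddress.ip_network(f"{address}/{kept_bits}", strict=False)
    return str(network.network_address)


def candidate_count(kept_bits: int = 24) -> int:
    """Number of distinct IPv4 addresses that share one truncated prefix."""
    return 2 ** (32 - kept_bits)


if __name__ == "__main__":
    # 192.0.2.0/24 is a reserved documentation range, used here as sample data.
    print(truncate_ipv4("192.0.2.57"))
    print(candidate_count())
```

With 24 bits kept, at most 256 addresses share each truncated value, so a log entry narrows the search to a small pool of candidates rather than hiding the user. How many of those addresses belong to active searchers is not something the arithmetic alone can show.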

John Simpson, a privacy advocate for the non-profit consumer rights group Consumer Watchdog, said nothing less than a zero retention policy will suffice, arguing that since most users of Google or Yahoo return daily, they are constantly providing a new stream of personal data. His group wants users to have the option to control their data and browse anonymously.

But Microsoft's Lynch said the search data Live Search collects has a number of uses. In addition to analyzing users' search queries to improve query relevancy, Lynch said user log data helps Microsoft Live Search thwart security threats, keep people from gaming search ranking results, and combat click fraud scammers.

Microsoft Privacy Guru States His Case

After evaluating the issue, Microsoft concluded earlier this month that a six-month retention policy is feasible. If Google, Yahoo and Microsoft agreed to a six-month time frame, it would keep the playing field level for the search engines. "The company that has more data to analyze has the greater ability to improve the relevance of the search engine," Lynch said.

Does that mean Microsoft would match Yahoo's new 3-month policy? Lynch wouldn't bite, noting only that Microsoft is constantly reviewing what is an acceptable time frame. "Ultimately what we're looking for is a common approach across the industry and not just on timeframe."

But  months is unlikely to satisfy privacy advocates. EFF's Eckersley said Yahoo is clearly doing a better job on this issue than Google, which in most cases could look up a person's search history very easily for 18 months because it still keeps cookie IDs for 18 months, and hasn't announced any deletion of "giveaway" searches for things like names and phone numbers.

"A gold star to Yahoo, and a gold star to the European Union for scaring the search engines into offering Internet users more privacy," Eckersley said.

Consumer Watchdog's Simpson called on Google to give its search users control over their private data; transparency about how their data is gathered and used; and the right to give informed consent through opt-in functions, rather than having to sift through pages in order to opt out.

For its part, Google is content with its current policy, which the company halved from 18 months to nine in September.

"When we make changes to our policies, they are dependent on what will be best for our users both in terms of the services we provide and the respect of their privacy. It is a balance that we are continually evaluating," wrote Jane Horvath, senior privacy counsel at Google, in an e-mail sent to eWEEK.

Microsoft, Google and Yahoo are expected to make their cases in presentations in February to the European Commission's Article 29 Working Party, an advisory panel comprising data protection commissioners from each of the European Union's 27 member countries.

The meeting will be the latest battle in the tug of war between the government body, which is bent on protecting users' privacy, and the search providers, which are determined to store data to improve their services.

The European Commission has been more aggressive in regulating search engines to date. With the incoming administration of U.S. President-elect Barack Obama, it is unclear how this fight will evolve in 2009 at home and abroad.

