Microsoft: Zero Data Retention Not Possible to Keep Search Engines Viable
Yahoo's decision to shorten how long it retains user log data has some industry watchers calling for Google and Microsoft
to do the same and predicting that pressure from government regulators
will lead to zero retention policies in search next year.
Brendon Lynch, director of privacy strategy at Microsoft, told eWEEK
that zero retention policies are just not possible for Microsoft without
reducing the quality of its Live Search offering, among other issues.
The issue flared up Dec. 17 when Yahoo pledged
to reduce the period it saves the user log data its search engine
gathers -- user queries, IP addresses and cookies that create digital
trails -- from 13 months to 3 months. Yahoo, Google and Microsoft
argue that data about users is necessary to provide quality search
and to protect users from malicious users and scam artists.
The move by Yahoo, the No. 2 search engine provider, is easily the most
aggressive to date. Search leader Google pared its data retention
period from 18 months to 9 in September. No. 3 player Microsoft has
been stuck at 18 months since July 2007, though it has said it would be
willing to go down to six months if Google and Yahoo agreed to do the
same.
Yet Yahoo's move was received with cautious praise by some privacy
advocates who believe Yahoo, Google and Microsoft can do better. Peter
Eckersley, staff technologist with consumer rights group Electronic
Frontier Foundation, told eWEEK:
This looks like an attempt by Yahoo to keep a lot of information
that they can use for their own internal research and engineering
purposes, while being able to say "it would be extremely hard for us to
find your search history file in this huge stack of search history
files that we keep". That's a big step in the right direction.
However, Eckersley noted that Yahoo still retains 24 of the 32
bits of users' IP addresses, which means that if Yahoo had someone's
IP address, and wanted to find their search history, it could dig out
fifty or a hundred files and say that one of them belongs to that
person. A human, or more likely a statistical analysis program, could
then read them and match a file to that person.
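To see why a partly truncated address still narrows the field, consider a rough sketch of what keeping 24 of the 32 bits of an IPv4 address implies. The function name truncate_ip and the sample address are illustrative assumptions, as is the assumption that the leading 24 bits are the ones kept; the article does not describe Yahoo's actual procedure.

```python
import ipaddress

def truncate_ip(address: str, kept_bits: int = 24) -> ipaddress.IPv4Network:
    """Keep only the leading kept_bits of an IPv4 address (hypothetical helper).

    The remaining low-order bits are zeroed, so the result is the block of
    addresses that all collapse to the same truncated log entry.
    """
    return ipaddress.ip_network(f"{address}/{kept_bits}", strict=False)

# Sample address taken from a range reserved for documentation.
bucket = truncate_ip("203.0.113.77")
print(bucket)                # 203.0.113.0/24
print(bucket.num_addresses)  # 256
```

Under these assumptions, a truncated log entry can be traced back to at most 256 possible addresses, fewer in practice because not every address in a block is in active use.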
John Simpson, a privacy advocate for the non-profit consumer rights
group Consumer Watchdog, said nothing less than a zero retention policy will
suffice, arguing that since most users of Google or Yahoo return daily,
they are constantly providing a new stream of personal data. His group
wants users to have the option to control their data and browse
anonymously.
But Microsoft's Lynch said the search data Live Search collects has a
number of uses. In addition to analyzing users' search queries to
improve query relevancy, Lynch said user log data helps Microsoft Live
Search thwart security threats, keep people from gaming search ranking
results, and combat click fraud scammers.