A vice president for Microsoft’s Bing team denied configuring the Bing algorithm to copy Google search results in an effort to boost search market share.
What Bing does use is “clickstream” data, or customer data collected through its Bing toolbar, to improve its search results, said Harry Shum, corporate vice president of core search development at Microsoft.
Shum defended Bing at a search engine roundtable Feb. 1 and accused Google of cultivating “spy-novelesque” propaganda where none truly exists.
The roundtable included Google search quality engineer Matt Cutts, who told Search Engine Land that Bing was copying Google results ahead of the “Bing Presents Farsight 2011: Beyond the Search Box” event today.
Cutts explained that Google noticed in June 2010 that Bing returned the same results as Google for some queries, despite the fact that the queries were misspelled.
The trend continued through December, so Google ran an experiment, having its search engineers perform searches on 20 Windows laptops running Internet Explorer 8 with Suggested Sites and the Bing Toolbar enabled.
Cutts said Google created code that would allow it to manually rank a page for a certain term, and then created 100 synthetic searches to test its theory. The fake queries returned no matches on Google or Bing, so Google planted a honeypot page at the top of the results for each synthetic search.
A few weeks later, a small number of Bing search results appeared to copy Google’s results from the synthetic searches. Google concluded that the Bing toolbar was clearly collecting data about what users did on Google.
“Microsoft has said they don’t copy the results, and we have screenshots that make it look very much like that’s what’s happening,” Cutts said to Shum.
Bing VP Defends Microsoft’s Search Practices
Shum said Bing search results that appeared to mimic Google’s were outliers, or coincidental examples, and denied that Microsoft is copying Google results. Cutts disagreed, noting that Google found the examples of overlap in popular queries.
Shum confirmed that Bing collects and uses customers’ Web browsing data as one of more than 1,000 different signals and features in its ranking algorithm.
Shum provided more detail in a blog post published during the roundtable, noting:
“A small piece of that is clickstream data we get from some of our customers, who opt-in to sharing anonymous data as they navigate the web in order to help us improve the experience for all users.”
Shum also said that other search engines rely on this “collective intelligence” practice, an assertion that Cutts “categorically” denied.
Shum, who compared Google’s synthetic searches with spam and click fraud, said he wished Google had approached the Bing team with its concerns before airing them to the press. He offered to compare search signals and algorithms with Google.
“What we saw in today’s story was a spy-novelesque stunt to generate extreme outliers in tail query ranking,” Shum added in his blog. “It was a creative tactic by a competitor, and we’ll take it as a back-handed compliment. But it doesn’t accurately portray how we use opt-in customer data as one of many inputs to help improve our user experience.”
The dialogue between Cutts and Shum eventually devolved into a he-said, he-said dead-end, prompting moderator Vivek Wadhwa to move the discussion to spam on search engines.
What’s remarkable about the exchange is that it’s the first very public grievance aired between Google search and Microsoft Bing engineers since Bing appeared in June 2009.
Then again, perhaps it’s not so remarkable considering how little has changed in the market since Bing arrived. Bing rallied to gain a few percentage points of share while Google completed 2010 with its highest ever search share, at 66.6 percent, according to comScore.