Google Instant Blacklist Attributed to Imperfect Autocomplete

 
 
By Clint Boulton  |  Posted 2010-09-30
 
 
 


Google has attributed a so-called blacklist of search terms, which users found blocked in its new Google Instant predictive search technology, to an autocomplete system the company is continuing to refine.

Hacker magazine 2600 compiled the list of words that won't surface when users search for them using Google Instant, which renders results automatically on the fly.

This list includes single terms such as "amateur," "lesbian" and "bisexual" and compound queries such as "barenaked ladies" (this is a band) and "girls gone wild" (this is a line of reality TV videos where women bare themselves).

Words that aren't blacklisted include "sadist," "feminazi" and "commie." To see results for blacklisted words, users must hit enter. 2600 discussed the so-called Google Blacklist (found via Mashable):

"Like everything these days, great care must be taken to ensure that as few people as possible are offended by anything. Google Instant is no exception. Somewhere within Google there exists a master list of 'bad words' and evil concepts that Google Instant is programmed to not act upon, lest someone see something offensive in the instant results... even if that's exactly what they typed into the search bar. We call it Google Blacklist."

2600 invited users to test the blacklist themselves by first typing in "puppy" and then "bitch," a word for a female dog that has for years also been used as a derogatory term for, among other things, a disliked female.

Results for "puppy" fill the screen with Google Instant, while "bitch," using the same technology, requires users to manually hit the enter key to complete the search.

The point, 2600 argued, is that Google Instant is being used as a tool that can be "filtered, controlled and ultimately suppressed." 2600 punctuated the insinuation that Google is censoring its search by noting: "It is indeed a good thing that Google isn't evil."

This "filtering" is the way Google's Autocomplete technology has worked for a few years, predating Google Instant, a Google spokesperson told eWEEK.

Google Instant Available in More Countries


As such, the technology is imperfect; Google Instant merely shines a spotlight on the flaws because it surfaces results for search terms on the fly.

"We apply a narrow set of removal policies for pornography, violence and hate speech," Google said. "It's important to note that removing queries from autocomplete is a hard problem, and not as simple as blacklisting particular terms and phrases.

"In search, we get more than one billion searches each day. Because of this, we take an algorithmic approach to removals, and just like our search algorithms, these are imperfect. We will continue to work to improve our approach to removals in Autocomplete and are listening carefully to feedback from our users.

"Moreover, Google's algorithms look at compound queries based on those search words across all languages. For example, where there's a bad word in Russian entered into the search box, Google may remove a compound word including the transliteration of the Russian word into English.

"Also, if the results for a particular query seem pornographic, Google's algorithms may remove that query from Autocomplete, even if the query itself wouldn't otherwise violate our policies. This system is neither perfect nor instantaneous, and we will continue to work to make it better."

The filtering question was broached upon Google Instant's launch Sept. 8. In the three weeks since, millions of people have used the predictive search technology, with some searchers compiling lists of terms Instant fails to render on the fly.

Meanwhile, Google gave Instant Search a boost Sept. 29 by adding keyboard navigation to help users search without a mouse and by making the technology available within search categories such as videos, news, books, blogs, updates and discussions.

Google Instant is also rolling out in the domains for 12 new countries: Austria, Belgium, Canada, the Czech Republic, Ireland, Mexico, the Netherlands, Poland, Slovakia, Slovenia, Switzerland and Ukraine.
