Google, UAL Gaffe Underscores Need for Smarter Web Crawlers
A 2002 Chicago Tribune story on UAL Corp.'s bankruptcy filing popped
up on Google News on Monday, sending shares in the parent company of
United Airlines to $3 from $12.50 on the Nasdaq. The stock rallied
Tuesday (thank heaven) and now sits at $10.50.
The Tribune described the incident in a press release, and Google gave its own account in a blog post.
Google said its search bot, which crawls pages online and catalogs
content, discovered a new link on Sept. 6 on the Web site of Tribune's
South Florida Sun-Sentinel newspaper, in a section called "Popular
Stories: Business."
The link did NOT, and this is key, include a dateline, but the Sun-Sentinel
page displayed a fresh date, "September 7, 2008" (Eastern), at the top of
the page above the article. Google added in its post:
Because the Sun-Sentinel
included a link to the story in its "Popular Stories" section, and
provided a date on the article page of September 7, 2008, the Google
News algorithm indexed it as a new story. We removed this story as soon
as we were notified that it was posted in error.
The article then became available through the Google News service, which
passed it along to people who had created a custom Google News alert
about United Airlines. By Monday, Sept. 8, the UAL story began
circulating via a post by research firm Income Securities Advisors that
was made available to users of financial news service Bloomberg LP.
Boom! UAL shares plummeted. Now, here's the trick. If human stock
traders had been the cause of UAL's stock drop, I would blame them. Even
without a date on the story, traders covering UAL would have known (we
hope) that it was old, and the sell-off would have been avoided.
Some say the issue could have been avoided if the Sun-Sentinel had
provided a publication date for the original Tribune article, enabling
the Google news crawler to reject the story as irrelevant.
Perhaps, but that's not necessarily true. As a Google Watcher and a
Google Alerts subscriber, I occasionally see articles on Google Alerts
that were published months ago. But because I cover Google extensively
(some would say exhaustively), I have enough historical knowledge to
discern old from new, even without a date on the article.
Unfortunately, human traders weren't the cause of the UAL issue.
Instead, some search crawlers trolling the Web for news based on
headlines and financial data executed stock trades on the fly. Unlike
humans, who consider such metadata as the date and history of UAL, the
machines apparently can't yet parse the deeper meaning behind the
searches.
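
To make the date problem concrete, here is a minimal sketch in Python of the kind of check a news-reading program could run before treating a story as new. It is not how Google News or any trading system actually works. The `CrawledArticle` type, the `classify_freshness` function, and the two-day threshold are all hypothetical names and values chosen for illustration. The idea is simply to keep the date stamped on the page separate from the article's own publication date, and to refuse to call a story fresh when the latter is missing.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class CrawledArticle:
    """Hypothetical record of what a crawler saw on a page."""
    headline: str
    page_date: date                   # date printed on the page template
    publication_date: Optional[date]  # the article's own dateline, if any


def classify_freshness(article: CrawledArticle, today: date,
                       max_age_days: int = 2) -> str:
    """Return "fresh", "stale", or "unknown" (assumed labels).

    The page date is deliberately ignored: it only says when the page
    was rendered, not when the story was written.
    """
    if article.publication_date is None:
        # No dateline: do not guess from the page date.
        return "unknown"
    age = today - article.publication_date
    if age > timedelta(days=max_age_days):
        return "stale"
    return "fresh"


today = date(2008, 9, 8)

# The situation described above: a fresh page date, no dateline.
undated = CrawledArticle("UAL bankruptcy story", date(2008, 9, 7), None)

# The same story with a dateline. The day and month are placeholders;
# the source only says the story ran in 2002.
dated = CrawledArticle("UAL bankruptcy story", date(2008, 9, 7),
                       date(2002, 12, 1))

print(classify_freshness(undated, today))
print(classify_freshness(dated, today))
```

Under these assumptions the undated story comes back as "unknown" rather than new, and the dated one comes back as "stale." A program that trades on headlines could hold anything in those two buckets for a human to review instead of acting on it.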