Google services in some parts of the world were briefly interrupted Thursday morning as a result of a routing error at an Internet service provider (ISP) in India.
Traffic bound for Google from different parts of the world was incorrectly routed through India for about 20 minutes, causing service delays and interruptions for people on different ISP networks around the world.
Doug Madory, director of Internet analysis at Internet performance company Dyn Research, reported the problem in a blog post Thursday morning. Among the ISPs that accepted the incorrect routing from the Indian ISP were U.S. carriers Cogent and Level 3, France’s Orange, Singapore Telecom and Pakistan Telecom, Madory said.
The problem started when Indian broadband provider Hathway incorrectly announced transit routes to Google to its Indian transit provider Bharti Airtel, Madory said in an interview with eWEEK.
Hathway, like many ISPs around the world, maintains what is known as a peering relationship with Google and presumably other major content providers, Madory said. Peering relationships allow ISPs to directly exchange traffic with companies like Google without having to go through an intermediary transit provider. In this case, for example, Hathway’s peering relationship with Google allows it to directly exchange traffic with Google using specific Internet routes between the two companies. Such private routes are typically faster and cheaper for the ISPs and are meant only for the partners in a peering arrangement.
“In this case, the Indian provider leaked those routes to one of their transit providers,” which, in turn, announced the routes to the rest of the Internet, Madory said.
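The rule that was broken here can be sketched in a few lines. The sketch below assumes the widely used convention that routes learned from a peer or a transit provider are passed on only to a network's own customers. The names `Relationship` and `may_export` are illustrative only and are not taken from any router software or from Madory's analysis.

```python
from enum import Enum


class Relationship(Enum):
    """How a neighboring network relates to us (hypothetical labels)."""
    CUSTOMER = "customer"
    PEER = "peer"
    PROVIDER = "provider"


def may_export(learned_from: Relationship, export_to: Relationship) -> bool:
    """Return True if a route learned from one kind of neighbor
    may be announced to another kind of neighbor."""
    # Routes learned from paying customers may be announced to everyone.
    if learned_from is Relationship.CUSTOMER:
        return True
    # Routes learned from peers or providers may go only to customers.
    return export_to is Relationship.CUSTOMER


# The situation described above: routes learned over a peering session
# were announced to a transit provider.
leak_allowed = may_export(Relationship.PEER, Relationship.PROVIDER)
print(leak_allowed)
```

Under this convention the check returns False for the peer-to-provider case, which is the kind of announcement an export filter is meant to block.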
Not all ISPs accepted the leaked routes, but in situations where they did, traffic bound for Google was redirected through the Airtel and Hathway networks.
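Why one network would pick a leaked route while another ignores it can be illustrated with a heavily simplified version of BGP route selection, which compares local preference first and AS path length second. The article does not say how the affected ISPs were configured, so the preference values, the prefix and the AS numbers below are assumptions drawn from ranges reserved for documentation, and `Route` and `best_route` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple   # AS numbers the announcement has passed through
    local_pref: int  # set by the receiving network's own policy


def best_route(candidates):
    """Pick the route with the highest local preference,
    breaking ties by the shortest AS path (simplified)."""
    return min(candidates, key=lambda r: (-r.local_pref, len(r.as_path)))


# Direct route to the content network (hypothetical AS 64500).
direct = Route("203.0.113.0/24", (64500,), local_pref=100)

# The same prefix arriving via a leak through two extra networks.
# A receiver that assigns this neighbor a higher local preference
# will choose it despite the longer path.
leaked_preferred = Route("203.0.113.0/24", (64510, 64511, 64500), local_pref=200)
leaked_equal = Route("203.0.113.0/24", (64510, 64511, 64500), local_pref=100)

print(best_route([direct, leaked_preferred]).as_path)
print(best_route([direct, leaked_equal]).as_path)
```

In the first case the longer, leaked path is selected because of policy. In the second, the receiver treats both neighbors equally and the shorter direct path is selected.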
“What was happening was that it greatly increased latency” for people trying to access a range of Google services, Madory said. “So stuff was timing out. Perhaps the ISP was getting overwhelmed.”
The disruption lasted only about 20 minutes before it was corrected, but it nevertheless highlights the problems that can arise when private routing information between Internet peers enters the public Internet, Madory said. The result can be Internet misdirection on a global scale, he noted, pointing to previous incidents where similar routing leaks have caused major problems.
In one incident, traffic to Microsoft was severely affected as a result of a similar leak of routing information by an ISP in Australia. In that case, people trying to access Microsoft services from different parts of the world were routed through the Australian ISP, and delays and service disruptions resulted.
In another incident, a routing leak by a Pennsylvania-based hosting company ended up disrupting traffic in such faraway regions as Pakistan and Bulgaria, Madory said. Sometimes, such errors can remain undetected for days and cause prolonged service degradation, he said.
For content networks like Google, such routing accidents can be problematic, Madory said. “Somebody else has the ability to mess up their traffic. So there’s a bit of trust they need to have with their peers,” when entering peering relationships, he said.
Network monitoring and routing security company BGPMon said that, in all, 336 Google prefixes were affected during Thursday morning’s incident.
“The leaked routes were detected by a few dozen of our European peers at several European Internet Exchanges, including the London Internet Exchange, the Amsterdam Internet Exchange as well as the Moscow Internet Exchange,” Andree Toonk, BGPMon’s manager of network engineering, said in a blog post.
“The list of networks that selected this leaked path to reach Google included a large national telecom provider based in Europe as well as a global Tier1 provider,” he said.
Though Thursday’s hiccup was caused entirely by a third party, this is the second time in less than a month that Google has suffered a service issue.
In February, the company blamed a software glitch in an internal system for a two-hour service disruption of its Google Compute Engine cloud infrastructure hosting service.
Google did not respond to a request for comment on Thursday’s service disruption.