Facebook users around the world had a singular question for much of March 13: Is Facebook down?
As it turns out, the global social media giant and its related Instagram and WhatsApp services were in fact unavailable for much of the day. Some service was restored by March 14, though availability across all Facebook services worldwide is still intermittent. With Facebook down, the company ironically had to resort to using rival social media service Twitter to keep many of its users informed.
"We’re aware that some people are currently having trouble accessing the Facebook family of apps," Facebook wrote in a Twitter message. "We’re working to resolve the issue as soon as possible."
Facebook also provided minimal updates via its platform status dashboard for developers, with the first indication of trouble reported at 10:32 a.m. PT on March 13.
"We are currently experiencing issues that may cause some API requests to take longer or fail unexpectedly," the status page reports. "We are investigating the issue and working on a resolution."
The Downdetector website, where individuals report outage issues, was also flooded with updates from users reporting that they were unable to access Facebook. Users also started to use the hashtag #facebookdown on Twitter to complain about the global outage.
So What's Behind the Outage?
Whenever there is an outage of any sort, the first speculation is usually that the downed site was the victim of some form of cyber-attack. While Facebook has not publicly identified the root cause of its outage as of the time of this article's publication, it did rule out at least one type of attack as the cause.
"We're focused on working to resolve the issue as soon as possible, but can confirm that the issue is not related to a DDoS attack," Facebook wrote in a Twitter message.
A distributed denial of service, or DDoS, attack is a type of cyber-attack where a victim website is overwhelmed by an unsupportable volume of traffic. The deluge of traffic effectively renders the site unavailable, triggering the denial of service condition.
The largest DDoS attacks on record have exceeded 1 Tbps of attack traffic, which can be enough to cripple many types of sites. Facebook isn't an average website or service, however, and operates a massive globally distributed network. A DDoS attack that could take down Facebook would likely require attack bandwidth that to date has never been seen.
Is It a BGP Issue?
Some of the early speculation about the outage involved the Border Gateway Protocol (BGP). BGP is the routing protocol used to direct traffic across the internet, and it has previously been implicated in a number of outages.
In an email sent to eWEEK from a PR firm representing Netscout, the firm claimed that an accidental BGP routing leak from a European ISP to a major transit ISP resulted in "perceptible disruption of access" for a period of time. BGP route leakage involving Facebook has in fact happened before. In 2011, eWEEK reported that some Facebook traffic was routed to China due to a BGP routing issue.
Network intelligence platform vendor ThousandEyes reported that it did not see any BGP changes that were affecting Facebook and its related properties. ThousandEyes added that Facebook has its own backbone network and doesn't entirely rely on the public internet.
"Today starting at 9:25 AM PST Facebook users around the globe have had issues reaching their services," ThousandEyes wrote in its analysis. "Our tests show this is very likely an app-layer issue, as we see 500 Internal Server Errors being reported from various locations around the globe. #FacebookDown"
With a globally distributed, complex multiservice architecture, Facebook by design is highly resilient and most days is also highly available. Given that Facebook is a public company and the outage will undoubtedly have some form of financial impact, the company will at some point need to publicly disclose what actually led to the outage. Until then, Facebook users and everyone else will likely just have to speculate about what actually went wrong.
Sean Michael Kerner is a senior editor at eWEEK and InternetNews.com. Follow him on Twitter @TechJournalist.