Google Identifies DoubleClick Ad Server Problem, Fixes It
A DoubleClick internal service degraded, causing a cascading failure on ad servers for more than 55,000 sites and leading to a two-hour outage.
Google said Nov. 14 that it has figured out why its DoubleClick advertising service went wonky the morning of Nov. 12, hampering the ability of more than 55,000 Websites to display their content. The huge Web services provider, in an email to eWEEK and other users, said it has settled on a new strategy to prevent a recurrence.
The mechanism, which serves up millions of ads and targets them to specific sites, sustained a major outage that slowed ad delivery on, or blocked access to, 55,185 Websites, according to Dynatrace, an application performance management (APM) service provider.
Those sites included eWEEK.com and other QuinStreet sites, such as CIO Insight, Baseline, IT Business Edge, Datamation and Enterprise Networking Planet. Others affected were USA Today, the Wall Street Journal, Forbes, BBC.com and YouTube. The Guardian and the Enquirer news sites in the U.K. also were hit.
Google reported that the outage lasted about two hours, from 5:45 a.m. PST to 7:31 a.m. PST. In its email, the company gave this account of what went wrong and how it responded:
- "The DFP ad server relies on an internal service that began degrading in performance. This caused a cascading failure on DFP ad servers, leading to the outage."
- "We designed our systems to gracefully handle performance degradation from dependent services. However, due to a misconfiguration, we were unable to prevent the outage."
- "To restore ad serving and prevent cascading failures, we restarted the services by provisioning additional resources."
- "We reproduced the failure in a test by degrading the availability of the internal service, proving the misconfiguration caused the cascading failures. We have since rolled out a fix to the configuration globally."
- "We are conducting a complete review of all our processes and production configurations to prevent this from happening again."
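Google did not say how its systems are meant to handle a degraded dependency. One common way to get the kind of graceful handling it describes is a circuit breaker: after repeated failures, callers stop contacting the slow service for a while and return a fallback instead, so the slowdown does not pile up behind them. The following is a minimal sketch of that general pattern, not Google's implementation; the class name, thresholds and "house ad" fallback are all hypothetical.

```python
import time


class CircuitBreaker:
    """Hypothetical sketch: stop calling a failing dependency for a cool-down period."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds (assumed value)
        self.clock = clock                # injectable clock, handy for testing
        self.failures = 0
        self.opened_at = None             # None means closed: calls are allowed

    def call(self, func, fallback):
        # While the circuit is open, skip the dependency until the cool-down ends.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()
            self.opened_at = None  # cool-down over: allow calls again
            self.failures = 0
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open the circuit
            return fallback()
        self.failures = 0  # a success resets the failure count
        return result
```

In this sketch, a caller would wrap each request to the internal service, for example `breaker.call(fetch_targeted_ad, serve_house_ad)`, where both function names are placeholders. A misconfigured threshold or cool-down could leave such a safeguard ineffective, which is the broad kind of failure the statement attributes to a misconfiguration.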