Anyone wondering just how much network capacity it takes for Google to deliver all of its myriad services might want to consider this: The company’s current-generation data center networks deliver enough bandwidth to read the entire scanned contents of the Library of Congress in less than one-tenth of one second.
Amin Vahdat, a Google Fellow and technical lead for networking, provided a rare glimpse inside the company’s data center networks in a blog post coinciding with the Open Networking Summit earlier this week.
According to Vahdat, Google’s current data center network fabric, dubbed Jupiter, can deliver more than 1 petabit per second of total bisection bandwidth, or enough capacity for 100,000 servers to exchange information at 10 gigabits per second each.
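The arithmetic behind both figures is straightforward, and the Python sketch below works it through. The roughly 10-terabyte size used for the Library of Congress’s digitized collection is a commonly cited rough estimate assumed here for illustration; it is not a number from Google or Vahdat.

```python
# Back-of-the-envelope check of the figures cited above.
# The ~10 TB size for the Library of Congress's digitized collection is a
# commonly cited rough estimate, assumed here for illustration only.

servers = 100_000
per_server_gbps = 10                              # 10 Gb/s per server

bisection_gbps = servers * per_server_gbps        # 1,000,000 Gb/s
print(f"Bisection bandwidth: {bisection_gbps / 1e6:.1f} Pb/s")   # 1.0 Pb/s

bytes_per_second = bisection_gbps * 1e9 / 8       # bits/s -> bytes/s (125 TB/s)
library_of_congress_bytes = 10e12                 # assumed ~10 TB
seconds = library_of_congress_bytes / bytes_per_second
print(f"Time to move ~10 TB at full bisection: {seconds:.3f} s")  # ~0.080 s
```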
Jupiter is the result of about 10 years of effort building in-house network hardware and software for linking Google’s data centers and powering the company’s distributed computing and storage systems, Vahdat said. “When Google was getting started, no one made a data center network that could meet our distributed computing requirements,” he explained. From the beginning, the company realized that computing infrastructure such as Google File System, MapReduce, Bigtable and Borg required networking capabilities that were not available from outside vendors, he said.
Over the years, Google has gone through five generations of in-house network technology. Between Firehose, Google’s first in-house data center network, and the current-generation Jupiter network, the company has increased capacity more than 100-fold, Vahdat claimed.
Google’s networks are based on three key principles. First, all of them use a Clos topology, a configuration in which a collection of relatively small, inexpensive switches is arranged to provide the functionality of a much larger logical switch, he said.
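To make the Clos idea concrete, the sketch below builds a generic two-tier leaf-spine fabric out of identical small switches and reports the size of the resulting logical switch. The 32-port radix and 40 Gb/s link speed are illustrative assumptions, not Jupiter’s actual building blocks, and Jupiter itself uses more switching stages than this simplified example.

```python
# Minimal sketch of a two-tier folded-Clos (leaf-spine) fabric built from
# identical small switches. The 32-port radix and 40 Gb/s links are
# illustrative assumptions, not Jupiter's actual building blocks; Jupiter
# itself uses more switching stages than this simplified example.

RADIX = 32        # ports per small switch (assumed)
LINK_GBPS = 40    # speed of each port in Gb/s (assumed)

# In a non-blocking leaf-spine fabric each leaf devotes half its ports to
# servers and half to uplinks, with one uplink to every spine switch.
uplinks_per_leaf = RADIX // 2          # 16 uplinks -> 16 spine switches
server_ports_per_leaf = RADIX // 2     # 16 server-facing ports per leaf
max_leaves = RADIX                     # each spine port reaches one leaf -> 32 leaves

server_ports = max_leaves * server_ports_per_leaf    # 32 * 16 = 512 ports
switch_count = max_leaves + uplinks_per_leaf          # 32 leaves + 16 spines

# With no oversubscription, any half of the servers can talk to the other
# half at full line rate: that is the fabric's bisection bandwidth.
bisection_gbps = (server_ports // 2) * LINK_GBPS

print(f"{switch_count} x {RADIX}-port switches act as one {server_ports}-port "
      f"logical switch with {bisection_gbps / 1000:.2f} Tb/s bisection bandwidth")
```

Run as written, the sketch reports 48 small 32-port switches acting as a single 512-port logical switch with about 10 Tb/s of bisection bandwidth; scaling the same pattern to more stages and higher port speeds is what pushes fabrics like Jupiter into petabit territory.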
Second, the company uses a centralized software control stack to manage the thousands of switches within its data centers, making them effectively act as one large fabric. Third, Google builds its own software and hardware using components from vendors, relying less on standard Internet protocols and more on custom protocols tailored specifically to its data centers.
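The second principle, logically centralized control, can be illustrated with a toy example: a single controller that knows the entire fabric topology computes forwarding tables for every switch at once, rather than each switch discovering routes on its own. The miniature topology and breadth-first route computation below are conceptual stand-ins, not Google’s control stack or its custom protocols.

```python
# Toy illustration of logically centralized control: one controller holds the
# whole fabric topology, computes routes for every switch, and would push the
# resulting forwarding tables down to the hardware. Conceptual sketch only;
# it does not reflect Google's control stack or protocols.

from collections import deque

# Assumed miniature fabric: two leaf switches, two spine switches.
topology = {
    "leaf1":  ["spine1", "spine2"],
    "leaf2":  ["spine1", "spine2"],
    "spine1": ["leaf1", "leaf2"],
    "spine2": ["leaf1", "leaf2"],
}

def first_hop(src, dst):
    """Breadth-first search from src; return the neighbor to forward to for dst."""
    queue = deque((neighbor, neighbor) for neighbor in topology[src])
    seen = {src} | set(topology[src])
    while queue:
        node, hop = queue.popleft()
        if node == dst:
            return hop
        for neighbor in topology[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hop))
    return None

# The controller computes a forwarding table for every switch in one place,
# so the collection of switches behaves like a single logical device.
forwarding_tables = {
    switch: {dst: first_hop(switch, dst) for dst in topology if dst != switch}
    for switch in topology
}

print(forwarding_tables["leaf1"])
# {'leaf2': 'spine1', 'spine1': 'spine1', 'spine2': 'spine2'}
```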
“Taken together, our network control stack has more in common with Google’s distributed computing architectures than traditional router-centric Internet protocols,” Vahdat said. “Some might even say that we’ve been deploying and enjoying the benefits of software-defined networking at Google for a decade.” The architectures for Google’s massive B4 wide area network and its Andromeda network virtualization stack are based on Google’s early work with data center networking, he said.
Google’s approach to networking has changed how the company organizes its data, control and management planes, according to Vahdat. The approach has not been without its challenges, however, and Google has had to deploy and redeploy multiple generations of its network technology before getting it right.
Google’s data center networks are highly modular and constantly upgraded to meet the “insatiable bandwidth demands” of the company’s latest-generation servers, Vahdat said. They are managed for near-continuous availability and reliability in order to meet internal performance demands.
“Building great data center networks is not just about building great hardware and software. It’s about partnering with the world’s best network engineering and operations team from day one,” he noted.