Over the past 10 years, Google has been building out its own homegrown networking infrastructure, which it needs to move massive workloads among the hundreds of thousands of servers running in its data centers.
The tech industry knew that Google was developing its own networking technology, and that work helped fuel the software-defined networking (SDN) movement that is quickly reshaping the networking market today. However, for the most part, Google engineers kept secret the details of what they were building, seeing their networks as a competitive advantage in a hyperscale industry that includes the likes of Microsoft, Facebook and Amazon.
However, Google officials have recently begun to pull back the curtain on their networking technology. Amin Vahdat, a Google Fellow and technical lead for networking at the search giant, wrote a post on the company blog in June outlining Jupiter, the fifth generation of Google’s data center networking infrastructure. Earlier this week, Vahdat wrote another blog post giving more information about the technology and laid out the technical details in a paper presented at the ACM SIGCOMM 2015 conference in London.
Google presented three other papers at the event that dealt with bandwidth management and data center network topologies.
The papers and Google’s moves to be more open about its networking efforts come as SDN and its cousin, network-functions virtualization (NFV), continue to alter how organizations and vendors think about building and deploying networks. They also come as Google builds out its own cloud platform and wants third-party developers to build services for it.
“We are excited about being increasingly open about results of our research work: to solicit feedback on our approach, to influence future research and development directions so that we can benefit from community-wide efforts to improve networking, and to attract the next-generation of great networking thinkers and builders to Google,” Vahdat wrote in his blog post.
The company’s work on the Google Cloud Platform “further increases the importance of being open about our infrastructure,” he added. “Since the same network powering Google infrastructure for a decade is also the underpinnings of our Cloud Platform, all developers can leverage the network to build highly robust, manageable and globally scalable services.”
What they’ll have access to is a network that is significantly faster than it was 10 years ago. According to Vahdat, Jupiter has 100 times more capacity than the first generation, providing more than 1 petabit per second of total bisection bandwidth. That translates to 100,000 servers being able to communicate at 10 Gbps in an arbitrary pattern, he said.
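As a rough, back-of-the-envelope illustration (the numbers below simply restate Vahdat's figures, treating the 1 petabit per second as aggregate capacity shared across all servers), the arithmetic works out like this:

```python
# Back-of-the-envelope check of the figures quoted above (illustrative only):
# 100,000 servers, each communicating at 10 Gbps, against the roughly 1 Pb/s
# of total bandwidth Vahdat cites for Jupiter.
servers = 100_000
per_server_gbps = 10

total_gbps = servers * per_server_gbps   # 1,000,000 Gbps
total_pbps = total_gbps / 1_000_000      # 1 Pb/s, since 1 Pb = 10^6 Gb

print(f"{servers:,} servers x {per_server_gbps} Gbps = {total_pbps:.0f} Pb/s")
```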
This capacity has been a boon for Google engineers. It’s freed them from having to optimize their code for disparate levels of bandwidth, enabled applications to scale beyond what they otherwise could have and increased the efficiency of Google’s compute and storage infrastructure, Vahdat wrote.
“Scheduling a set of jobs over a single larger domain supports much higher utilization than scheduling the same jobs over multiple smaller domains,” he wrote.
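A toy bin-packing simulation (hypothetical, not Google's scheduler) makes that intuition concrete: the same jobs, placed first-fit into one large scheduling domain versus the same total capacity carved into many small domains, leave less capacity stranded in the single large domain.

```python
import random

# Toy illustration (not Google's scheduler): pack jobs with random resource
# demands into either one large scheduling domain or many smaller ones of the
# same total capacity, using first-fit, and compare achieved utilization.

random.seed(0)
jobs = [random.randint(30, 80) for _ in range(100)]   # hypothetical job sizes

def utilization(jobs, num_domains, capacity_each):
    used = [0] * num_domains
    for job in jobs:
        for i in range(num_domains):                   # first-fit placement
            if used[i] + job <= capacity_each:
                used[i] += job
                break                                  # job placed; next job
    return sum(used) / (num_domains * capacity_each)

# Same total capacity (4,000 units), carved up differently.
print("1 domain  x 4000:", round(utilization(jobs, 1, 4000), 3))
print("40 domains x 100:", round(utilization(jobs, 40, 100), 3))
```

On this toy workload the single large domain typically fills to within a few percent of capacity, while the fragmented domains strand noticeably more of it to fragmentation.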
The 10 years of innovation at Google helped lead to the current push behind SDN, in which networking software runs on commodity hardware rather than on expensive, proprietary switches, Vahdat wrote. Company engineers found that inexpensive switches built on merchant silicon, arranged in a Clos topology, could scale to meet the bandwidth requirements of their data centers. They then developed a single network-management configuration that was pushed out to each switch, enabling multiple switches to be managed as if they were a single, larger switch.
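As a rough sketch of why that approach scales (a generic two-tier leaf-spine model with assumed 10 Gbps links, not Jupiter's actual multi-stage design), the standard Clos port arithmetic looks like this:

```python
# Simplified Clos arithmetic (a generic two-tier leaf-spine model, not
# Jupiter's actual design): identical merchant-silicon switches of a given
# port count ("radix") are wired so half of each leaf's ports face servers
# and half face spines, keeping the fabric non-blocking.

def clos_capacity(radix: int, link_gbps: int = 10):
    leaf_down = radix // 2              # leaf ports facing servers
    leaf_up = radix - leaf_down         # leaf ports facing spines (1:1 ratio)
    spines = leaf_up                    # every leaf has one uplink to each spine
    leaves = radix                      # each spine has 'radix' ports, one per leaf
    servers = leaves * leaf_down
    bisection_gbps = servers * link_gbps / 2
    return servers, spines, bisection_gbps

for radix in (32, 64, 128):
    servers, spines, bisection = clos_capacity(radix)
    print(f"radix {radix:>3}: {servers:>5} servers, {spines:>2} spines, "
          f"{bisection / 1000:.1f} Tb/s bisection")
```

Getting from there to Jupiter-scale numbers is a matter of adding more stages of the same commodity building blocks, which is precisely the property that makes the Clos approach attractive compared with ever-larger proprietary chassis.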
“Finally, we transformed routing from a pair-wise, fully distributed (but somewhat slow and high overhead) scheme to a logically-centralized scheme under the control of a single dynamically-elected master,” he wrote.
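The sketch below is a minimal, hypothetical illustration of that idea, with made-up names and a toy four-switch fabric rather than anything from Google's implementation: switches report their links to a single elected master, which runs the path computation centrally and pushes a next-hop table back to each switch.

```python
import heapq

# Minimal sketch of logically centralized routing (hypothetical, not Google's
# implementation): a dynamically elected master computes next-hop tables for
# every switch from a global view of the fabric and pushes them out.

def elect_master(switch_ids):
    # Stand-in for a real election protocol: lowest switch ID wins.
    return min(switch_ids)

def next_hops(links, source):
    # Dijkstra over the fabric graph, remembering the first hop on each path.
    dist = {source: 0}
    table = {}
    pq = [(0, source, None)]                      # (cost, node, first hop)
    while pq:
        cost, node, first = heapq.heappop(pq)
        if cost > dist.get(node, float("inf")):
            continue                              # stale queue entry
        if first is not None:
            table[node] = first
        for neighbor, weight in links[node]:
            new_cost = cost + weight
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(pq, (new_cost, neighbor,
                                    neighbor if first is None else first))
    return table

# Toy fabric: adjacency list of (neighbor, link cost).
links = {
    "s1": [("s2", 1), ("s3", 1)],
    "s2": [("s1", 1), ("s4", 1)],
    "s3": [("s1", 1), ("s4", 1)],
    "s4": [("s2", 1), ("s3", 1)],
}

master = elect_master(links)
print(f"master: {master}")
for switch in links:                              # master pushes one table per switch
    print(f"{switch} next hops: {next_hops(links, switch)}")
```

The point of the sketch is the structure rather than the algorithm: routing decisions come from one place with a global view of the fabric, instead of emerging from the slower, higher-overhead pair-wise exchanges Vahdat describes replacing.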