For many years, the scientific and tech communities have been talking about the much-anticipated evolution to exascale computing. Now governments and vendors have spent billions of dollars developing the hardware and software innovations that will make up the infrastructure of exascale systems, which promise to be 50 times faster than some of the most powerful supercomputers currently in operation.
Increasingly complex applications such as big data analytics and machine learning, along with slowing processor performance improvements under Moore’s Law, have put a premium on developing new computer architectures that can handle workloads far beyond what most systems can offer today.
The rise of workloads like data analytics and machine learning has added momentum to the effort to develop exascale computing, which is defined as computer systems capable of performing at least one exaFLOPS, equal to a billion billion calculations per second. The first petascale computer was introduced in 2008, and exascale computing would provide a thousand-fold increase over that decade-old computer architecture.
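For a rough sense of what those definitions mean in numbers, here is a minimal back-of-the-envelope sketch in Python. It simply restates the figures above as arithmetic and is not tied to any particular system:

    # Exascale vs. petascale, using only the definitions in the text (illustrative).
    PETAFLOPS = 1e15   # one petaFLOPS: a million billion calculations per second
    EXAFLOPS = 1e18    # one exaFLOPS: "a billion billion" calculations per second

    print(f"One exaFLOPS = {EXAFLOPS:.0e} calculations per second")
    print(f"Exascale / petascale = {EXAFLOPS / PETAFLOPS:.0f}x")  # the thousand-fold jump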
Now, after years of discussion, hype and planning, exascale-capable systems are on the horizon. The Chinese government has rapidly increased its investment in exascale computing, with three projects currently underway. One of those projects involves the development of a prototype called the Tianhe-3, which is scheduled to be ready by next year.
The United States, driven by the year-old Exascale Computing Project (ECP), has plans underway to unveil the first of its exascale-capable systems in 2021, followed by two more systems two years later. Meanwhile, governments and vendors in the European Union and Japan also have exascale efforts in the works.
While it’s still unclear exactly what these systems will look like or what new technologies they’ll implement, they promise to address the rapid changes underway in high-performance computing (HPC) and enterprise computing.
They will enable more detailed and complex simulations of everything from oil and gas exploration and climate change to drug development and genomics research, and they will make it possible to better store, process and analyze the massive amounts of data being generated by the internet of things (IoT) and the proliferation of mobile devices. Exascale computing is also sure to bring new advances in the fields of machine learning and artificial intelligence (AI).
The drive to exascale is also fueling a high-stakes international competition, particularly between the United States and China. The country that leads the exascale race will have an advantage in everything from military and scientific research and development to business innovation.
Right now the United States continues to hold an edge in technology innovation, but China is putting a lot of money and manpower into growing its in-house capabilities to drive exascale development. At the same time, there is concern among some in the U.S. HPC community about the impact of possible budget cuts to the Department of Energy (DoE) under the Trump administration. The DoE is the primary sponsor of exascale development in the United States.
However the competition plays out and whatever the systems end up looking like, there’s little question that a shift to exascale computing is needed. The world has lived through a golden age of computing architecture driven in large part by the relentless march of Moore’s Law.
Computers have become more powerful and the cost of computing has declined sharply over the past several decades. The industry has gone from the rooms full of vacuum-tube computers of the 1950s, with the processing power of a hand-held calculator, to enormous supercomputers stuffed with thousands of speedy multi-core CPUs, GPUs and other accelerators assembled on compact system-on-a-chip components.
However, Moore’s Law is starting to nudge up against the physical limits of chip technology, while the engineering challenges and costs of continually putting more transistors into ever-smaller spaces continue to mount.
Meanwhile, computing workloads are becoming more complex and the amount of data generated each year is growing. There also is a need to bring computing out to the network edge and closer to the systems and devices generating the data. New computing architectures and new ways of thinking about those architectures are required.
“We’re standing on the shoulders of giants and we see where we need to go, and it’s a long way,” Paul Teich, an analyst with Tirias Research, told eWEEK.
According to the most recent Top500 list of the world’s fastest supercomputers, released in November 2016, the number-one system was China’s Sunway TaihuLight, at 93 petaflops. The performance of the upcoming exascale systems promises to dwarf that of TaihuLight.
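Using the article’s own figures, a quick, hypothetical calculation suggests how large that gap would be for a one-exaFLOPS machine:

    # Rough comparison with the November 2016 Top500 leader (illustrative only).
    TAIHULIGHT_FLOPS = 93e15  # Sunway TaihuLight: 93 petaflops
    EXASCALE_FLOPS = 1e18     # one exaFLOPS, the exascale threshold

    print(f"Speedup over TaihuLight: ~{EXASCALE_FLOPS / TAIHULIGHT_FLOPS:.1f}x")  # ~10.8x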
Such performance is going to be crucial in the near future, according to Teich. The simulations researchers are running, from mapping the human genome to studying the global impact of climate change, are becoming increasingly complex.
Scientists and enterprises want to make greater use of big data analytics, which involves processing petabytes of data to derive useful information in near real-time. This effort will increasingly involve deep learning and artificial intelligence. Teich cited the example of getting traffic to move efficiently through a smart, connected city in ways that account for toll roads, intersections, traffic lights, pedestrian crossings, road closures, traffic jams and the like.
“Managing the flow of traffic and people is huge,” he said, noting that engineers and researchers have a challenge in figuring out how to put these exascale systems together. “The problem for us is on the design side. Our products and infrastructure are getting more complex. We need to be able to model more complex infrastructure.”
Designing exascale systems is an exercise in architectural balance. According to officials with the ECP, not only must computer scientists consider the hardware architecture and specifications of next-generation supercomputers, but they also need to look at what will be needed in the software stacks that will drive them.
They need to look at the applications that will run on top of the supercomputers to ensure that they can be productively used by businesses, academic institutions and research centers. Engineers and architects are looking not only at the processors that will power the systems, but also the memory and storage technologies that are required to efficiently manage the enormous input and output of data.
Exascale systems will consist of large numbers of compute nodes, will be highly parallel and will have to be highly energy efficient, fitting into a power envelope of 20 to 30 megawatts.
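Taken together, those figures imply an aggressive energy-efficiency target. A brief, illustrative calculation, based only on the numbers above, makes the point:

    # Efficiency implied by the article's figures: one exaFLOPS in 20 to 30 MW (illustrative).
    EXAFLOPS = 1e18

    for megawatts in (20, 30):
        gflops_per_watt = EXAFLOPS / (megawatts * 1e6) / 1e9
        print(f"{megawatts} MW envelope -> ~{gflops_per_watt:.0f} GFLOPS per watt")
    # Prints roughly 50 and 33 GFLOPS per watt, which is why energy efficiency,
    # not just raw component speed, dominates exascale design.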
Teich said there is a broad range of technologies that may play a role in future exascale systems, though many—such as optical computing and optical interconnects, graphene as a possible replacement for silicon, and quantum computing—aren’t mature enough for practical application. Some may not be ready for a decade or more.
Some vendors also are working on their own high-performance systems, most notably Hewlett Packard Enterprise (HPE) with The Machine, a system announced three years ago that will include such emerging technologies as silicon photonics, custom processors, a new operating system and advanced memristor memory technology.
The Exascale Computing Project is working with a number of high-profile technology companies—including Intel, IBM and Nvidia—in the development of the exascale systems and will announce that other companies are joining the effort in the near future.
The first system, scheduled for 2021, will be based on what ECP officials are calling an “advanced architecture,” though it’s unclear what that will entail. The follow-up systems are set to be delivered in 2022 and deployed in 2023.
The drive to exascale computing is an international endeavor that has accelerated what already has been an active competition among the world’s leading industrialized nations. Japanese tech giant Fujitsu is developing an exascale system called the Post-K supercomputer that will be based on the ARM architecture rather than the company’s own SPARC64 fx processors.
Seven European Union countries—Germany, France, Italy, the Netherlands, Luxembourg, Spain and Portugal—in April announced the creation of EuroHPC, a program designed to drive the development of next-generation supercomputers.
However, the most closely watched competition is between the United States and China. For decades the U.S. has been the world’s technology leader, which has been a key factor in the country’s economic and military success.
But the next few years will determine whether the center of technological power will shift to the eastern hemisphere. Leading in the exascale world is not only a point of national pride, but also important to a country’s global standing in the coming decades. The reason the U.S. “has done so well in the post-World War II world is [that] we have been the pioneers of the microprocessor technology,” Teich said.
ECP officials have said that the United States continues to lead the world in computing technology, but that position is not assured going forward since China is aggressively developing its exascale capabilities.
In 2011, then-President Barack Obama asked Congress to spend $126 million on exascale projects, and in 2015 he signed an executive order creating the National Strategic Computing Initiative to coordinate the efforts of the federal government and the public and private sectors to create a long-term strategy for the development of advanced HPC systems.
However, there is worry among some in the HPC community that the exascale efforts will be hindered by deep cuts proposed by the Trump administration to the DoE budget, though there’s no guarantee that Congress would agree to such cuts.
ECP officials acknowledge that China not only is spending a lot of money on exascale projects but also holds other advantages over the United States. For example, the United States is working with vendors that need to build systems that can be commercially successful and used by a broad array of organizations.
Unlike those in China, these U.S. systems also need to be able to run both new and legacy applications. In addition, they need to include components that can be used in other commercial systems, from other HPC machines to consumer devices.
While the first U.S. exascale system, due in 2021, will be based on an advanced architecture, the hope is that succeeding systems will not need such a radical approach, making them more commercially viable. In addition, the software stacks need to be usable by a wide range of developers, rather than just a small number of programmers with narrow skill sets.