Intel Xeon Phi Coprocessors Challenge Nvidia, AMD in HPC Market

By Jeffrey Burt  |  Posted 2012-11-18

Organizations with high-performance computing (HPC) environments over the past few years have increasingly turned to GPU accelerators from the likes of Nvidia and Advanced Micro Devices to ramp up the performance of their supercomputers while keeping power consumption in check.

Now Intel is looking to muscle in on the trend, offering its new Xeon Phi coprocessors to work with traditional CPUs to boost the capabilities of the massive systems while giving organizations the benefits of working within the familiar Intel and x86 environments. 

These accelerators and coprocessors—which work with CPUs to help supercomputers run compute intensive and highly parallel workloads essentially by throwing large numbers of cores at them—took center stage at last week’s SC12 supercomputer show. Both Nvidia and AMD rolled out powerful new GPU accelerators, while Intel unveiled the first of its Xeon Phi coprocessors.

At the same time, the Titan supercomputer—a Cray XK7 system that uses both AMD Opteron server chips and Nvidia’s new K20 GPU accelerators—topped the Top500 list of the world’s fastest supercomputers. Debuting on that list—and coming in at number seven—was the Stampede supercomputer, which comprises Intel-based Dell servers and includes Xeon Phi coprocessors.

With the introduction of the Xeon Phi chips, the competition begins, and some analysts believe the Intel coprocessors could prove a threat to Nvidia’s strong—90 percent—share of the market.

“What this reflects is a fairly common argument between existing and new technologies,” Charles King, principal analyst at Pund-IT Research, told eWEEK in an email. “It really boils down to whether customers and partners (like ISVs and OEMs) will gain performance benefits from new solutions that justify the investments necessary to commercialize those new technologies. On the GPU side, NVIDIA is emphasizing … the performance of systems using its GPUs but not talking as much about the time/cost required to port existing applications to the platform, training programmers and programmers to gain the maximum advantage of the new systems, etc. Intel’s response is along the lines of, ‘What if you could have an alternative that delivers similar or better performance and will run existing apps and code natively?’ That’s a powerful argument for many players, especially those involved in the commercial HPC space.”

Patrick Wang, an analyst with Evercore Partners, in a report Nov. 15, wrote that “the launch of Xeon Phi marks the beginning of a new era and a new antagonist for [Nvidia].” As Intel ramps up the performance of the coprocessors, the problems for Nvidia will only increase, Wang wrote, saying that “the question is not IF but WHEN [Intel will] gain traction” in the market.

Accelerators like GPUs and—now—Intel’s coprocessors are growing in popularity as HPC organizations in industries such as energy, financial services, health care, science and digital content creation increasingly are turning to supercomputers to run their highly parallel workloads. At the same time, system energy efficiency is at a premium, with organizations looking to drive down the power and cooling costs in these increasingly dense, hyperscale data centers. Of the 500 supercomputers on the Top500 list released Nov. 12, 62 used GPU accelerators or coprocessors.

Both Nvidia and AMD over the past few years have been pushing their low-power, many-core graphics technologies as ideal accelerators, with Nvidia grabbing the bulk of the market. However, Intel officials, with their Xeon Phi coprocessors, are looking to make inroads. Already Intel Xeon chips power many supercomputers—they are in 76 percent of the Top500 list systems—and now the vendor is looking to leverage that presence to push its Xeon Phis.

Intel’s coprocessors have been eight years in the making, and are the first products out of the giant chip maker’s Many Integrated Core (MIC) program. At the SC12 show, Intel unveiled two versions with 60 or more cores (the Phi 5110P and 3100) that will come out next year, while noting that early customers, such as the Texas Advanced Computing Center (TACC), where Stampede is being built, are using custom models of Phi.

During a recent day-long workshop at TACC for journalists, Intel officials argued that the x86-based Xeon Phi coprocessors are a better alternative for HPC organizations than GPU accelerators. Most programmers already are familiar with the x86 architecture and its tools, from compilers and run environments to debuggers, libraries and workload schedulers, and most workloads already are optimized for x86, said James Reinders, director of parallel programming evangelism at Intel. In addition, the x86 coprocessors can run operating systems independent of the CPUs, according to Intel.

Workloads running on Xeon Phis have to undergo significantly less recoding than those running on GPU accelerators, Reinders said. Organizations running highly parallel workloads “don’t need multiple versions of processors for different architectures.”

The Stampede supercomputer, once fully operational, will have a performance of 10 petaflops (quadrillions of calculations per second), and TACC Director Jay Boisseau said that the Xeon Phi coprocessors will account for about 70 percent of the performance. During the workshop, Boisseau reiterated the benefits of having accelerator technology based on x86.

“X86 has been around a long time, and people are pretty familiar with the architecture,” he said, adding that while GPU accelerator technology is good, “programmability is a problem.”

The same day Intel announced the Xeon Phi chips, Nvidia and AMD both unveiled the latest generations of their respective GPU accelerators. Nvidia announced its 28-nanometer Tesla K20 and K20X GPUs, the first based on the Kepler architecture. The Cray-based Titan supercomputer boasts 560,640 processor cores, including 261,632 cores in Nvidia's K20X GPU accelerators. Titan offers a performance of 17.59 petaflops, of which the Nvidia accelerators account for about 90 percent.
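To put the two headline systems side by side, the performance shares quoted in this story can be turned into rough petaflops figures. The short Python sketch below does only that arithmetic; the names `SYSTEMS`, `accelerator_pflops` and `summarize` are illustrative, and the 70 percent and 90 percent shares are the approximate figures cited by TACC and for Titan, not measured breakdowns.

```python
# Figures quoted in the article. The shares are approximate ("about" 70 and
# 90 percent), so the results are rough estimates, not benchmark data.
SYSTEMS = {
    "Stampede": {"total_pflops": 10.0, "accelerator_share": 0.70},
    "Titan": {"total_pflops": 17.59, "accelerator_share": 0.90},
}


def accelerator_pflops(total_pflops, accelerator_share):
    """Return the portion of total performance attributed to accelerators."""
    return total_pflops * accelerator_share


def summarize(systems):
    """Map each system name to (accelerator petaflops, CPU petaflops)."""
    rows = {}
    for name, spec in systems.items():
        total = spec["total_pflops"]
        accel = accelerator_pflops(total, spec["accelerator_share"])
        rows[name] = (round(accel, 2), round(total - accel, 2))
    return rows


if __name__ == "__main__":
    for name, (accel, cpu) in summarize(SYSTEMS).items():
        print(f"{name}: ~{accel} petaflops from accelerators, ~{cpu} from CPUs")
    # Share of the Nov. 12 Top500 list using GPU accelerators or coprocessors.
    print(f"Top500 systems with accelerators: {62 / 500:.1%}")
```

On these numbers, the accelerators or coprocessors supply the large majority of each machine's performance, which is why the programming model for them is the center of the vendors' dispute.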

AMD unveiled its FirePro S10000 GPU accelerator on Nov. 12, with officials noting that it is based on the company’s Graphics Core Next architecture, which enables the GPU to simultaneously boost power for both compute and graphics workloads.

Officials with both Nvidia and AMD dismissed Intel’s Xeon Phi technology. Sumit Gupta, general manager of Nvidia's Tesla Accelerated Computing unit, said Intel is several years behind in accelerator technology, and noted that the Xeon Phi coprocessors barely beat Nvidia’s previous Fermi GPUs in performance and power consumption.

“Their new product is in the same ballpark as our three-year-old product in efficiency, so they’re very behind schedule in energy efficiency,” Gupta told eWEEK.

He also pushed back at the idea that being based on the x86 architecture gives Intel any advantages. While the coprocessors may be able to run the same languages and tools that traditional Xeon CPUs can, Gupta pointed to Nvidia’s CUDA programming language, which works in C/C++ or Fortran and supports OpenACC tools. Overall, 395 million CUDA GPUs have shipped, and CUDA has been downloaded 1.5 million times, he said. In addition, it is being taught in 62 countries, and systems from the likes of Cray, Hewlett-Packard, IBM, SGI and Asus are becoming available with the K20 and K20X GPUs.

John Gustafson, CTO of AMD’s graphics business, echoed what Gupta said, noting that Intel is “behind the wave with what we do.” AMD, which relies primarily on the OpenCL programming language, has been offering GPU accelerators for more than three years, giving the company a head start in the field. In addition, the idea that the x86 architecture gives Intel an advantage over GPU accelerators also doesn’t make sense, Gustafson told eWEEK. No matter whether they use the GPUs or Intel coprocessors, programmers still have to recompile their software, he said.

“We all have to change our codes,” Gustafson said.

AMD’s Opteron server chips are based on the x86 architecture, but the company also ensures that organizations that want GPU capabilities can find them with AMD. (AMD is expanding that idea into its server chip business, announcing late last month that it will start making ARM-based server chips in 2014.)

“What AMD does is offer the right tool for the right job,” Gustafson said, pushing back at Intel’s insistence that x86 is the right architecture for any job. “When you’re a hammer, everything looks like a nail.”

Cray has used AMD Opteron chips and Nvidia GPU accelerators for years, but earlier this month announced that the first supercomputers in its next-generation XC family of systems will run on Intel’s Xeon processors and will leverage both the Xeon Phis and Nvidia’s Tesla GPUs. Barry Bolding, Cray's vice president of corporate marketing, told eWEEK that over the years, it’s been proven that GPUs can make applications very fast. The Xeon Phi coprocessors look promising, but the question now is, “can you get the same performance out of them that you can with GPUs? We believe you can.”

Pund-IT analyst King agreed, and said Intel’s Xeon Phi technology could prove to be a tough competitor for GPUs if the performance is good.

“Right now, GPUs have captured a good deal of attention in the research/university HPC space,” King said in an email to eWEEK. “That’s great from a mindshare perspective, but it’s a long road to commercial validation, let alone success. I believe a lot of what comes next depends on how Intel’s coprocessors measure up in overall price/performance to GPUs. If they beat GPUs or even come close, commercial HPC vendors are likely to stick with x86. Cool technologies aside, at the end of the day commercial vendors are trying to make a living. The most successful vendors are those who understand that point and do all they can to support it.”
