Appro Combines Intel, Nvidia Chips in HPC Cluster

Appro's HyperPower Cluster pulls together Intel's Xeon 5500 "Nehalem EP" chips and Nvidia's Tesla GPUs into a single solution aimed at improving the performance of HPC environments while lowering costs. Nvidia's efforts to bring its GPU technologies into the mainstream come as rival chip makers AMD and Intel also are working on ways to merge their own CPU and GPU technologies.

Appro is rolling out a high-performance computing cluster that combines Intel's Nehalem server chip and Nvidia's Tesla graphics processor.

Appro's HyperPower Cluster, announced May 18, is the latest move by systems makers to pull together CPUs and GPUs to offer improved computing performance at lower costs and with greater energy efficiency.

"There's always been a need in HPC to find creative ways to run code faster, and there's always been an interest in specialized CPUs," John Lee, vice president of advanced technology solutions for Appro, said in an interview.

However, specialized chips haven't caught hold because of the difficulty in scaling a specialized chip industry to meet demand, Lee said. Appro officials have been looking at graphics processing units for several years, but it wasn't until now that the idea made sense, he said.

He gave much of the credit to Nvidia, which has been aggressive in pushing its GPUs into the mainstream computing space.

"Nvidia is taking such a leadership role," Lee said. "They're driving demand that way."

Others also are moving in that direction. Advanced Micro Devices, which bought GPU maker ATI for $5.4 billion in 2006, announced May 6 that it was merging its processor and graphics businesses, bringing the ATI unit fully into the AMD fold.

During AMD's annual stockholders meeting a day later, President and CEO Dirk Meyer said that combining the company's CPU and GPU businesses was a key differentiator for the company going forward.

"Only two companies in the world can develop and deliver in volume leading-edge x86 processor solutions," Meyer said during his talk. "Only two companies in the world can develop leading-edge graphics, and only one company, and that is AMD, has the ability to do both."

In addition, Intel is working on offering integrated graphics in its upcoming CPUs, and is working on its own GP-GPU chip, codenamed "Larrabee."

However, both AMD and Intel have a way to go before they catch up with Nvidia in the GPU space.

"[Nvidia's] GP-GPU is definitely ahead of AMD Fusion [initiative for bringing together its CPU and GPU capabilities], and Intel's Larrabee won't even come out for years," Lee said.

Appro's Lee said that for businesses willing to do the necessary coding to make their workloads run on GPUs, the cost savings over CPU-only platforms could be significant.

A key difference between CPUs and GPUs is the number of cores on a piece of silicon, he said. While x86 compute chips can hold up to four cores, with promises of six, eight and 12 down the road, a GP-GPU (general-purpose GPU) can have 800 or more cores, Lee said.

For workloads to take advantage of such numbers, they need to be broken up into many pieces, with those pieces distributed among the cores. So while a single GPU core may not run as fast as a CPU core, there are so many more cores that the overall workload can be completed more quickly.
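The decomposition described above can be illustrated with a minimal Python sketch. It is not GPU code and not Appro's or Nvidia's method: the function names (`split`, `sum_of_squares`, `parallel_sum_of_squares`) and the worker count are hypothetical, and Python threads are used only to show the split-distribute-combine pattern, not to demonstrate a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, pieces):
    """Cut data into `pieces` contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), pieces)
    chunks, start = [], 0
    for i in range(pieces):
        # The first `extra` chunks take one additional element each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """Work done independently on one piece of the workload."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=8):
    """Distribute the chunks among workers, then combine partial results."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


data = list(range(1000))
total = parallel_sum_of_squares(data)
```

The pattern only pays off when the pieces are independent of one another, which is why not every workload is a candidate for this kind of porting.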

Businesses can see processing performance improve by a factor of 10 or more over CPU-only environments, Lee said, which is important to companies being asked to do some legwork up front.

"There has to be an ROI story to this," he said. "The return needs to be worthwhile [to the business]. Ten times gets to that point. ... That's the part that Nvidia is working hard on."

Appro's HyperPower Cluster can run thousands of concurrent threads for parallel, throughput-oriented problems that demand heavy mathematical computation. The cluster includes Appro's high-density servers paired with an equal number of Nvidia Tesla S1070 servers.

The cluster includes interconnect switches for node-to-node communication, a master node and clustering software in a 42U rack configuration.

It supports up to 304 CPU cores and 18,240 GPU cores.
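Those totals can be sanity-checked with a little arithmetic. The short Python sketch below assumes Nvidia's published Tesla S1070 configuration of four GPUs with 240 cores each; the resulting unit count and the CPU cores per Appro server are inferred from the totals and from the one-to-one pairing of servers, not stated by Appro.

```python
# Assumption: each Tesla S1070 holds 4 GPUs with 240 cores apiece.
GPUS_PER_S1070 = 4
CORES_PER_GPU = 240

# Totals quoted for a full rack.
TOTAL_GPU_CORES = 18_240
TOTAL_CPU_CORES = 304

cores_per_s1070 = GPUS_PER_S1070 * CORES_PER_GPU

# How many S1070 units the GPU core total implies.
s1070_units, remainder = divmod(TOTAL_GPU_CORES, cores_per_s1070)

# With an equal number of Appro servers, the implied CPU cores per server.
cpu_cores_per_server = TOTAL_CPU_CORES // s1070_units
```

Under those assumptions the figures divide evenly, which suggests a fully populated rack pairs each S1070 with one multi-socket Appro server.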

IT managers also can use the Nvidia CUDA toolkit, which lets developers write code that takes advantage of the GPUs' massively parallel architecture. Customers also get a choice of configurations and open-source cluster management software.