Yahoo, Tata Deal Empowers Hadoop Developers
The search giant teams with the Indian company to collaborate on cloud computing research.
Yahoo and Computational Research Laboratories (CRL), a subsidiary of India-based
Tata Sons, are jointly supporting cloud computing research around the Apache
Hadoop open-source distributed computing project.
As part of the agreement, announced March 24, CRL
will make available to researchers one of the world's top five supercomputers,
which has substantially more processors than any supercomputer currently
available for cloud computing research.
Company officials said the deal is a first in terms of the size and scale of
the machine, and the first to make a supercomputer available to academic
institutions in India.
The Yahoo-CRL deal is aimed at leveraging CRL's
expertise in high-performance computing and Yahoo's technical leadership in
Apache Hadoop to enable scientists to perform data-intensive computing research
on a 14,400-processor supercomputer.
CRL's supercomputer, known as the EKA,
has 14,400 processors, 28 terabytes of memory, 140TB of disk space, a peak
performance of 180 teraflops (or 180 trillion calculations per second) and
sustained computation capacity of 120 teraflops for the LINPACK benchmark. EKA
is expected to run the latest version of Hadoop and other Yahoo open-source
distributed computing software, such as the Pig parallel programming language
developed by Yahoo Research.
The announcement from Yahoo and CRL came on the eve of the first-ever
Hadoop Summit, scheduled for March 25 at Yahoo’s Santa Clara, Calif., facility.
"We have made our leadership in supporting academic, cloud computing
research very concrete by sharing a 4,000-processor supercomputer with computer
scientists at Carnegie Mellon University for the last three
months," said Ron Brachman, vice president and head of academic relations
for Yahoo. "With this supercomputing cluster, researchers were able to
analyze hundreds of millions of Web documents and handle two orders of
magnitude more data than they previously could."