Cloudera, a leading provider of Hadoop-based data management
software and services, has announced the third version of its Cloudera
Distribution for Hadoop (CDH).
Cloudera announced the new version at the Hadoop Summit
in Santa Clara, Calif., on June 29. The company bills Cloudera’s
Distribution for Hadoop version 3 as the most comprehensive Hadoop-based
data management platform on the market.
In an interview with eWEEK, Mike Olson, CEO of Cloudera, said
Cloudera’s Distribution for Hadoop v3 consists of core Apache Hadoop
and eight additional open source projects, all tested and integrated
into a platform that is easy to install and use. Cloudera’s
Distribution lowers the bar for Hadoop adoption and usage in the
enterprise, he said.
“Cloudera has gained deep experience in the market working with
customers to deploy Hadoop in their organizations and has learned how
to use Hadoop effectively,” said Doug Cutting, creator of Apache Hadoop
and architect at Cloudera, in a statement. “CDH v3 is our response. It
includes the most appropriate enterprise-grade add-on projects that
enhance the core Apache Hadoop framework and make it easier for any
organization to use.”
“The Cloudera Distribution for Hadoop is quickly gaining momentum
because it provides a stable foundation for enterprises to collect,
store and analyze large amounts of data,” said Tom Leonard, executive
vice president of business development at Pentaho, in a statement. “The
Pentaho BI Suite is a perfect complement and we are excited to partner
with Cloudera to make it easier for organizations of all sizes to
integrate additional data sources and enable a wider population of
users to realize value via analysis, reporting and dashboards – either
on premise or via the cloud.”
Cloudera is also announcing the creation of two new open source
projects as part of Cloudera’s Distribution for Hadoop. The company is
releasing Flume, its data loading infrastructure, and its Hadoop User
Environment (HUE) code under the open source Apache License 2.0. These
additions simplify data acquisition and make it much easier to build
attractive user interfaces for Hadoop applications.
“As organizations increasingly struggle to extract value from an
ever expanding sea of data, more and more of them are turning to
Hadoop,” said Stephen O'Grady, an analyst with RedMonk, in a statement.
“Cloudera's new offerings lower the barrier to entry for enterprises
looking to deploy Hadoop in production environments.”
“We’ve been working with customers to help them use Hadoop to solve
various problems,” Olson said. “Hadoop on its own is not enough to
tackle the big data analysis problems and other problems they face.”
Thus, the eight additional projects address important requirements
that organizations have and ease the adoption of Hadoop, Olson said.
The additional projects in CDH v3 are Hive, HBase, Sqoop, Oozie, Flume,
ZooKeeper, Pig and HUE.
These projects address deployment requirements in the areas of data
integration, workflow, scheduling, high-level languages, serialization,
UI, fast read/write and remote procedure call (RPC). All of these
components were selected because they dramatically simplify Hadoop
deployment. All are integrated and tested together at scale, Olson said.
“Cloudera's Distribution for Hadoop provides Apollo Group with the
key functionality we need to take full advantage of Hadoop for
analyzing our academic data,” said Satish Menon, senior vice president
and head of Apollo’s Silicon Valley R&D Center. “With Cloudera's
distribution we can get to critical insights faster.”
“We teamed up with Cloudera because we’re both committed to making
Hadoop accessible and easy to use,” said Martin Hall, CEO at
Karmasphere. “Cloudera Enterprise takes the pain out of operating
large, complex Hadoop clusters and simplifies the entire process with
its new data management tools.”
“Greenplum Chorus is a next generation data collaboration platform
which is well complemented by the Cloudera Distribution for Hadoop,”
said Scott Yara, president of Greenplum, in a statement. “We are
increasingly seeing customers deploy the Cloudera Distribution for
Hadoop alongside Greenplum products for data staging, processing and
MapReduce analytics. Cloudera's expansion in the scope of what defines
a Hadoop platform is exciting and better enables Hadoop users
everywhere.”