Little MapR Joins Big EMC in New Hadoop Distribution

By Chris Preimesberger  |  Posted 2011-05-25

Big Data specialist will become a key part of EMC's Greenplum HD Enterprise Edition, a 100 percent interface-compatible implementation of the Apache Hadoop software stack.

Little MapR Technologies on May 25 revealed a new software licensing agreement with data storage and security giant EMC to add its intellectual property to EMC's new Apache Hadoop analytics distribution.

San Jose, Calif.-based MapR will become a key part of EMC's Greenplum HD Enterprise Edition, a 100 percent interface-compatible implementation of the Apache Hadoop software stack. The new appliance will use MapR Technologies' clustering IP for the pre-integrated and tested distribution.

Apache Hadoop, created by former Apple, Xerox PARC and Yahoo developer Doug Cutting, is an open-source software framework built in Java that works with distributed data-intensive applications. It enables applications to scale securely in order to handle thousands of nodes and petabytes of data.

Although a number of Hadoop distributions are available, they don't all deal with issues such as single points of failure, lack of snapshots and mirroring, and poor performance -- the problems MapR's technology is designed to address.

MapR's Feature Set

CEO John Schroeder gave eWEEK an overview of MapR's feature set. It includes:

  • NFS direct access, which allows users to use the NFS protocol to simply load and access data directly in a Hadoop cluster and enables standard tools and utilities to work directly on data contained in Hadoop.
  • Heatmap user interface to provide full cluster visibility and control.
  • All single points of failure are eliminated in the Hadoop stack.
  • JobTracker High Availability ensures continuous job execution.
  • Distributed NameNode with High Availability addresses a major reliability issue while also improving performance and scale.
  • Snapshots allow point-in-time data protection and recovery.
  • Mirroring for business continuity includes wide area replication support.

"This is a major advancement for Hadoop users everywhere. MapR's innovations coupled with EMC's big data analytics capabilities and service will allow more people to use the power of big data analytics and enable substantial market growth," said John Webster, Senior Analyst at Evaluator Group.

"MapR has managed to innovate on performance, cost reduction, dependability and ease-of-use all at once. This marks a major shift for the Hadoop market."

Hadoop Inspired by Google's MapReduce

Cutting, now at Cloudera and serving as the chairman of the Apache Software Foundation, has said that Hadoop was inspired by Google's MapReduce (a programming model for processing large data sets in parallel across a cluster's nodes) and Google File System. MapR offers a commercial implementation that is interface-compatible with the open source Hadoop MapReduce.

Hadoop, which is named after Cutting's son's toy elephant, is being maintained and improved by a large global community of contributors. Yahoo, one of the first movers in Hadoop and now the sponsor of a Hadoop developers' conference, has been the largest contributor to the project and uses Hadoop extensively across its own businesses.

"Hadoop has played a leading role in the transformation from traditional data warehousing to big data analytics," Webster said. "EMC's Hadoop commercialization strategy is aimed at streamlining and bulletproofing Hadoop for enterprise users, making Hadoop more of a must-have real-time analytics tool for the enterprise."

