MapR Technologies, which provides a popular Apache Hadoop distribution, announced version 5.0 of its MapR Distribution.
Introduced at the Hadoop Summit, the latest MapR release auto-synchronizes storage, database and search indices to support real-time applications. MapR 5.0 also includes comprehensive security auditing, Apache Drill support, and the latest Hadoop 2.7 and YARN features.
“With the newest release of the MapR Distribution, we continue to lead the market in delivering reliable and real-time Hadoop to the enterprise,” said Anil Gadre, senior vice president of product management at MapR, in a statement. “We help enable the ‘as-it-happens’ business where organizations can shorten their data-to-action cycle. Our product is deployed at customer sites and industries that are highly regulated due to their use of sensitive data, which proves that MapR is architected for enterprise-grade security requirements.”
Jack Norris, chief marketing officer at MapR, said MapR 5.0 is architected for processing big and fast data on a single data platform that enables a new class of real-time applications. Enterprises are increasingly deploying multiple applications on a single Hadoop cluster, he said. In fact, 18 percent of MapR customers are deploying over 50 separate applications on a single cluster.
“Designed as a large-scale batch data analysis system, Hadoop is not often associated with operational analytics or transaction processing,” said Carl W. Olofson, research vice president, data management software research at IDC, in a statement. “Hadoop adds tremendous value for decision management at the strategic and operational levels, but still is emerging as a framework for making tactical decisions ‘in the moment.’ With Hadoop innovations, such as those in MapR 5.0, happening every day, enterprises should consider using Hadoop as a ‘Decision Data Platform’ that functions as a single platform for handling both live operational data and real-time analytics.”
The MapR Distribution including Hadoop version 5.0 extends the MapR Real-time, Reliable Data Transport framework to deliver and synchronize data in real time to external compute engines. The first supported external compute engine is Elasticsearch, which allows full-text search indexes to be synchronized automatically without custom code, MapR said.
“Customers want search indexes automatically synchronized with the latest data updates,” said Jobi George, global partner director at Elastic, the company behind Elasticsearch, in a statement. “The MapR architecture makes this easier for application developers who need to let their end users search for data almost immediately after it is updated.”
The new release also strengthens MapR’s data governance and security with comprehensive auditing of all data accesses, recorded in JSON-format log files. The JSON logs enable extensive reporting and validation, as well as quick analysis with Apache Drill. The auditing adds to the security capabilities MapR already provides for authentication and authorization.
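As a rough illustration of the kind of reporting that JSON audit logs make possible, the following Python sketch tallies accesses per user and lists failed accesses. The record layout and the field names (`timestamp`, `user`, `operation`, `path`, `status`) are assumptions made for this example, not the actual MapR audit schema, and the sample records are invented.

```python
import json
from collections import Counter

# Hypothetical audit records, one JSON object per line. The field names
# below are illustrative assumptions, not the real MapR audit format.
SAMPLE_AUDIT_LOG = """\
{"timestamp": "2015-06-09T10:00:01Z", "user": "alice", "operation": "READ", "path": "/data/sales.csv", "status": 0}
{"timestamp": "2015-06-09T10:00:05Z", "user": "bob", "operation": "WRITE", "path": "/data/orders.json", "status": 0}
{"timestamp": "2015-06-09T10:01:12Z", "user": "alice", "operation": "READ", "path": "/data/hr/payroll.csv", "status": 1}
{"timestamp": "2015-06-09T10:02:30Z", "user": "carol", "operation": "DELETE", "path": "/tmp/scratch.txt", "status": 0}
"""


def parse_audit_lines(text):
    """Parse newline-delimited JSON into a list of dicts, skipping blank lines."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records


def accesses_per_user(records):
    """Count how many audited operations each user performed."""
    return Counter(record["user"] for record in records)


def failed_accesses(records):
    """Return records whose (assumed) status code is non-zero."""
    return [record for record in records if record["status"] != 0]


records = parse_audit_lines(SAMPLE_AUDIT_LOG)
per_user = accesses_per_user(records)
failures = failed_accesses(records)

for user, count in sorted(per_user.items()):
    print(user, count)
for record in failures:
    print("failed:", record["user"], record["operation"], record["path"])
```

In practice an organization would more likely run such queries in Apache Drill, as the announcement suggests; the sketch only shows why a self-describing JSON format lends itself to quick analysis.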
With the new Drill Views feature, organizations gain field-level access controls on unstructured data files. Analysts also gain agile data governance: they can share data sets with custom access permissions via Views, without needing IT intervention for access control. In addition, the comprehensive auditing capabilities in MapR 5.0 let organizations log user activity, which is important both for understanding user behavior and for achieving regulatory compliance. These security and data governance advancements lay the foundation for value-adding partner technologies.
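The idea of field-level access through a view can be sketched conceptually in plain Python. This is not Drill's SQL syntax or API; the `make_view` helper, the field names and the sample records are all hypothetical and serve only to show how a view can expose some fields of a data set while withholding others.

```python
def make_view(allowed_fields):
    """Return a function that projects a record onto the allowed fields only."""
    allowed = set(allowed_fields)

    def view(record):
        # Keep only the fields this view is permitted to expose.
        return {key: value for key, value in record.items() if key in allowed}

    return view


# Invented sample data; "ssn" stands in for a sensitive field.
customers = [
    {"name": "Ann", "region": "EMEA", "ssn": "000-00-0000"},
    {"name": "Raj", "region": "APAC", "ssn": "111-11-1111"},
]

# An analyst shares a view that exposes name and region but not ssn.
regional_view = make_view(["name", "region"])
shared = [regional_view(record) for record in customers]

for row in shared:
    print(row)
```

The underlying records are unchanged; consumers of the view simply never see the withheld field.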
“Agility should be a key requirement in any big data governance strategy,” said Chris Twogood, vice president of product and services marketing at Teradata, in a statement. “Teradata Loom avoids an IT bottleneck by providing Hadoop end users agile mechanisms to find, understand, prepare, secure, and manage data throughout its lifecycle. We see a similar opportunity for agility in the authorization features in ‘Drill Views,’ which empower analysts to quickly share specific elements of their data sets with other analysts.”
Version 5.0 of the MapR Distribution will be available in 30 days.
MapR partners, including Centrify, Dataguise, Datameer, HP Security Voltage, Informatica, Protegrity, Syncsort, Talend, Teradata, Waterline Data and Zaloni, have embraced the new MapR Distribution version 5.0 to implement big data solutions.
“These new data-centric features from MapR couldn’t be more timely, as enterprises are leveraging Hadoop beyond batch analytics and driving new applications that drive up the consumption of personal and private data in Hadoop, but must do so in a protected and auditable way,” said Jeremy Stieglitz, vice president of products at Dataguise, in a statement. “In particular, MapR expanded security for authorization and auditing capabilities in 5.0 fits perfectly with Dataguise support for sensitive data discovery, encryption, masking, and monitoring.”
Andrew Brust, senior director of technical product marketing and evangelism at Datameer, said the new MapR security and data governance capabilities align well with Datameer’s own governance capabilities, announced just last week.
“The MapR 5.0 innovations for field-level access control and auditing are great complements to our secure data views and auditing capabilities,” Brust said. “With our other governance features, including data lineage, impact analysis and data profiling, along with our Native Hadoop architecture, the Datameer and MapR combination has never been stronger.”
Meanwhile, MapR also announced a new software module to accelerate the provisioning and deployment of big data solutions. The new MapR Auto-Provisioning Templates apply software-defined concepts that will enable organizations to quickly deploy a cluster, Norris said.
Moreover, the MapR Auto-Provisioning Templates give organizations the flexibility to deploy purpose-built big data solutions on their hardware infrastructure of choice, whether directly on hardware servers from a variety of vendors, in a virtualized private cloud or with a public cloud provider. The Auto-Provisioning Templates provide the simplicity of appliances, yet also support the hardware diversity that production Hadoop clusters typically require. The Auto-Provisioning Templates also let customers expand their deployment in increments they define and need, rather than in the homogeneous “stair-step” increments that a rack-based appliance requires, Norris said.
“The MapR Auto-Provisioning Templates leverage software-defined concepts to create a new kind of appliance for modern, real-time Hadoop applications,” Gadre said. “Customers are asking for software-defined abilities to address many of their business and IT objectives. We’ve simplified that process and enable organizations to choose the components they want to meet their big data infrastructure needs.”
Auto-Provisioning Templates define the software, network and hardware attributes of a single node, and also support the diverse definitions required across many nodes. They support the deployment of data lakes, data exploration and operational analytics.
Users deploy MapR Auto-Provisioning Templates via the MapR Installer.