The recent Strata + Hadoop World 2014 conference brought out the luminaries of the big data world, with several announcements around Apache Hadoop and other key big data- and analytics-related technologies.
Business intelligence software provider Tableau Software made a splash by announcing expanded support for Hadoop technologies with new partners including IBM and Amazon Web Services. For instance, Tableau revealed new direct connection capabilities with IBM’s Hadoop-driven InfoSphere BigInsights.
In addition, Tableau announced that it has released in beta direct connectors for Amazon Web Services’ (AWS) Elastic MapReduce and for Spark SQL, and that it has qualified for Databricks’ “Certified on Spark” program. These developments, combined with Tableau’s recent release of a direct connection to MarkLogic’s Enterprise NoSQL database platform, continue to strengthen Tableau’s position in the big data landscape.
“Tableau is helping to drive the rapidly innovating Hadoop landscape,” said Dan Jewett, vice president of product management for Tableau Software, in a statement. “Our integrations with technology partners in the Hadoop and NoSQL space as well as our efforts to support the Apache open source community all stem from our mission to put the rich visual analytics capabilities of Tableau into the hands of everyone, even those with billions of rows of data.”
Tableau Software’s work to build direct connectors to Amazon Elastic MapReduce and IBM’s InfoSphere BigInsights adds to its previous integrations with MapR, Cloudera, Hortonworks and Pivotal as it seeks to provide a wide array of options for customers to unlock the power of their Hadoop deployments.
In all, HDP 2.2 comprises more than 100 new and advanced features that integrate with YARN and enable enterprises to simultaneously utilize batch, interactive and real-time methods to interact with a single set of data stored within Hadoop.
“HDP 2.2 reflects the incredible amount of innovation that has occurred within the Apache Hadoop community in the past six months,” said Tim Hall, vice president of product management at Hortonworks, in a statement. “We listened to our customers, worked tirelessly within the various Apache projects to develop hundreds of new features and remained consistent in delivering all of our technology and product innovations back to the community. As a result, HDP 2.2 brings a tremendous amount of enterprise-ready features to the platform while adhering to enterprise requirements.”
The core new functionalities of HDP 2.2 include new and improved YARN-ready engines such as enterprise-ready Spark on YARN for data science and Apache Kafka for processing data from the “Internet of things.” It also features enterprise SQL at Hadoop scale with Stinger.next, as well as Apache Argus for centralized security administration and policy enforcement. For business continuity, there is automated cluster backup to the cloud for Microsoft Azure and Amazon S3.
“For the past few years, Microsoft has worked closely with Hortonworks to contribute back to the Hadoop community and bring Hadoop to Microsoft Azure through HDInsight and Windows Server with HDP,” said T. K. “Ranga” Rengarajan, corporate vice president of Data Platform at Microsoft, in a statement. “The introduction of HDP 2.2 gives customers even more deployment options including the ability to automatically replicate their on-premise data in the Azure cloud or to spin up a HDP cluster as a multi-node virtual machine.”
Big Data Giants Flex at Strata + Hadoop World
An HDP 2.2 preview is available for download, and general availability to customers is scheduled for November 2014.
“Our alliance with Hortonworks is based on our shared belief that open-source innovation is the best approach for enabling Hadoop in the enterprise,” said Greg Kleiman, director of strategy for Storage and Big Data at Red Hat, in a statement. “By tightly integrating Red Hat solutions with the Hortonworks Data Platform 2.2, we can offer our customers speed and agility in building open hybrid cloud solutions for running their new data life cycle on Hadoop.”
In another announcement from the conference, Cloudera and Red Hat launched an alliance to deliver joint enterprise software solutions including data integration and application development tools, and data platforms.
As part of the alliance, the two companies plan to deliver joint solutions to enterprise customers with cooperative documentation, marketing and support, the companies said. Together, Cloudera and Red Hat will help enterprise customers deploy big data solutions that best suit their needs, whether on premises or in a hybrid or private cloud.
“Through rapid mainstream adoption, Hadoop has become the core of an enterprise data hub that requires a flexible deployment model, robust security, governance and agile development tools to be successful,” said Tim Stevens, vice president of business and corporate development at Cloudera, in a statement. “Our alliance with Red Hat allows for Hadoop workloads to be deployed and managed with the same confidence as other mission-critical workloads to deliver the next wave of big data-based innovation for the enterprise.”
The combined Cloudera and Red Hat solutions offer an open technology stack for enterprises to modernize their traditional data management architecture and deploy Hadoop as the core of their big data infrastructure. The companies plan to work together on cloud-ready data platforms and help enterprises move to the open hybrid cloud with Red Hat Enterprise Linux OpenStack Platform and Sahara integrated with Cloudera Director and Cloudera Enterprise, all managed by Red Hat CloudForms.
They also are working on delivering enterprise-ready data platforms with the integration of Red Hat Enterprise Linux, OpenJDK support and Red Hat Storage Server with Cloudera Enterprise, Cloudera Manager and Cloudera Navigator. And they are delivering data integration and application development tools with Red Hat JBoss middleware and OpenShift by Red Hat integrated with Cloudera Enterprise that leverages the Cloudera Kite libraries, Cloudera Impala and Apache Hive connectors.
“Red Hat believes the data life cycle in the enterprise is changing rapidly and requires an open, agile approach to innovation in the data center,” said Scott Musson, vice president of global strategic alliances at Red Hat, in a statement. “Hadoop and OpenStack are key components to this disruption. Red Hat customers are looking for choices in open software solutions for big data and hybrid cloud to help them easily and quickly transform their infrastructure and applications. With this announcement, Red Hat and Cloudera intend to give customers an open and modular technology stack to quickly derive new insights from their data, optimize their existing investment in platform infrastructure and lower the overall cost of managing data platforms.”