Big Data Giants Flex at Strata + Hadoop World

Leaders in the big data arena announced new products, programs and partnerships at the recent Strata + Hadoop World 2014 conference in New York.

The recent Strata + Hadoop World 2014 conference brought out the luminaries of the big data world, with several announcements around Apache Hadoop and other key big data- and analytics-related technologies.

Business intelligence software provider Tableau Software made a splash by announcing expanded support for Hadoop technologies with new partners including IBM and Amazon Web Services. For instance, Tableau revealed new direct connection capabilities with IBM’s Hadoop-driven InfoSphere BigInsights.

In addition, Tableau announced that it has released in beta direct connectors for Amazon Web Services’ (AWS) Elastic MapReduce and for Spark SQL, and that it has qualified for Databricks’ “Certified on Spark” program. These developments, combined with Tableau’s recent release of a direct connection to MarkLogic’s Enterprise NoSQL database platform, continue to strengthen Tableau’s position in the big data landscape.

“Tableau is helping to drive the rapidly innovating Hadoop landscape,” said Dan Jewett, vice president of product management for Tableau Software, in a statement. “Our integrations with technology partners in the Hadoop and NoSQL space as well as our efforts to support the Apache open source community all stem from our mission to put the rich visual analytics capabilities of Tableau into the hands of everyone, even those with billions of rows of data.”

Tableau Software’s work to build direct connectors to Amazon Elastic MapReduce and IBM’s InfoSphere BigInsights adds to its previous integrations with MapR, Cloudera, Hortonworks and Pivotal as it seeks to provide a wide array of options for customers to unlock the power of their Hadoop deployments.

Meanwhile, Hortonworks announced version 2.2 of its Hortonworks Data Platform (HDP), an enterprise-ready data platform with Hadoop YARN as the architectural center.

In all, HDP 2.2 comprises more than 100 new and advanced features that integrate with YARN and enable enterprises to simultaneously utilize batch, interactive and real-time methods to interact with a single set of data stored within Hadoop.

“HDP 2.2 reflects the incredible amount of innovation that has occurred within the Apache Hadoop community in the past six months,” said Tim Hall, vice president of product management at Hortonworks, in a statement. “We listened to our customers, worked tirelessly within the various Apache projects to develop hundreds of new features and remained consistent in delivering all of our technology and product innovations back to the community. As a result, HDP 2.2 brings a tremendous amount of enterprise-ready features to the platform while adhering to enterprise requirements.”

Core new functionality in HDP 2.2 includes new and improved YARN-ready engines, such as enterprise-ready Spark on YARN for data science and Apache Kafka for processing data from the “Internet of things.” It also features enterprise SQL at Hadoop scale, as well as Apache Argus for centralized security administration and policy enforcement. For business continuity, there is automated cluster backup to the cloud for Microsoft Azure and Amazon S3.

“For the past few years, Microsoft has worked closely with Hortonworks to contribute back to the Hadoop community and bring Hadoop to Microsoft Azure through HDInsight and Windows Server with HDP,” said T. K. “Ranga” Rengarajan, corporate vice president of Data Platform at Microsoft, in a statement. “The introduction of HDP 2.2 gives customers even more deployment options including the ability to automatically replicate their on-premise data in the Azure cloud or to spin up a HDP cluster as a multi-node virtual machine.”