LAS VEGAS—EMC has always been able to store lots and lots of files and data. But when it came to the processing of enterprise big data workloads, the company—which didn’t make servers—knew a few years ago that it didn’t have the new-gen capabilities to handle an anticipated wave of new big data use cases.
To little fanfare, it bought a small San Francisco-based startup called Pivotal Labs in March 2012, and after further corporate investment and direction, EMC soon had the foundation for a whole new business: the processing, distribution and storage of large data sets using a full-fledged set of tools delivered as a cloud service.
The acquisition has paid off: Pivotal's tools and services are now used by more than 350,000 developers worldwide.
At EMC World 2015 on May 5, Pivotal added to its signature Big Data Suite by introducing a faster implementation of the Pivotal Greenplum Database and a new implementation of the Apache Spark framework that runs on top of the Pivotal distribution of Hadoop.
Pivotal Big Data Suite is aimed at providing users with better stability, management, security, monitoring, and data processing capabilities in the Hadoop stack. This allows enterprises to offload more high-priority workloads to Hadoop, to store and process large volumes of data at lower cost, and to do so in compliance with policies and regulations.
The company also launched something called the Pivotal Query Optimizer, a cost-based query optimizer for both the Pivotal Greenplum Database and HAWQ, the SQL engine that Pivotal created to run on top of Hadoop. These improvements are designed to help customers manage growing data sets driven by mobile, cloud, social, and the Internet of Things, and to answer complex queries quickly across these data sets.
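The article doesn't describe the Pivotal Query Optimizer's internals, but the idea behind any cost-based optimizer can be sketched: estimate a cost for each candidate execution plan from table statistics, then pick the cheapest. Here is a toy Python illustration of that decision for two join strategies; the cost formulas and per-row weights are invented for the example and are not Pivotal's actual model:

```python
# Toy cost-based plan selection. The weights below are hypothetical,
# chosen only to illustrate how an optimizer compares candidate plans.
import math

def hash_join_cost(left_rows, right_rows):
    # Build a hash table on the smaller input, probe with the larger one.
    build, probe = sorted((left_rows, right_rows))
    return build * 1.5 + probe * 1.0  # hypothetical per-row weights

def sort_merge_join_cost(left_rows, right_rows):
    # Sort both inputs (n log n), then merge them in a single linear pass.
    cost = sum(n * math.log2(max(n, 2)) for n in (left_rows, right_rows))
    return cost + left_rows + right_rows

def choose_join(left_rows, right_rows):
    # Estimate the cost of each candidate plan and return the cheapest.
    candidates = {
        "hash join": hash_join_cost(left_rows, right_rows),
        "sort-merge join": sort_merge_join_cost(left_rows, right_rows),
    }
    return min(candidates, key=candidates.get)

# Joining a 10M-row fact table to a 5K-row dimension table:
# hashing the small side avoids sorting the large one.
print(choose_join(10_000_000, 5_000))  # → hash join
```

A real optimizer does the same comparison across many more dimensions (join order, data distribution across cluster nodes, index availability), using statistics collected from the tables themselves.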
Pivotal Big Data Suite also offers the first version of Pivotal HD based on an Open Data Platform core, and it includes major updates to Apache Hadoop components, among them Apache Spark.