EMC is partnering with Cloudera to help organizations manage and analyze big data.
The two companies plan to integrate Cloudera's Distribution for Hadoop (CDH) with the data warehousing and analytics technology EMC acquired when it bought Greenplum. CDH is used for collecting, consolidating and analyzing data. The integration will pair CDH with EMC's Greenplum massively parallel processing database to "provide a robust architecture for collaborative analysis of large amounts of structured and unstructured data," EMC announced.
According to Cloudera, the connector between the two products will be supported by both Greenplum and Cloudera and will enable high-speed bidirectional data transfers between the two systems.
In addition, data staged by CDH will be integrated with the EMC Greenplum Chorus platform.
"Customers can use Cloudera's Distribution for Hadoop to inexpensively stage complex and structured data, while Greenplum Chorus utilizes its cloud-based platform to extract data from a variety of sources and enables collaborative analysis for many users," said Michael Olson, CEO of Cloudera, in a statement.
Cloudera recently announced a similar partnership with Teradata to connect CDH with the Teradata data warehouse.
"Teradata customers are using Apache Hadoop as a method of processing primarily high volumes of unstructured data," said Scott Gnau, chief development officer at Teradata, in a statement Sept. 15 announcing the deal. "It wasn't long before my phone rang and those customers wanted to integrate these new insights into their data warehouses. Teradata Labs has been working on this, and our efforts led us to Cloudera."
Bringing EMC and Cloudera solutions together creates a powerful tool for collaborative data analysis, Bill Cook, senior vice president and general manager of EMC's Data Computing Products Division, said in a statement.
"EMC and Cloudera represent a powerful combination of what we can deliver to customers," he said. "By bringing together our solutions, our customers have a powerful tool for collaborative data analysis and can more quickly and effectively analyze data from a variety of sources."