EMC to Acquire Data Warehouse Analytics Provider Greenplum

 
 
By Chris Preimesberger  |  Posted 2010-07-06
 
 
 
 
 
 
 

UPDATED: Data storage specialist EMC plans to acquire Greenplum, maker of massively parallel processing SG Streaming technology designed to eliminate the bottlenecks associated with other approaches to data loading.

Storage giant EMC revealed July 6 that it is acquiring privately held Greenplum, which provides next-generation data warehousing software and self-service, cloud-based analytics for enterprises.

Terms of the all-cash transaction were not disclosed, but EMC did say it expects the deal to close in September.

Greenplum, whose main market is enterprises with large amounts of data to store in cloud deployments, will form the foundation of a new data computing product division within EMC's Information Infrastructure business, Chuck Hollis, an EMC vice president and global marketing CTO, told eWEEK.

Greenplum's MPP (massively parallel processing) SG Streaming (Scatter/Gather Streaming) "secret sauce" is designed to eliminate the bottlenecks associated with other approaches to data loading. The company follows a parallel-everywhere approach to loading, in which data flows from one or more source systems to every node of the database.

Greenplum's software is capable of delivering 10 to 100 times the performance of traditional database software, EMC said. Data-driven businesses that include NASDAQ OMX, NYSE Euronext, Skype, Equifax, T-Mobile and Fox Interactive Media currently use Greenplum for its cloud-based high-performance data analytics service.

Greenplum's approach differs from the traditional bulk loading technologies used by most mainstream database and MPP appliance vendors, which push data from a single source, often over a single channel or a small number of parallel channels. That approach can, and often does, result in bottlenecks and longer load times.
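To make the contrast concrete, here is a minimal Python sketch of the two loading patterns described above. It is an illustration only and is not Greenplum's implementation: the node count, the hash-based row distribution and every function name are assumptions made for the example, and Python threads are used simply to show the data flow, not to demonstrate a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32

NUM_NODES = 4  # hypothetical number of database nodes


def node_for(key: str) -> int:
    """Pick a target node by hashing the row key (an assumed distribution policy)."""
    return crc32(key.encode("utf-8")) % NUM_NODES


def scatter(source_rows):
    """Scatter step: split one source's rows into per-node batches."""
    batches = {n: [] for n in range(NUM_NODES)}
    for key, value in source_rows:
        batches[node_for(key)].append((key, value))
    return batches


def parallel_load(sources):
    """Every source scatters concurrently; each node then gathers its batches."""
    nodes = {n: [] for n in range(NUM_NODES)}
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        # pool.map yields results in source order, so the outcome is deterministic.
        for batches in pool.map(scatter, sources):
            for n, rows in batches.items():
                nodes[n].extend(rows)  # simulated gather on the receiving node
    return nodes


def single_channel_load(sources):
    """Baseline: all rows funnel through one loader, one row at a time."""
    nodes = {n: [] for n in range(NUM_NODES)}
    for source_rows in sources:
        for key, value in source_rows:
            nodes[node_for(key)].append((key, value))
    return nodes


if __name__ == "__main__":
    # Three made-up source systems with 50 rows each.
    sources = [[(f"s{s}-r{i}", i) for i in range(50)] for s in range(3)]
    loaded = parallel_load(sources)
    print({n: len(rows) for n, rows in loaded.items()})
```

Both functions place the same rows on the same nodes. The difference is that in the parallel version no single loader sits between the sources and the nodes, which is the bottleneck the single-channel approach tends to create.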

"There's always a bottleneck in those data warehouses, whether it's in the database, the servers, or the storage," analyst Brian Babineau of Enterprise Strategy Group told eWEEK. "Everybody tries to solve those bottlenecks in a different way. And it's easy to blame the storage, because disk drives tend to be the slowest part of the bottleneck.

"The reality is that EMC does not want to give up that business [storage and database optimization software] to the likes of Oracle or other folks because it's just a storage player. Now they have Greenplum, ideally suited for x86 environments, and which distributes workloads very well among shared storage resources."

Greenplum, which runs only on x86 open systems and uses the open source PostgreSQL database, fits right into EMC's overall "big data" plans, Babineau said.

"The second angle here is that EMC has deployments on the back ends of a lot of data warehousing systems," Babineau said.

Greenplum has challenged established vendors such as Oracle, Teradata and Netezza, and has been successful in only seven years of existence.

"The data warehousing world is about to change," said Pat Gelsinger, president and chief operating officer of EMC Information Infrastructure Products. "Greenplum's massively parallel, scale-out architecture, along with its self-service consumption model, has enabled it to separate itself from the incumbent players and emerge as the leader in this industry shift toward 'big data' analytics."

In acquiring Greenplum, EMC saw an opportunity for the storage market to evolve, EMC's Hollis told eWEEK.

"Put it all together: big data, billions of records, the new mandate to make real-time analytics a weapon, the advent of fully virtualized environments, self-serve analytics and people who are good knowledge workers," Hollis said. "This is not about doing what was done previously, better. This is about entirely new use cases for big data. We're betting on the future rather than trying to monetize the past."

'Good synergy' developed over time

The two companies kept running into each other in various deployments during the last two years or so, and eventually a good synergy developed, Greenplum co-founder and President Scott Yara told eWEEK.

"The alignment was so close in a number of ways: in terms of how we viewed the importance of data, the idea of moving processing closer to where the data lives, and the role that virtualization and private cloud computing is going to play in data analytics," Yara said. "The idea came that maybe we should join forces. We decided that it was either going to happen very quickly, or that we would just keep going, because it was going very well."

Greenplum employs about 140 people in the San Francisco Bay Area.

"We believe so much in this idea [of moving processing and data closer together for performance efficiency], that Greenplum will be the nucleus of a whole new EMC products group," Hollis said. "Much like the way Data Domain [2009] and RSA [2006] came in, when we built entire product divisions around them, we're going to ask the Greenplum leadership team to do the exact same thing for us."

Babineau said 2010 could be a breakout year in data warehousing.

"This is a very interesting space," Babineau said. "The two biggest companies in it, Teradata and Netezza, are totaling about $2 billion in trailing 12-month revenue ... Teradata about $1.7 billion and Netezza about $203 million.

"There is clearly a lot of money being spent in this area, and EMC wants its fair share of this stuff."

Editor's Note: eWEEK Senior Writer Brian Prince contributed to this report.


 
 
 
 
Chris Preimesberger was named Editor-in-Chief of Features & Analysis at eWEEK in November 2011. Previously he served eWEEK as Senior Writer, covering a range of IT sectors that include data center systems, cloud computing, storage, virtualization, green IT, e-discovery and IT governance. His blog, Storage Station, is considered a go-to information source. Chris won a national Folio Award for magazine writing in November 2011 for a cover story on Salesforce.com and CEO-founder Marc Benioff, and he has served as a judge for the SIIA Codie Awards since 2005. In previous IT journalism, Chris was a founding editor of both IT Manager's Journal and DevX.com and was managing editor of Software Development magazine. His diverse resume also includes: sportswriter for the Los Angeles Daily News, covering NCAA and NBA basketball, television critic for the Palo Alto Times Tribune, and Sports Information Director at Stanford University. He has served as a correspondent for The Associated Press, covering Stanford and NCAA tournament basketball, since 1983. He has covered a number of major events, including the 1984 Democratic National Convention, a Presidential press conference at the White House in 1993, the Emmy Awards (three times), two Rose Bowls, the Fiesta Bowl, several NCAA men's and women's basketball tournaments, a Formula One Grand Prix auto race, a heavyweight boxing championship bout (Ali vs. Spinks, 1978), and the 1985 Super Bowl. A 1975 graduate of Pepperdine University in Malibu, Calif., Chris has won more than a dozen regional and national awards for his work. He and his wife, Rebecca, have four children and reside in Redwood City, Calif. Follow on Twitter: editingwhiz
 
 
 
 
 
 
 
