EMC's answer to the Oracle Exadata and IBM Netezza servers is the Greenplum Data Computing Appliance.
As amazing as it seems, EMC, the world's
largest independent data storage and protection company, really didn't have a
competitive data warehousing choice in its product portfolio until it bought
Greenplum in July.
That part of the market, which historically has belonged to Teradata (currently
about 70 percent) and, to a lesser extent, Netezza (bought
by IBM on Sept. 20 for $1.7 billion), somehow has eluded EMC all these
years.
Following the Greenplum buy, EMC also partnered
with Cloudera to handle some other "big data" business, and appeared
to be set in the data warehousing (DW) world for at least a few years.
Still, EMC had no new DW products until now. Filling this gap, EMC on Oct. 13
launched its generically named Greenplum Data Computing Appliance, a data
warehousing system that integrates with a data center and will serve as a
marketplace answer to the Oracle Exadata and IBM Netezza DW servers.
The hardware in each Greenplum rack includes 16 commodity servers, 192 Intel
cores and Ethernet connectivity. These are high-transactional machines that
handle huge workloads, and they are far from inexpensive. Appliances in this
class are said to cost at least $1 million per rack, with Exadata
apparently the highest-priced at $1.5 million.
The Greenplum-developed appliance will serve as the first product of the new
EMC Data Computing Products Division, which is led by former Greenplum CEO and
co-founder Luke Lonergan.
Lonergan described Greenplum as EMC's new "key enabler of 'big data' cloud
systems," which include self-service health care and financial and scientific
analytics.
"This will allow organizations to store, manage and closely analyze
terabytes of detailed data for faster business insight, conclusions and
revelations," Lonergan told eWEEK.
The Greenplum appliance has significance beyond the release of a new high-end
data center product, Lonergan said.
"This really marks a stepping out of EMC and VMware into data computing,"
Lonergan said. "It's not just about storing the data; it's about using the
data. That's really what's been behind Greenplum selling our data warehouse
since 2006."
The Greenplum Data Computing Appliance, which runs the parallel Greenplum
Database 4.0, has been tested at a data-loading rate of 10TB an hour.
This is twice as fast as Oracle Exadata and five times faster than the best
systems from Netezza and Teradata, Lonergan said.
"Exadata just doesn't work," Lonergan said.
There are three main things that set Greenplum apart from Exadata and Netezza, Lonergan
said.
"These are: scalability from one rack to 24 racks with one call to EMC, and
it will do that while everything [is] online," Lonergan said. "That
would be from 36TB in one rack to low single-digit petabytes, uncompressed.
This all scales online; that's a key.
"Secondly, it uses FCoE [Fibre Channel over Ethernet], a converged
networking stack [with 10 Gigabit Ethernet] with 16 FC connections from each
rack that can be used to connect into your existing SAN [storage area network].
This enhances the appliance for high availability.
"The third piece of this is that it's private-cloud ready; it's
virtualization-capable," he said. "It snaps into existing VMware deployments."
Lonergan said Greenplum has already shipped seven of these systems and expects
to ship several hundred more in the coming quarters.
"This fits into the appliance category, but at the same time it leverages
all the existing investments that the EMC customers [have] in their core storage area
network," Lonergan said.
The EMC Greenplum Data Computing Appliance is available
in flexible half-rack [eight boxes], full-rack and multiple-rack appliance
configurations for terabyte- to petabyte-scale requirements. It is natively integrated with
EMC's replication, backup and recovery and deduplication software.
Is EMC moving into the HPC world?
"The Greenplum Database software stores a large amount of structured data in
[a] format that allows queries and other access methods to complete much faster
than if this 'big data' was stored in a traditional relational database,"
Enterprise Strategy Group analyst Brian Babineau told eWEEK.
"This is not an HPC play for EMC; it is a horizontal market opportunity
that spans multiple industries where large data warehouses and business
intelligence systems support critical operations. Oracle is targeting this
market with Exadata, and IBM acquired Netezza for similar reasons: these warehouses
are getting so big that response times are not satisfactory, and each vendor is
trying to solve it with an integrated system," Babineau said.
"The difference is that the integrated systems are built using different
components. EMC is using a unique, new database approach. Oracle is shifting
some of its software intelligence to the hardware. IBM/Netezza built an
integrated system that changes where the analytics are actually executed."
Babineau said that, in the context of "databases," customers usually view
EMC as a premier storage systems supplier.
"EMC always had a system configuration that could address any size
database with specific performance requirements," he said. "Now, EMC
owns the database software that can help address the performance and operational
challenges that customers have today. The DCA is significant because it
combines the new database, Greenplum, with the storage into a single, yet
extremely scalable system.
"And that storage is not your traditional EMC storage (i.e., Symmetrix or Clariion); it
is server-attached disk [storage] that is all managed and optimized by the
Greenplum database. The bottom line is EMC is finding new ways to solve
database performance issues outside of selling more of the same."
Greenplum Database 4.0 is now shipping as a licensed software-only product for
deployment on industry-standard x86 hardware and integrated infrastructure
packages, such as the Virtual
Computing Environment coalition's Vblock cloud infrastructure packages.
Vblocks are preintegrated, preconfigured computing systems consisting of
networkware from Cisco Systems, storage, security and system management from
EMC and virtualization software from VMware. The resulting cloud computing
systems will range in size from hundreds of virtual machines to more than 6,000 virtual machines,
depending upon the needs of the customer.
Chris Preimesberger was named Editor-in-Chief of Features & Analysis at eWEEK in November 2011. Previously he served eWEEK as Senior Writer, covering a range of IT sectors that include data center systems, cloud computing, storage, virtualization, green IT, e-discovery and IT governance. His blog, Storage Station, is considered a go-to information source. Chris won a national Folio Award for magazine writing in November 2011 for a cover story on Salesforce.com and CEO-founder Marc Benioff, and he has served as a judge for the SIIA Codie Awards since 2005. In previous IT journalism, Chris was a founding editor of both IT Manager's Journal and DevX.com and was managing editor of Software Development magazine. His diverse resume also includes: sportswriter for the Los Angeles Daily News, covering NCAA and NBA basketball, television critic for the Palo Alto Times Tribune, and Sports Information Director at Stanford University. He has served as a correspondent for The Associated Press, covering Stanford and NCAA tournament basketball, since 1983. He has covered a number of major events, including the 1984 Democratic National Convention, a Presidential press conference at the White House in 1993, the Emmy Awards (three times), two Rose Bowls, the Fiesta Bowl, several NCAA men's and women's basketball tournaments, a Formula One Grand Prix auto race, a heavyweight boxing championship bout (Ali vs. Spinks, 1978), and the 1985 Super Bowl. A 1975 graduate of Pepperdine University in Malibu, Calif., Chris has won more than a dozen regional and national awards for his work. He and his wife, Rebecca, have four children and reside in Redwood City, Calif.