Sun Microsystems and open-source database maker Greenplum on July 26 introduced a data warehouse appliance built from open-source software and general-purpose systems that the companies claim is much faster and cheaper than comparable proprietary systems, a Sun executive told eWEEK.
Called simply the Data Warehouse Appliance, the new product is powered by Greenplum's own distribution of the open-source PostgreSQL database, Bizgres MPP, and Sun's open-source Solaris 10 operating system.
It is housed in Sun's new “Thumper” Sun Fire X4500 data server, a powerful 48-disk-drive unit that uses dual-core, 64-bit AMD Opteron processors.
“We're calling this the iPod for DW appliances,” Sun's CIO for business intelligence and data warehousing Cyrus Golkar told eWEEK.
“It is 10 times faster and 10 times cheaper than other proprietary systems like it on the market, and we think we're being conservative when we say that,” Golkar added.
The appliance will be available later this quarter, Golkar said. Initial configurations will deliver usable database capacities of 10, 40 and 100TB. Pricing for the 40TB and 100TB configurations begins at $15,000 per terabyte, and pricing for the 10TB configuration starts at $25,000 per usable terabyte.
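As a rough illustration of what those starting rates imply for each configuration, the arithmetic can be sketched as follows. This is a minimal sketch, assuming the quoted per-terabyte rate applies uniformly to the full usable capacity; the `RATE_PER_TB` table and the `starting_price` function are hypothetical names used only for illustration, not a Sun or Greenplum pricing tool.

```python
# Starting rates quoted in the article, in USD per usable terabyte.
# Assumption: each rate applies uniformly to the full usable capacity.
RATE_PER_TB = {10: 25_000, 40: 15_000, 100: 15_000}

def starting_price(capacity_tb: int) -> int:
    """Return the implied starting price in USD for a quoted configuration."""
    return capacity_tb * RATE_PER_TB[capacity_tb]

for tb in sorted(RATE_PER_TB):
    print(f"{tb}TB: ${starting_price(tb):,}")
```

Under that assumption, the entry points work out to $250,000, $600,000 and $1.5 million, respectively.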
Ideal industries for this appliance include telecommunications, financial services, retail and Internet services, a Sun spokesperson said.
“Other more expensive systems have pipes like water hoses, for example,” Golkar said. “This one has a lot more stamina; it's more like Niagara Falls.”
As for pricing, Greenplum CEO Scott Yara said there will be basically no comparison to other competitors.
“Right now, it's costing companies anywhere from $350,000 to $1 million per usable terabyte of data storage,” Yara told eWEEK.
“Because of the way this appliance is designed and built [using non-proprietary database and operating systems] and because it scales so well [10TB up to 100TB], we have gotten the price down to about $20,000 per usable terabyte. Actually, some of the configurations start out at less than that.”
What's under the hood
The Data Warehouse Appliance's operating system, Solaris 10, includes Sun's new file system, ZFS 1.0. Solaris ZFS is based on a transactional object model that removes most of the traditional constraints associated with I/O operations, resulting in performance gains, Golkar said.
In addition, the appliance offers the following key features, according to Sun and Greenplum:
- Massively parallel processing: This is made possible by 64-bit Opteron processors with Direct Connect Architecture, which use a high-performance interconnect. Sun claims performance 10 to 50 times faster than traditional data warehouse systems in both query and data loading.
- Open source transparency.
- Smaller footprint (up to 50TB per rack) and lower power requirements (4.5kW per rack).
- Integrated, turnkey appliance.
- Improved failover and mirroring capabilities, with dynamic provisioning of additional nodes to scale to hundreds of terabytes if needed.
- Support: Sun includes global support operations.
The appliance's software is capable of scanning 1TB of data in 60 seconds and can scale to hundreds of terabytes of usable database capacity. The data warehouse system is also energy efficient, using only 90W per terabyte, Golkar said.
“Greenplum's mission is to enable companies to manage massive amounts of information, and to make it all useful. This is the first appliance in the industry that can actually help to make that possible,” said Yara. “Sun is the perfect company to revolutionize the data warehouse appliance market and finally deliver on the promise of business intelligence in the enterprise.”
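The throughput and power figures can be cross-checked with some quick arithmetic. The sketch below assumes decimal units (1TB = 1,000GB) and that the per-rack numbers in the feature list describe a fully populated rack; the constant and variable names are illustrative only.

```python
# Figures quoted in the article.
SCAN_TB = 1            # terabytes scanned
SCAN_SECONDS = 60      # time taken, in seconds
RACK_CAPACITY_TB = 50  # "up to 50TB per rack"
RACK_POWER_KW = 4.5    # "4.5kW per rack"

# Assumption: decimal units, 1TB = 1,000GB.
scan_rate_gb_per_s = SCAN_TB * 1000 / SCAN_SECONDS

# Power per terabyte for a full rack, in watts.
watts_per_tb = RACK_POWER_KW * 1000 / RACK_CAPACITY_TB

print(f"Aggregate scan rate: {scan_rate_gb_per_s:.1f} GB/s")
print(f"Power per terabyte: {watts_per_tb:.0f} W")
```

On these assumptions the scan claim implies an aggregate rate of roughly 16.7GB per second, and the per-rack figures work out to the same 90W per terabyte that Golkar cited.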
Baseball Was the Common
Major League Baseball was the common customer that brought Sun and Greenplum together several months ago into the partnership that eventually produced the new appliance, Golkar said.
“When Jonathan [Schwartz, Sun's CEO] found out about what Greenplum could do, he blogged that he thought Greenplum was one of the smartest startups he'd seen,” Golkar said.
“Then the wheels started turning; we already had Thumper, and we decided to use Greenplum's software, so our group started working overtime and through holidays to get this done.”
Justin Shaffer, senior vice president of new media for MLB, said that his organization needs to “collect data about every single pitch, over the course of 2,430 games each year, [so] the power necessary to analyze and make available that data has become incredibly important.”
“The new data warehouse appliance has the potential to open up new ways in which MLB.com can share information with our customers and partners,” Shaffer said.
IT publisher Tim O'Reilly said that the new appliance can transform the way companies think about handling big data.
“As O'Reilly continues to harness the collective intelligence of the Web to better understand the technology industry and where it's headed, the ability to quickly and cost-effectively analyze large volumes of data is critical,” said Tim O'Reilly, CEO of O'Reilly Media.
“In the new era of Web 2.0, data will become the next Intel Inside. Together, Sun and Greenplum are creating technologies based on open source that will help power the next generation of companies and services.”
But will companies trust it?
How hard will it be for mainstream companies to believe such an open source/general standards storage system at such an inexpensive price will be reliable enough to do daily, often mission-critical production work?
“There's little question in my mind that the specific combination of hardware and software will present certain educational challenges to a potential customer, but there are several factors that will mitigate potential skepticism,” analyst Stephen O'Grady of RedMonk in Denver, Colo., told eWEEK.
“First is the open-source pedigree of the database. While it does not have the traction that MySQL does, the Postgres core of the Greenplum software is very well regarded,” O'Grady said.
“Second are the almost universally positive reviews that Sun's latest x64 entries are getting. Last, the price. Paradoxically, the low cost, which is potentially a concern for some customers, could ultimately be its strongest selling point,” O'Grady said.
Greenplum, based in San Mateo, Calif., is only three years old and not a well-known entity, even though the PostgreSQL database has built a good reputation. Does that cause a problem of uncertainty among potential enterprise customers?
“Sun's reputation is, I think, a positive factor here in that: a) they have a reputation for being able to sell high-end systems (a plus when you're competing against Teradata), and b) they have some good momentum and mind share working for them with their new hardware lines,” O'Grady said.
“This offering is an intriguing one from a storage market perspective. It represents a collapsing of the server and storage layers, so that the application can be run right on top of the storage.”