Open-Source Projects Target Dispersed Storage Grids, Backup

 
 
By David Morgenstern  |  Posted 2006-08-18
 
 
 



SAN FRANCISCO—With a RAID Level 6 demonstration seemingly on display at every corner, enterprise storage is making itself very evident here at the LinuxWorld Conference & Expo. At the same time, several storage-centric open-source community projects (and their commercialized siblings) look to challenge the established order in backup and redundancy.

On the LinuxWorld floor were the Cleversafe Project, a new wide-area, dispersed storage grid that appears to hosts as a mountable drive, and Zmanda, the commercial version of Amanda (Advanced Maryland Automatic Network Disk Archiver), an open-source network backup and archive system.

Cleversafe currently comprises two projects: Cleversafe Dispersed Storage, the storage grid, and the DSGrid File System, which lets the grid present itself as a mountable file system for Linux-based applications.

Cleversafe uses information dispersal algorithms developed for the project that slice data into pieces. Along with the data slices are "coded slices," which contain parity values that can be used to rebuild the entire original piece of data. These sets of slices, called Storage Slices, are dispersed across the Internet in different locations.

When the stored data is called up, the slices are retrieved from the grid. However, not all the slices are needed; a majority of the sets can recreate the data. For example, in an 11-part grid, only six Storage Slices will be needed to recreate the data.
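The rebuild-from-a-subset idea can be illustrated with a short Python sketch. This is not Cleversafe's algorithm, which the article does not describe in detail. It assumes a textbook polynomial-interpolation scheme over the integers modulo 257, and the names `disperse`, `rebuild` and `_interpolate` are hypothetical. It shows only how any six of 11 slices can recreate the data, not the privacy properties the project claims.

```python
P = 257  # assumption: a prime just above the byte range, so every byte is a valid value


def _interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def disperse(data, k=6, n=11):
    """Split data into n slices: slices 1..k hold data bytes, k+1..n hold coded values."""
    padded = data + bytes(-len(data) % k)  # zero-pad to a multiple of k
    slices = {x: [] for x in range(1, n + 1)}
    for start in range(0, len(padded), k):
        group = list(enumerate(padded[start:start + k], start=1))  # (x, byte) pairs
        for x in range(1, n + 1):
            slices[x].append(_interpolate(group, x))
    return slices


def rebuild(available, length, k=6):
    """Recreate the original bytes from any k of the slices."""
    if len(available) < k:
        raise ValueError("not enough slices to rebuild the data")
    chosen = sorted(available)[:k]
    out = bytearray()
    for pos in range(len(available[chosen[0]])):
        points = [(x, available[x][pos]) for x in chosen]
        out.extend(_interpolate(points, x) for x in range(1, k + 1))
    return bytes(out[:length])


data = b"LinuxWorld 2006 storage grid"
slices = disperse(data)
# Pretend five of the 11 sites are unreachable; six slices remain.
survivors = {x: slices[x] for x in (2, 5, 7, 8, 10, 11)}
restored = rebuild(survivors, len(data))
```

In this sketch, `restored` equals `data` even though slices 1, 3, 4, 6 and 9 were never read, and `rebuild` raises an error if fewer than six slices are supplied.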

According to project members, the dispersed architecture improves data security and privacy and lowers storage costs. Unlike the usual backup architecture, where entire copies of data are put in backup sets and moved about, the dispersed Cleversafe slices can't be used or understood by themselves. The slicing technology itself provides off-site redundancy as well as some degree of privacy and security.

"With copy-based storage, you have the trade-off that more reliability means less security and more cost. With dispersal, you can engineer your level of reliability and it doesn't increase cost because you don't actually store more data, you just disperse it more," said Chris Gladwin, president of Cleversafe, the Chicago-based company that expects to commercialize the technology as a service.


Of course, the performance limit in this case is the speed at which the data can be pulled off the Internet or network. New higher-speed extensions to TCP/IP on the horizon should only improve the potential performance of the Internet storage grid, Gladwin said.

Gladwin said that a future version of the software will poll the storage sites at intervals and determine whether it's faster at any given moment to wait for all slices to come down the pipe or to retrieve fewer slices and rebuild the data using the parity code.
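A decision of this kind might look like the following Python sketch. It is an illustration only, not the project's planned logic: the function `plan_retrieval`, the latency figures and the fixed rebuild cost are all hypothetical, and "all slices" is read here as all six data slices in an 11-slice grid.

```python
def plan_retrieval(latency_ms, data_sites, k, rebuild_cost_ms=0.0):
    """Pick the faster of two options and return (strategy, sites, estimated_ms).

    "direct":  wait for every data slice, then simply reassemble them.
    "rebuild": take the k fastest slices of any kind and rebuild from parity.
    """
    # Waiting for all data slices is as slow as the slowest data site.
    direct_ms = max(latency_ms[site] for site in data_sites)

    # The k quickest sites, whether they hold data slices or coded slices.
    fastest = sorted(latency_ms, key=latency_ms.get)[:k]
    rebuild_ms = max(latency_ms[site] for site in fastest) + rebuild_cost_ms

    if rebuild_ms < direct_ms:
        return ("rebuild", fastest, rebuild_ms)
    return ("direct", list(data_sites), direct_ms)


# Hypothetical poll results in milliseconds; site 3 is badly congested.
polled = {1: 40, 2: 45, 3: 900, 4: 50, 5: 55, 6: 60,
          7: 70, 8: 80, 9: 90, 10: 100, 11: 110}
plan = plan_retrieval(polled, data_sites=range(1, 7), k=6, rebuild_cost_ms=5.0)
```

With these made-up numbers the sketch skips the congested site and rebuilds from sites 1, 2, 4, 5, 6 and 7; if site 3 answered as quickly as its neighbors, it would fetch the six data slices directly.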

According to Gladwin, the calculation overhead is minimal.

"The IDA is all modular arithmetic, which means additions and subtractions, things that computers do real fast. In other words, the dispersal and recreation of data happens in real time. It's faster than wrapping or unwrapping the packet," he said.

The first test version of the software was released in April. A demonstration grid, built using beta software, is currently available for research purposes, Gladwin said. It uses 11 hosting points in North America.



Testing the distributed storage system is a challenge, Gladwin admitted, as well as a vital issue to the project and any commercial ventures that will offer services with the grid architecture.

"We spent most of the summer creating the tools to test the storage grid and we may publish that as another project," Gladwin said.


Meanwhile, on the other side of the show floor was a booth featuring the Amanda Project and Zmanda, its commercial counterpart.

Zmanda executives said its Amanda Enterprise Edition backup software now supports the SUSE Linux Enterprise 10 platform. In addition, the company was made a member of Novell's Market Start channel program.

In the booth, the company was showing some forthcoming additions to the software, such as a rewritten GUI, as well as some new directions.

Chander Kant, Zmanda CEO, said the company was in the process of writing modules that customers can use to back up specific open-source applications. The modules will be independent of the Amanda network-based backup.

"We're calling them Zmanda Recovery Managers," Kant said. With the modules, customers will be able to back up an application using other software, such as Veritas NetBackup, across the network. "But all the APIs will be open, all the format of the disk will be open."


Kant said Zmanda was targeting the needs of businesses running new Web 2.0 applications.

"Dynamically created content from Web 2.0 applications, like that of wikis, is becoming more important, and the big players aren't focusing on backing them up. That's right at the heart of where we are because that data is being generally generated on open-source software using MySQL."

In addition, the company was previewing concepts for the new Zmanda management console, due in the third quarter. It will be Web-based, using AJAX (Asynchronous JavaScript and XML) and Yahoo's recently open-sourced libraries, Kant said.

"We want the console to be simple to use, requiring no previous Linux or Unix knowledge," he said. The company has hired an ex-Apple Computer coder to develop the interface.

"We don't want to be just commoditizing the backup software because we can do it at low cost. We want to be different. Zmanda is going to be an order of magnitude simpler than Veritas NetBackup. We want it to be self-service. Of course, we have professional services too, but we want it to be simple."

ISPs that want to offer backup as a service to their customers are another growing set of adopters, Kant said. A part of the new console design will help these service providers more easily manage these tasks.
