Why pay millions for new big-iron storage and related networking software when you've already got it? From individual power users to enterprises, today's wasted disk space can be harnessed for tomorrow's dynamically scaling storage networks, retiring the appliance and monolithic approaches for all but high-end and specialty needs, researchers say.
Topologies for such a network might include a decentralized file sharing program that is more akin to the Nullsoft Inc.-founded Gnutella network than to Napster Inc.'s original peer-to-peer file sharing service. They also might look like an in-house version of the more-serious-sounding grid-computing notion.
Whatever the look and feel, the possibilities of P2P storage are attracting the interest of everyone from government-funded university researchers to startups such as PeerStor Inc. to corporate giants such as Hewlett-Packard Co., EMC Corp., IBM and Microsoft Corp.
“For certain direct-attached architectures, I could leverage underutilized storage assets on other servers without having to develop a SAN [storage area network]” that was initially used for replication and backup, said analyst Tony Prigmore, of Enterprise Storage Group Inc., in Milford, Mass. “There's no reason to believe that for certain small-to-[midsize] enterprises, peer storage couldn't be successful.”
The concept begins in laboratories such as that of Randolph Wang, a computer science professor at Princeton University. Wang's graduate students are working on a project called PersonalRAID to ensure data availability for mobile professionals. It's an alternative to simply carrying all data locally; instead of worrying about constant synchronizing and acquiring the latest storage gadgets such as IBM's Microdrive, such storage could be automatically embedded with Internet connections or via Bluetooth or 802.11 wireless transmissions, said Wang, in Princeton, N.J. When a device needs data, it asks the user's PersonalRAID to find it, first checking the local storage, then nearby devices via Bluetooth, then the local network via 802.11 and finally wide-area resources through the Internet. The query method replaces a central data table, he said.
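The lookup order Wang describes can be pictured as a simple tiered search. The following is only a minimal sketch under stated assumptions, not PersonalRAID's code: the `find_data` function and the dictionaries standing in for each storage tier are hypothetical names invented for illustration.

```python
from typing import Callable, List, Optional, Tuple

# Each tier is a (name, lookup) pair; lookup returns the data or None.
Tier = Tuple[str, Callable[[str], Optional[bytes]]]

def find_data(key: str, tiers: List[Tier]) -> Optional[Tuple[str, bytes]]:
    """Query each tier in order; return (tier name, data) for the first hit."""
    for name, lookup in tiers:
        data = lookup(key)
        if data is not None:
            return name, data
    return None  # no tier holds the data

# Hypothetical stand-ins for real storage tiers: plain dictionaries.
local_disk = {"notes.txt": b"draft"}
bluetooth_peers = {"photo.jpg": b"jpeg-bytes"}
lan_hosts = {"report.doc": b"q2 report"}
internet_hosts = {"archive.tar": b"old files"}

# Nearest and cheapest tier first, widest tier last.
tiers = [
    ("local", local_disk.get),
    ("bluetooth", bluetooth_peers.get),
    ("802.11 LAN", lan_hosts.get),
    ("internet", internet_hosts.get),
]
```

Because each device is queried in turn, no single machine has to hold a table of where everything lives, which is the property Wang points to.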
“We basically have a pretty real prototype, [but] at this stage, we don't have any relationship with any company. We just hacked it up on our own,” Wang said. The patent-pending technology is years from commercialization. The next step is to make it transparent. “From the user's perspective, it's a single-disk drive,” Wang said.
A more imminent application of P2P storage is for midsize companies. PeerStor is a New York startup taking the ad hoc, miniature-SAN approach. “What our product allows a user to do is install the software and set up the drive location anywhere on the WAN or LAN to … perform automatic failover in real time with open files,” said Joe Pennino, CEO of PeerStor. Only the data changes will be mirrored, but to multiple locations, he said.
The software will cost about $150 per node and will work with a company's networked storage and unused server space, Pennino said. The technology was first developed in DOS 10 years ago but was never made into a product. Now, “we'd like to have this out in the next month or two,” he said.
It will scale to “anybody but the Fortune 1000,” Pennino said.
Large-enterprise implementation of P2P storage is the furthest away, as stated in various papers presented at the FAST Conference on File and Storage Technologies, held earlier this year, and at last month's related Usenix Technical Conference, both in Monterey, Calif. A project called Cooperative Backup System, or CBS, is similar to PeerStor's approach, but it's Internet-based. CBS uses a technology called Reed-Solomon erasure codes, invented at the Massachusetts Institute of Technology's Lincoln Laboratory in 1960, to rebuild data from its parts, even if some parts are destroyed or missing. Because it works best across multiple sites, CBS is most cost-effective for individual users and for very large enterprises, said lead researcher Mark Lillibridge, of HP's Systems Research Center, in Palo Alto, Calif.
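To see how data can be rebuilt from surviving parts, consider a far simpler erasure code than Reed-Solomon: a single XOR parity fragment, which tolerates the loss of any one fragment. This is a swapped-in illustration, not the coding CBS uses, and the `encode` and `recover` names are hypothetical.

```python
from functools import reduce
from typing import List, Optional

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size fragments plus one XOR parity fragment."""
    size = -(-len(data) // k)              # ceiling division
    padded = data.ljust(size * k, b"\0")   # pad so fragments line up
    fragments = [padded[i * size:(i + 1) * size] for i in range(k)]
    return fragments + [reduce(xor_bytes, fragments)]

def recover(fragments: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the original data when at most one fragment is None."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost fragment")
    repaired = list(fragments)
    if missing:
        survivors = [f for f in fragments if f is not None]
        # XOR of all surviving fragments equals the lost one.
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return b"".join(repaired[:-1])[:length]  # drop parity and padding
```

A usage example: `parts = encode(b"important business assets", 4)` yields five fragments that could sit on five partner machines; setting any one of them to `None` and calling `recover(parts, 25)` returns the original bytes. Reed-Solomon codes generalize the idea so that several lost fragments can be tolerated at once.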
While the CBS project lacks speed (it could take two weeks to restore data from it, Lillibridge said), it excels at security. “You challenge your partners occasionally to make sure they still have your data. If they don't, then you drop their data in retaliation,” after a short buffer period in case the partner machine has crashed, he said.
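The challenge-and-retaliate policy Lillibridge describes might be sketched as follows. Everything here is an assumption made for illustration: the hash-based challenge, the `Partner` class and the seven-day `GRACE_PERIOD_DAYS` value are hypothetical and do not come from the CBS prototype.

```python
import hashlib
import os
from typing import Callable, Optional

GRACE_PERIOD_DAYS = 7  # hypothetical; the source says only "a short buffer period"

def proof(nonce: bytes, block: bytes) -> str:
    """Digest that can be produced only by someone who holds the block."""
    return hashlib.sha256(nonce + block).hexdigest()

def challenge(my_block: bytes, respond: Callable[[bytes], Optional[str]]) -> bool:
    """Send a fresh random nonce; the partner must hash it with our block."""
    nonce = os.urandom(16)
    return respond(nonce) == proof(nonce, my_block)

class Partner:
    """Tracks one backup partner and whether we still hold their data."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.holding_their_data = True
        self.first_failure_day: Optional[int] = None

    def record_challenge(self, passed: bool, today: int) -> None:
        if passed:
            self.first_failure_day = None      # partner is healthy again
            return
        if self.first_failure_day is None:
            self.first_failure_day = today     # start the grace period
        if today - self.first_failure_day >= GRACE_PERIOD_DAYS:
            self.holding_their_data = False    # drop their data in retaliation
```

The fresh nonce keeps a partner from answering with a stored digest after discarding the data, and the grace period spares a partner whose machine has merely crashed.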
It's unclear when CBS could become a product, though it's been a prototype since the summer of 2000. “I haven't talked to the HP product folks, so I don't know what the story is now. … It might become an open-source project,” Lillibridge said. “It's been on hold due to various chaos with the merger [of HP and Compaq Computer Corp.],” he said.
With related file system research being funded by companies such as IBM and Microsoft, P2P storage, especially for backup and recovery purposes, could become a significant technology before the decade's out, said Peter Christy, an analyst and co-founder of NetsEdge Research Group, in Los Altos, Calif. “The Internet makes data a lot more valuable because you can get to it anywhere, any time,” Christy said. “Will people be using random disk drives all over their company for the storage of important business assets? At the moment I think the answer is no, [but] there's a lot of room for creativity for making that happen.”