As envisioned, Honeycomb will handle large-scale archiving of collections of static data. The product, which has been under development at Sun for some time, is actually a portfolio of technologies that work together, including a NAS (network-attached storage) stack and assorted file storage and archiving functions. Together, these technologies make up Honeycomb, a product designed to keep better track of where large amounts of data are stored and to search and retrieve files quickly.
The time is right for a product like Honeycomb, said Mike Davis, senior product manager for Honeycomb at Sun, of Santa Clara, Calif.
"It's reflective of a new set of requirements for large-scale file storage," he explained. "It's about understanding how users really want to access their data, and about getting more visibility into these applications as they scale beyond the departmental file sharing application."
To determine how to architect Honeycomb, Sun executives started by asking existing customers how they store their data today. In all cases, customers had some kind of storage system in place—either a tape library, NAS box or series of RAID arrays attached to Linux servers. Each case included a discrete database that stored metadata for the data, allowing users to issue queries to the database to retrieve objects. But to do that, users had to build their own system to search on semantics meaningful to their particular data set.
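The do-it-yourself pattern those customers described can be sketched as follows. This is an illustration only, not any customer's actual system: the table name, fields, and paths are hypothetical, standing in for whatever semantics are meaningful to a given data set.

```python
import sqlite3

# A discrete database that holds metadata for objects kept elsewhere
# (on tape, a NAS box, or RAID arrays). The schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE object_index (
        object_id TEXT PRIMARY KEY,
        location  TEXT,   -- physical path on the storage system
        patient   TEXT,   -- application-specific metadata
        modality  TEXT
    )
""")
conn.execute(
    "INSERT INTO object_index VALUES (?, ?, ?, ?)",
    ("img-001", "/mnt/raid0/studies/img-001.dcm", "patient-42", "CT"),
)

# Retrieval means querying the side database first, then fetching the
# object from whatever location the database returns.
row = conn.execute(
    "SELECT location FROM object_index WHERE patient = ?", ("patient-42",)
).fetchone()
print(row[0])  # /mnt/raid0/studies/img-001.dcm
```

The burden Sun observed falls on the user: every application team designs, populates, and maintains a schema like this themselves, alongside the storage system proper.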
As a result of those customer interviews, the Sun team began to realize that conventional file systems like NFS (Network File System) weren't working as well anymore, given customers' ever-growing data sets and changing requirements.
"The whole concept of the hierarchical file system is losing steam [in large-scale environments]," Davis said. "It's OK for a department doing file sharing on a small NAS box, but when you're talking about storing millions of objects, it becomes very inflexible to try to structure directory paths and file names." Customers have circumvented this problem by creating external databases that keep track of where items are physically stored in the file system, he said.
These discoveries led Sun executives to realize that they should develop a storage system with sophisticated application semantics built into it. Honeycomb does exactly that, allowing users to store metadata with data in the same repository and issue search commands against it. Davis gave the example of medical imaging applications, where data must be stored reliably for a long time yet remain available for arbitrary queries. "You can easily ask the system to give you all images for a particular patient or all images of 23-year-old females with lung cancer," he explained.
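Sun has not published Honeycomb's interface, so the following is only a conceptual sketch of the model Davis describes: objects and their metadata live in one repository, and searches run directly against the stored metadata rather than against a separate, user-maintained database. The class and method names are invented for illustration.

```python
class ObjectStore:
    """Toy repository where each object carries its own metadata."""

    def __init__(self):
        self._objects = {}  # object_id -> (data, metadata dict)

    def put(self, object_id, data, **metadata):
        # Data and metadata are stored together in one operation.
        self._objects[object_id] = (data, metadata)

    def query(self, **criteria):
        # Return IDs of objects whose metadata matches every criterion.
        return [
            oid
            for oid, (_, meta) in self._objects.items()
            if all(meta.get(k) == v for k, v in criteria.items())
        ]

store = ObjectStore()
store.put("img-001", b"...", patient_age=23, sex="F", diagnosis="lung cancer")
store.put("img-002", b"...", patient_age=57, sex="M", diagnosis="fracture")

# Davis's example query, expressed against the combined repository:
matches = store.query(patient_age=23, sex="F", diagnosis="lung cancer")
print(matches)  # ['img-001']
```

The contrast with the do-it-yourself approach is that the query capability here is a property of the store itself; the application never builds or synchronizes an external index.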
Although Honeycomb shares some similarities with EMC Corp.'s Centera and Hewlett-Packard Co.'s StorageWorks RISS (Reference Information Storage System), there are differences, said John McArthur, an analyst at IDC, of Framingham, Mass. Centera, for example, can't search through millions or billions of data sets within seconds, especially at high inquiry volume—something Sun is attempting to do with Honeycomb. And while RISS offers high-speed search capability for large archives, HP gained that capability through an acquisition, whereas Sun has developed Honeycomb's within its own labs, he noted.
EMC, HP and other competitors are likely to sit up and take notice of Honeycomb, McArthur said.
"I would suspect [these capabilities] are in the road map already and it's a time-to-market issue for EMC, but it might prompt them to change the current Centera product from an appliance to a set of services," he said.
As far as the final form Honeycomb will take and a firm release date, Davis said Sun is still mulling it over.
"It's not clear whether the first priority for Sun will be to put this out as a discrete product or whether it makes more sense to bring out some of the components first, like the NAS stack," he said. "But we want to be generating some revenue this calendar year."