ClusterHQ Brings Docker Virtualization to Data Storage

ClusterHQ's Flocker leverages the ZFS file system to tackle the container storage challenge.

There has been a lot of momentum surrounding the growth of the open-source Docker container virtualization technology, but few vendors have directly addressed the issue of data storage. ClusterHQ is now building a technology called Flocker to solve the Docker data challenge.

Flocker is the result of more than five years of research, development and operational experience combining resilient, distributed, commodity storage with containers, Luke Marsden, CEO and co-founder of ClusterHQ, said.

"Two years ago, we launched a product called HybridCluster, an advanced container management platform based on FreeBSD Jails," Marsden told eWEEK. "Production customer deployments of HybridCluster have proven the resilience and scalability of our technology, and this experience has enabled our company, which we’ve rebranded as ClusterHQ, to move extremely quickly to solve the data problem for Docker and Linux."

Docker emerged just over a year ago as a Linux-based solution for containers, expanding on underlying technologies like LXC (Linux Containers). The open-source FreeBSD operating system has long had its own container construct known as Jails.

Flocker is a lightweight volume and container manager that developers can use to define their application as a set of connected Docker containers, Marsden said. Those applications can then be deployed and easily migrated with their data between hosts, even when in different data centers.
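To make the idea of migrating containers between hosts concrete, here is a minimal sketch of how a tool might compare a current placement with a desired one and work out which containers have to move. It is an illustration only, not Flocker's code: the function name `plan_moves`, the container names and the host names are all hypothetical.

```python
def plan_moves(current, desired):
    """Return (container, from_host, to_host) for each container whose host changes.

    Both arguments map a container name to the host it runs on.
    Hypothetical helper for illustration; not part of Flocker.
    """
    moves = []
    for name in sorted(desired):
        old_host = current.get(name)
        new_host = desired[name]
        # Only containers that already run somewhere else need a migration.
        if old_host is not None and old_host != new_host:
            moves.append((name, old_host, new_host))
    return moves


# Assumed example: a web container and its database, both on node1 today.
current = {"web": "node1", "db": "node1"}
desired = {"web": "node1", "db": "node2"}
print(plan_moves(current, desired))
```

In this sketch only the database container is scheduled to move; a real system would also have to move that container's data volume along with it, which is the part Flocker is aimed at.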

While Flocker is a data technology, Marsden emphasized that Flocker is categorically not a storage-area network (SAN) or network-attached storage (NAS) technology.

"There has been very little prior to Flocker that addresses the problem of data for containers," Marsden said. "In fact, lack of support for data-backed services is one of the major things that is holding up container adoption."

Flocker uses distributed, local storage to provide storage for applications while delivering storage portability, Marsden said. A key part of Flocker is its use of Zettabyte File System (ZFS) replication technology. ZFS is a file system originally developed by Sun Microsystems for use in the Solaris Unix operating system. OpenZFS, which is what Flocker leverages, is the open-source successor of the ZFS project.
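For readers unfamiliar with ZFS replication, the general technique is snapshot-based: take a snapshot of a file system, then stream the difference since the previous snapshot to another machine. The sketch below only builds the standard `zfs snapshot`, `zfs send -i` and `zfs receive` command lines without running them. It does not show how Flocker itself drives ZFS, and the dataset, snapshot and host names are assumptions made up for the example.

```python
import shlex


def replication_commands(dataset, prev_snapshot, new_snapshot, target_host):
    """Build the commands for one snapshot-based ZFS replication step.

    Illustrative only: names are hypothetical and nothing is executed.
    """
    new_ref = f"{dataset}@{new_snapshot}"
    snapshot_cmd = ["zfs", "snapshot", new_ref]

    send_cmd = ["zfs", "send"]
    if prev_snapshot is not None:
        # -i sends only the changes made since the previous snapshot.
        send_cmd += ["-i", f"{dataset}@{prev_snapshot}"]
    send_cmd.append(new_ref)

    # The stream is received into the same dataset name on the remote host.
    receive_cmd = ["ssh", target_host, "zfs", "receive", dataset]
    return snapshot_cmd, send_cmd, receive_cmd


def as_pipeline(send_cmd, receive_cmd):
    """Render the send and receive halves as one shell pipeline string."""
    return f"{shlex.join(send_cmd)} | {shlex.join(receive_cmd)}"


snap, send, recv = replication_commands("tank/appdata", "s1", "s2", "node2")
print(shlex.join(snap))
print(as_pipeline(send, recv))
```

Because only the changes between two snapshots cross the network, repeated replication of a large volume can stay cheap, which is what makes this approach attractive for moving container data between hosts.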

"Flocker uses ZFS on Linux, which is part of the OpenZFS project, which we are actively involved in and contribute to," Marsden said. "Previously with HybridCluster, we have been building on top of ZFS on FreeBSD, which is natively supported and integrated there."

Flocker uses a combination of OpenZFS and proprietary technology that ClusterHQ has been developing over the last five years to manage ZFS file systems in a distributed cluster.

"Flocker 0.1 only just scrapes the surface of what's possible with our distributed file system technology," Marsden said.

The portability of data volumes is just the first step for Flocker. Beyond that, Marsden expects that it will be possible to do continuous replication, live migration of containers, and VMware DRS (Distributed Resource Scheduler)-like features, such as balancing the load in a cluster by automatically moving containers around.

As the Docker ecosystem matures, multiple efforts have emerged to build orchestration capabilities, including Google Kubernetes, Apache Mesos and CoreOS Fleet. Marsden said that Flocker can be used to complement orchestration tools.

"We created Flocker because none of the orchestration systems provide operational support [e.g., data migrations, failover] for stateful containers, and yet data is at the heart of every application," Marsden said. "We hope to be able to extend these orchestration services to provide support for databases, queues, key-value stores and other data-backed services."

Flocker is now just at its 0.1 release, and there is a lot more development work to get done. Marsden said that he's very interested in getting feedback from the community on what features they’d like to see in a general availability release to make sure that all of those features are solid and production-ready at scale.

In terms of making a business out of Flocker, that's still on the future road map.

"While we are not announcing commercial offerings today, you can expect ClusterHQ to adopt a business model similar to other successful open-source companies," Marsden said.

Sean Michael Kerner is a senior editor at eWEEK. Follow him on Twitter @TechJournalist.

Sean Michael Kerner

Sean Michael Kerner is an Internet consultant, strategist, and contributor to several leading IT business web sites.