Just when you think there's no more innovation on the horizon in the data storage sector, somebody comes up with a new take on the standard idea of storing, protecting and restoring digital data.
Datos IO, which eschews conventional snapshots in favor of a rapid-fire, almost "live-streaming" report on the state of a Cassandra database, emerged from stealth on Sept. 15 and announced that it has raised $15.25 million in venture funding. The round was led by Lightspeed Venture Partners and True Ventures, along with a number of angel investors.
New large-volume, high-ingestion, real-time data applications that are much faster and move much more data than legacy apps are being deployed on scale-out databases, such as Cassandra, MongoDB, HBase, Amazon DynamoDB, Google BigTable and Spark. These databases enable rapid development of next-generation applications, but they also lack enterprise-scale recovery capabilities that allow corrupted data to be removed, replayed and propagated with minimal downtime to customer-facing applications.
This is where Datos IO comes in. The company's product is centered on a distributed versioning platform for scale-out databases that enables application architects and DevOps staff to extract value from their data and metadata via advanced data management services, using their standard applications and others, CEO Tarun Thakur told eWEEK.
"In scale-out IT systems, every node has a different view of the data," Thakur said. "Cassandra doesn't run on EMC storage, or on shared storage; it runs on locally attached storage. If you remember 20 years ago, we had DAS, direct-attached storage. We have now come full circle with big data, where compute and storage are coming together in hyper-convergence.
"Why do I need stand-alone shared storage on the site, when my database doesn't really need it?"
In the new data-centric world, data recovery is also evolving. CIOs and chief security officers—even chief marketing officers—are instituting new business requirements around the value of data, so that it can be analyzed quickly and efficiently to help make business decisions. Application and database architects are coming up with new requirements for cluster-wide consistency and near-zero recovery windows to satisfy these business requirements. This new converged system model includes new buyers of data recovery—such as DevOps folks—and the agile deployment models of private and public clouds, Thakur said.
Datos IO is just starting out and won't have a product ready for prime time until sometime in 2016. But the market appears ripe for improved data storage for scale-out systems, which are only going to ingest more and more data, thanks to the advent of the Internet of Things.
According to a report by MarketsandMarkets, the cloud database and database-as-a-service market is projected to reach $14.05 billion by fiscal year 2019, with a compound annual growth rate of 67.30 percent during the forecast period of 2014 to 2019.
Datos IO, located in San Jose, Calif., was founded by Thakur, a veteran of Data Domain (now part of EMC), Veritas and IBM Research, and Dr. Prasenjit Sarkar, CTO, who spent more than 15 years at IBM Research.
The company's advisors include B.J. Jenkins, CEO of Barracuda Networks; Matt Pfeil, chief customer officer at DataStax; Debashis Saha, vice president of eBay Cloud Services; and Dr. Remzi Arpaci-Dusseau, professor of computer science, University of Wisconsin-Madison. The team also includes five members from IBM Almaden Research Center, five Ph.D.s and senior technical architects from Google, Netflix, Data Domain (EMC), CommVault, NetApp and Oracle.