DataStax Launches Scalable Real-Time Enterprise Graph Database

DataStax Enterprise Graph, a scalable, real-time graph database, powers cloud apps that manage complex and highly connected data.

NoSQL database provider DataStax has announced its entry into the graph database space with DataStax Enterprise (DSE) Graph.

Martin Van Ryswyk, the company’s executive vice president of engineering, described DataStax Enterprise Graph as a scale-out graph database built for cloud applications that need to manage highly connected data.

The new offering is built on the foundation of Apache Cassandra and Apache TinkerPop, the open source graph computing framework. It also features the Gremlin graph query language and DataStax security, search capabilities and Apache Spark integration. It will be available in the second quarter of this year.

A graph database is used for storing, managing and querying complex and highly connected data, Van Ryswyk said. The graph database architecture is particularly well suited for exploring data to find commonalities and anomalies in large data volumes and for unlocking the value contained in the data’s relationships. Beyond those general strengths, DSE Graph delivers continuous uptime, performance and scalability for modern systems dealing with complex and constantly changing data.

DataStax Enterprise Graph was inspired by the open-source Titan graph database. DataStax acquired Aurelius, the team behind Titan, last year and the team has built a new set of software that extends beyond the basic capabilities of Titan while still maintaining backwards compatibility.

“What we’ve done is re-written Titan based on all the things we heard where people wanted improvements and fixed all those things,” Van Ryswyk told eWEEK. “We think we’ve covered all of the deficits in Titan with DSE Graph, so we’re going to compete very well against that.”

Moreover, by maintaining backwards compatibility, DataStax is enabling Titan users and other users of TinkerPop-supported graph databases to migrate to DSE Graph with little or no effort. DSE Graph inherits Cassandra’s key benefits, including predictable low-latency response times and operational maturity. DSE Graph also incorporates enterprise-class extensions found in DataStax Enterprise, including advanced security, built-in analytics, enterprise search, visual management and monitoring, and development tooling, Van Ryswyk said.

“We said we need to integrate what we’re doing with graph deep into our current platform and provide a different model,” he noted.

Because DSE Graph sticks to its Cassandra-based knitting, Van Ryswyk claims, it is the first graph database that is fast enough to power customer-facing applications, capable of scaling to massive datasets, and integrated with advanced tools that power deep analytical queries. That Cassandra core also enables DSE Graph to scale to billions of objects, spanning hundreds of machines across multiple datacenters with no single point of failure, he said.

That gives the new DataStax graph database offering a leg up not only on relational database solutions, but also on other graph databases, he said.

“The biggest player out there is Neo4j,” Van Ryswyk said. “It’s a good system, but their architecture is a scale-up architecture. It takes advantage of lots of memory. If there’s any scale out, it’s master slave-ish. It’s not going to do 1,000 nodes like we test every release of DataStax Enterprise. There are folks out there running Cassandra on 100,000 nodes.”

Graph databases are best applied to use cases where there are lots of connected devices or assets with multiple relationships between and among them.

“One of the things we need to do is educate the market to what graph technology can do for them,” said Matthias Broecheler, director of engineering for DSE Graph and lead developer of Titan. “Most people start out with a relational database and try to scale it, but then they fail with that approach because it’s just too cumbersome or too restrictive or too slow or too difficult to implement. That’s when they look around for something else that can address their problem more productively.”

DataStax officials said ideal use cases for graph databases include: master data management, recommendation and personalization, security and fraud detection, and the Internet of Things (IoT) and networking.

Large industrial companies that have to manage large numbers of assets find graph technology to be vastly superior to relational technology for managing all the relationships, Broecheler said.

“And what is true for physical asset management also is true for digital asset management, such as dealing with a vast number of digital documents or movies or other types of media you’re trying to store, there is a lot of metadata around that media and there are a lot of relationships you need to manage,” he said. “So digital and physical asset management together with IoT is one major use case.”

DataStax Enterprise Graph consists of DataStax Enterprise Server, DataStax OpsCenter, the DataStax Studio developer environment and DataStax Drivers for popular development languages as well as the Gremlin graph language.

Van Ryswyk said DataStax customers have been asking the company to meet their multi-data model needs by providing support for key-value, tabular, JSON/document, and graph data models. DSE Graph addresses the graph data model.

“Having graph as part of the DSE platforms enables us to now not only serve the lower and middle parts of today’s data model continuum where data complexity and relationships are concerned, but also the highest end so we can support the parts of a cloud application that need to manage complex and highly connected data,” said Robin Schumacher, vice president of products at DataStax, in a blog post on the new offering.