DataStax, which provides a big data platform built on Apache Cassandra, has announced DataStax Community Edition 1.2.
DataStax Community Edition 1.2 is based on the recently released Apache Cassandra 1.2, the latest version of the massively scalable open-source NoSQL database. The new release comes with a free edition of DataStax OpsCenter, a visual management and monitoring solution for Cassandra that lets developers manage and monitor their Cassandra clusters in point-and-click fashion from any location and from nearly any device.
The Apache Software Foundation (ASF) announced Apache Cassandra 1.2 on January 2. Apache Cassandra is used by some of the biggest and busiest organizations on the Web, such as eBay and Twitter. Users say Cassandra powers massive data sets quickly and reliably without compromising performance, whether running in the cloud or partially on-premises in a hybrid data store.
DataStax Community Edition 1.2 contains all of Apache Cassandra 1.2’s features, including virtual nodes, Cassandra Query Language 3 (CQL3) improvements, request tracing, atomic batches and configurable disk failure policies. Virtual nodes improve the granularity of capacity increases, improve repair and rebuild times in larger clusters and automatically keep the data in clusters balanced across all nodes. CQL3 improvements include the addition of collection types, queryable system information and a CQL-native protocol.
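To make the virtual node idea more concrete, here is a minimal Python sketch of a token ring in which each node owns several tokens instead of one. It is an illustration only, not Cassandra's implementation: the `Ring` class, the `build_ring` helper, the MD5-based `token` function and the node names are all hypothetical, and real clusters use a configurable partitioner and replication that this sketch ignores.

```python
import bisect
import hashlib
from collections import Counter


def token(value: str) -> int:
    """Map a string to a ring position (MD5 is an assumption, chosen for determinism)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy token ring: each node owns `tokens_per_node` positions on the ring."""

    def __init__(self, tokens_per_node: int):
        self.tokens_per_node = tokens_per_node
        self._tokens = []  # sorted ring positions
        self._owner = {}   # ring position -> node name

    def add_node(self, name: str) -> None:
        # One token per node mimics the pre-vnode layout; many tokens mimic vnodes.
        for i in range(self.tokens_per_node):
            t = token(f"{name}#{i}")
            bisect.insort(self._tokens, t)
            self._owner[t] = name

    def owner(self, key: str) -> str:
        # A key belongs to the first token at or after its position, wrapping around.
        idx = bisect.bisect_left(self._tokens, token(key))
        return self._owner[self._tokens[idx % len(self._tokens)]]


def build_ring(nodes, tokens_per_node: int) -> Ring:
    ring = Ring(tokens_per_node)
    for node in nodes:
        ring.add_node(node)
    return ring


if __name__ == "__main__":
    keys = [f"key-{i}" for i in range(10000)]
    for tokens_per_node in (1, 256):
        ring = build_ring(["node-a", "node-b", "node-c"], tokens_per_node)
        before = {k: ring.owner(k) for k in keys}
        ring.add_node("node-d")
        # Count which existing nodes handed keys over to the new node.
        donors = Counter(before[k] for k in keys if ring.owner(k) != before[k])
        print(tokens_per_node, "token(s) per node, keys given up by:", dict(donors))
```

With a single token per node, the new node can only take over one contiguous range from one neighbour. With many tokens per node it typically draws a small share from every existing node, which is the intuition behind finer-grained capacity increases and more even balance.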
Request tracing allows developers and administrators to trace CQL requests on an individual or collective basis and easily understand which statements in a cluster are causing performance problems. Atomic batches ensure that multiple statements sent to a cluster in a batch are always applied on an all-or-nothing basis. Configurable policies determine how a node responds to a disk failure.
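The all-or-nothing behavior of atomic batches can be pictured with a small Python sketch. It assumes a simplified "log the batch first, replay it after a failure" model; the `ToyStore` class, its methods and the `fail_after` parameter are hypothetical names invented for this illustration, and the code does not reproduce Cassandra's actual internals or any driver API.

```python
class ToyStore:
    """In-memory stand-in for a cluster, used only to illustrate batch replay."""

    def __init__(self):
        self.rows = {}       # applied data: key -> value
        self.batchlog = {}   # batches recorded but not yet confirmed complete
        self._next_id = 0

    def apply_batch(self, mutations, fail_after=None):
        """Record the batch, apply each mutation, then clear the log entry.

        `fail_after` simulates a crash after that many mutations were applied.
        """
        batch_id = self._next_id
        self._next_id += 1
        self.batchlog[batch_id] = list(mutations)  # step 1: log the whole batch
        for n, (key, value) in enumerate(mutations):
            if fail_after is not None and n >= fail_after:
                raise RuntimeError("simulated failure mid-batch")
            self.rows[key] = value                  # step 2: apply mutations
        del self.batchlog[batch_id]                 # step 3: batch is complete

    def replay(self):
        """Re-apply every logged batch; writes are idempotent, so this is safe."""
        for batch_id in list(self.batchlog):
            for key, value in self.batchlog[batch_id]:
                self.rows[key] = value
            del self.batchlog[batch_id]


if __name__ == "__main__":
    store = ToyStore()
    try:
        store.apply_batch([("user:1", "alice"), ("email:1", "a@example.com")], fail_after=1)
    except RuntimeError:
        pass
    store.replay()  # the interrupted batch is completed from the log
    print(store.rows)
```

In this model a batch that was logged is eventually applied in full, even if the first attempt was interrupted. The sketch does not provide isolation: a reader could still observe the partially applied state before the replay runs.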
DataStax Community Edition 1.2 also features many performance improvements to memory usage, column indexes, compaction, streaming, startup time and more.
“Cassandra 1.2 represents one of the most significant releases of the database yet,” said Jonathan Ellis, CTO of DataStax and project chair of Apache Cassandra, in a statement. “This new release makes it much easier for developers to insert and manipulate data, while CQL3 1.2 allows them to model naturally. We also worked to improve Cassandra deployments on denser hardware, with features like virtual nodes (vnodes), disk failure policies and compaction performance.”
Also in a statement, Christos Kalantzis, engineering manager of cloud persistence engineering at Netflix, said, “Virtual nodes are the most exciting feature for us. The fact that we can autoscale our Cassandra clusters up and down on EC2 has already made a huge impact on our productivity.”
In addition to vnodes, the second generation of Cassandra features atomic batches, inter-node communication and request tracing enhancements, which simplify the process of setting up new clusters and enable a higher level of cluster performance. CQL3, the third version of the Cassandra Query Language, enables simplified application modeling, more powerful mapping and a more natural representation of data that diminishes design limitations. This helps improve scalability and reliability.
“C* 1.2 is huge, but it’s hard to really put a finger on an overall theme because the improvements are so wide-ranging,” said Rick Branson, software engineer, Instagram. “The solidification of CQL3 and addition of atomic batches make huge leaps forward in terms of developer productivity. On the operations side, the addition of vnodes and off-heap internals—compression metadata & bloom filters—make managing a cluster much simpler and more hands-off.”
Independent Apache Cassandra committers also weighed in about the merits of the new version. Apache Cassandra committer Aaron Morton said, “There is something in Apache Cassandra 1.2 for everybody. Virtual Nodes and CQL 3 will make it easier for new users to set up a cluster and get productive. Existing users will see their clusters doing more, thanks to the performance improvements, while everyone will benefit from the insights that request tracing brings.”
Apache Cassandra is successfully used by an array of organizations that include Adobe, Appscale, Appssavvy, Backupify, Cisco, Clearspring, Cloudtalk, Constant Contact, DataStax, Digg, Digital River, Disney, eBay, Easou, Formspring, Hailo, Hobsons, IBM, Mahalo.com, Morningstar, Netflix, Openwave, OpenX, Palantir, PBS, Plaxo, Rackspace, Reddit, RockYou, Shazam, SimpleGeo, Spotify, Thomson-Reuters, Twitter, Urban Airship, the US Government, Walmart Labs, Williams-Sonoma, Inc. and Yakaz.