Databricks Secures Apache Spark, Launches Community Edition

At the Spark Summit, Databricks announced a new enterprise security framework for Apache Spark and made its data platform generally available.


Databricks, the company founded by the team that created Apache Spark, today announced the completion of the first phase of the Databricks Enterprise Security (DBES) framework.

In making the announcement at the Spark Summit 2016 in San Francisco, Databricks said this move makes it the first company to provide end-to-end enterprise security for Apache Spark.

DBES combines encryption, integrated identity management, role-based access control, data governance and compliance standards to secure Apache Spark workloads.

"ESG research shows the number one attribute sought in evaluating a big data/analytics solution is now security," Nik Rouda, senior analyst at Enterprise Strategy Group (ESG), said in a statement. "As Apache Spark grows rapidly in production environments, satisfying the stringent operational requirements of the enterprise becomes critical. Databricks is accelerating the maturity of their just-in-time data platform built on top of open-source Apache Spark in important ways."

DBES builds on Databricks' existing access management and encryption capabilities, Dave Wang, director of product marketing at Databricks, said in a blog post. "With the completion of DBES Phase One today, enterprises gain the ability to control access to Apache Spark clusters on an individual basis, manage user identity with a SAML 2.0 compatible identity management provider service, and end-to-end auditability," he said.

The new security framework provides strong encryption for data at rest and in flight, with support for standards such as Secure Sockets Layer (SSL) and keys stored in the AWS Key Management Service (KMS). It is also designed to provide integrated identity management and to integrate with enterprise identity providers via SAML 2.0 and Active Directory. In addition, DBES provides role-based access control and enables fine-grained management of access to every component of the enterprise data infrastructure, including files, clusters, code, application deployments, dashboards and reports.

Regarding data governance, DBES guarantees the ability to monitor and audit all actions taken in every aspect of the enterprise data infrastructure. It also helps with compliance requirements, achieving security compliance that exceeds the high standards of FedRAMP, HIPAA (the Health Insurance Portability and Accountability Act) and Sarbanes-Oxley as part of Databricks' ongoing DBES strategy, the company said.
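
For readers unfamiliar with how role-based access control and audit logging fit together, the following is a minimal Python sketch of the general idea. It is not Databricks' implementation or API: the role names, the permission table and the AccessController class are all hypothetical, chosen only to illustrate that each access decision is looked up by role and recorded whether it is granted or denied.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical role-to-permission table. The roles and permissions are
# illustrative assumptions, not Databricks' actual roles or API.
ROLE_PERMISSIONS = {
    "admin": {("cluster", "attach"), ("cluster", "restart"), ("dashboard", "view")},
    "analyst": {("cluster", "attach"), ("dashboard", "view")},
    "viewer": {("dashboard", "view")},
}


@dataclass
class AccessController:
    """Toy access controller: checks permissions by role and logs every decision."""

    user_roles: dict  # maps user name -> role name
    audit_log: list = field(default_factory=list)

    def is_allowed(self, user, resource_type, action):
        role = self.user_roles.get(user)
        # Unknown users or roles fall back to an empty permission set.
        allowed = (resource_type, action) in ROLE_PERMISSIONS.get(role, set())
        # Record every decision, granted or denied, so it can be audited later.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "resource_type": resource_type,
            "action": action,
            "allowed": allowed,
        })
        return allowed


controller = AccessController(user_roles={"alice": "admin", "bob": "viewer"})
print(controller.is_allowed("alice", "cluster", "restart"))  # True
print(controller.is_allowed("bob", "cluster", "attach"))     # False
print(len(controller.audit_log))                             # 2
```

In this sketch a denied request is logged just like a granted one, which is what makes it possible to audit usage patterns after the fact.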

"End-to-end security requirements are top-of-mind for today's enterprises that are building advanced analytics solutions," Ali Ghodsi, CEO at Databricks, said in a statement. "Yet building a truly secure, multi-tenant, and cloud-based enterprise data platform proves to be an impossible undertaking for most. We're delighted to be the first vendor to solve this problem comprehensively for Apache Spark, allowing enterprises to maximize the value from their data without compromising compliance and security."

DBES also features cluster access control lists, single sign-on support and audit logs to monitor usage patterns.

"Databricks' vision is to empower anyone to easily build and deploy advanced analytics solutions," Wang said. "With the Databricks Enterprise Security Framework, Databricks can satisfy the diverse (and sometimes competing) needs to secure big data in the modern enterprise, end-to-end. Phase One is only the beginning."