Typesafe Launches Support for Apache Spark

Typesafe, the company behind Scala, announced support for Apache Spark, including architecture and design review, on-site training and 24/7 production SLAs.

Typesafe, the company founded by the creator of the Scala language and the company behind the Play Framework and the Akka toolkit, announced the launch of full lifecycle support for Apache Spark, the popular big data processing engine.

Spark is quickly gathering broad industry support from enterprises and the world's leading IT vendors, including IBM, as the preferred in-memory approach to handling large-scale data processing workloads. Typesafe announced its commercial support for Spark at the Spark Summit Europe in Amsterdam this week.

In a recent survey, Databricks, co-founded by Matei Zaharia, creator of Apache Spark, found that 71 percent of Spark adopters also use Scala, making it Spark's most popular programming language pairing. Scala is a Java-influenced programming language that combines object-oriented and functional programming. Its name is a blend of "scalable" and "language."

Spark supports multiple programming languages, but developers most often choose Scala because of its scalability on the Java Virtual Machine (JVM) and because Spark itself is written in Scala. Spark developers using Scala cite the ability to dig deeper into Spark's source code, and more immediate access to Spark's newest features, as other major advantages of the language choice.

As a commercial support option, Typesafe offers expertise to Spark developers creating big data projects in Scala. Typesafe was founded by Martin Odersky, the creator of the Scala programming language, and Jonas Bonér, the creator of Akka middleware. Both Scala and Akka are popular in the world of "Fast" Data, which is what some are calling the next wave of computation engines that rely on the speed of data processing and the ability to process event streams in real time. Typesafe's support team also offers deep experience in using Spark and Scala with a range of complementary technologies, such as Apache Mesos, Apache Kafka, Apache Cassandra, Hadoop and more, the company said.

"Being successful in any Spark project, whether it's architecture, code reviews, best practices or production support, calls for the best expertise in the world," Jamie Allen, senior director of global services at Typesafe, said in a statement. "Typesafe can help you build a truly Reactive big data solution so you have confidence you are designing, building and deploying things the right way and taking advantage of the full capabilities of Spark and related technologies."

Reactive programming is a programming paradigm oriented around data flows and the propagation of change. This means that it should be possible to express static or dynamic data flows with ease in the programming languages used, and that the underlying execution model will automatically propagate changes through the data flow.
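
As a rough illustration of change propagation, the Scala sketch below defines a hypothetical Cell class whose derived values update automatically when the source value is set. The names Cell, set and map here are invented for this example and are not part of Akka, Spark or any Typesafe product; real reactive libraries add scheduling, error handling and backpressure on top of this basic idea.

```scala
// Hypothetical, single-threaded sketch of change propagation.
// Cell is an illustrative name, not an API from Akka or Spark.
final class Cell[A](initial: A) {
  private var current: A = initial
  private var listeners: List[A => Unit] = Nil

  // Read the most recent value.
  def value: A = current

  // Store a new value and push it to every dependent cell.
  def set(next: A): Unit = {
    current = next
    listeners.foreach(listener => listener(next))
  }

  // Derive a new cell that is recomputed whenever this cell changes.
  def map[B](f: A => B): Cell[B] = {
    val derived = new Cell[B](f(current))
    listeners = ((a: A) => derived.set(f(a))) :: listeners
    derived
  }
}

object ReactiveSketch {
  def main(args: Array[String]): Unit = {
    val celsius = new Cell(20.0)
    val fahrenheit = celsius.map(c => c * 9 / 5 + 32)
    println(fahrenheit.value) // 68.0
    celsius.set(100.0)        // the change flows to the derived cell
    println(fahrenheit.value) // 212.0
  }
}
```

The caller never recomputes the Fahrenheit value by hand; setting the source cell is enough, which is the "propagation of change" the definition describes.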

One of the few obstacles to Spark enterprise adoption has been a developer skills gap, Databricks officials said. Developers new to Spark can find it difficult to configure and scale Spark across clusters of machines, whether in the framework's "client" or "cluster" mode. And where Java has been the language of choice for the Hadoop community to date, the de facto language for Spark is Scala, which is built around functional programming concepts like immutable data and function composition.
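
For readers coming from Java, the two functional concepts named above can be sketched in a few lines of plain Scala. The Event class, the function names and the figures below are made up for illustration; the sketch uses only the Scala standard library and does not depend on Spark.

```scala
object FunctionalSketch {
  // Immutable data: a case class whose fields cannot be reassigned.
  // Amounts are whole cents to keep the arithmetic exact.
  final case class Event(user: String, amountCents: Long)

  // Two small functions (illustrative business rules) ...
  val doubleIt: Long => Long = _ * 2
  val addFee: Long => Long = _ + 100

  // ... composed into one: doubleIt runs first, then addFee.
  val price: Long => Long = doubleIt andThen addFee

  // Returns a new list of new events; the input list is left untouched.
  def reprice(events: List[Event]): List[Event] =
    events.map(e => e.copy(amountCents = price(e.amountCents)))

  def main(args: Array[String]): Unit = {
    val events = List(Event("ann", 1000L), Event("bob", 2500L))
    val priced = reprice(events)
    println(events) // original values are unchanged
    println(priced) // amounts are now 2100 and 5100
  }
}
```

Spark's Scala API follows the same style: transformations such as map take functions and return new datasets rather than modifying existing ones.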

The launch of commercial support for Spark is the latest effort by Typesafe to support enterprises building Fast Data pipelines. In September, Typesafe contributed critical backpressure support to Spark 1.5 that throttles data rates to enhance operational stability for long-lived production streaming workloads. In June, Typesafe and Mesosphere launched a new distribution of Apache Spark optimized for deployment on the Mesosphere Datacenter Operating System (DCOS), to simplify deploying Spark to any modern version of Linux or any major cloud provider. Earlier this summer, Typesafe also announced new releases of Akka Streams and Slick to bring easier composability, failure handling and database access to Scala developers working with streaming data.

The need to process big data faster has fueled intense developer interest in Spark as an alternative to MapReduce. Spark's developers say it can speed up big data processing by a factor of 10 to 100 compared with MapReduce, while also simplifying app development. Spark became a top-level Apache project only in 2014, but has achieved rapid adoption. In a survey earlier this year, Typesafe polled more than 2,100 developers globally and found that 13 percent were already using Spark in production, another 20 percent were planning production usage in 2015 and 31 percent were actively evaluating it.

"Commercial support from Typesafe is a great safety net for companies like ours that are aggressively adopting Fast Data technologies for our real-time data processing and machine learning efforts," said Patrick DiLoreto, head of research and development at William Hill, one of the United Kingdom's largest online gaming companies, and a Typesafe support customer.

In addition to commercial support offerings that span pilot to production-level Spark development scenarios, Typesafe offers an introductory workshop for Spark developers. The workshop focuses on how to use the Spark Scala APIs, understanding Spark internals and how they affect performance, test and deployment scenarios, and other practical considerations.