VMware has introduced Spring Hadoop, a new project that will make it easier for enterprise Java developers to use the familiar Spring Framework to build solutions around the Apache Hadoop platform.
Spring Hadoop is the latest addition to the Spring Data family of projects and integrates the Spring Framework and the Apache Hadoop platform. Spring Hadoop provides support for writing Apache Hadoop applications that benefit from the features of Spring, Spring Batch and Spring Integration. VMware introduced the Spring Hadoop project on Feb. 29 at the O'Reilly Strata Conference in Santa Clara, Calif.
"VMware is committed to helping developers build, deploy, manage and scale the new wave of data-driven applications," said Adrian Colyer, CTO for Cloud and Application Services at VMware, in a statement. "By building upon Spring's strong and versatile foundation of simplifying data access, and leveraging the depth of the Hadoop platform, VMware is delivering a streamlined programming model that makes Spring the natural way to integrate Hadoop systems into the enterprise application landscape."
Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license. It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google's MapReduce and Google File System (GFS) papers.
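For readers new to the MapReduce model that inspired Hadoop, the idea can be sketched in plain Java as a word count. This is a minimal single-JVM illustration only: it uses no Hadoop or Spring Hadoop APIs, and the class and method names (`WordCountSketch`, `map`, `wordCount`) are hypothetical and chosen for this example. A real Hadoop job would distribute the same map and reduce steps across many nodes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical single-process sketch of the MapReduce idea (not Hadoop code).
public class WordCountSketch {

    // "Map" step: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // "Shuffle" and "reduce" steps: group the pairs by word, then sum each group.
    static Map<String, Integer> wordCount(List<String> lines) {
        return lines.stream()
                .flatMap(line -> map(line).stream())
                .collect(Collectors.groupingBy(
                        Map.Entry::getKey,
                        TreeMap::new, // sorted keys, so the printed output is stable
                        Collectors.summingInt(Map.Entry::getValue)));
    }

    public static void main(String[] args) {
        List<String> lines = List.of("spring hadoop", "hadoop pig hive", "spring batch");
        // Prints {batch=1, hadoop=2, hive=1, pig=1, spring=2}
        System.out.println(wordCount(lines));
    }
}
```

In Hadoop itself, the map and reduce steps run as separate tasks on different machines, and the framework handles the grouping between them.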
Enterprises interested in Hadoop have noted the need for tools to make dealing with Hadoop easier for developers as well as business users. The VMware move with Spring Hadoop is aimed at helping with the former, particularly for enterprise Java developers.
Spring Hadoop brings the benefits of Spring (simplicity and ease of use) to Hadoop by providing a comprehensive, lightweight framework that will allow developers to easily build solutions around the Hadoop platform, VMware officials said. As data volumes and data access choices in enterprise applications have grown exponentially, Spring continues to focus on enabling enterprise Java developers to incorporate new data access patterns into their applications through the Spring Data projects.
In a Feb. 29 blog post, Costin Leau, a staff engineer in the SpringSource unit of VMware, said:
"Part of the Spring Data umbrella, Spring Hadoop provides support for developing applications based on Hadoop technologies by leveraging the capabilities of the Spring ecosystem. Whether one is writing stand-alone, vanilla MapReduce applications, interacting with data from multiple data stores across the enterprise, or coordinating a complex workflow of HDFS, Pig, or Hive jobs, or anything in between, Spring Hadoop stays true to the Spring philosophy offering a simplified programming model and addresses 'accidental complexity' caused by the infrastructure. Spring Hadoop provides a powerful tool in the developer arsenal for dealing with big data volumes."
Spring Hadoop is free to download and available now under the open-source Apache 2.0 license. As indicated by Leau, key aspects of Spring Hadoop include:
- Support for configuration, creation and execution of MapReduce, Streaming, Hive, Pig and Cascading jobs via the Spring container
- Comprehensive HDFS data access support through JVM scripting languages (Groovy, JRuby, Jython, Rhino, etc.)
- Declarative configuration support for HBase
- Dedicated Spring Batch support for developing powerful workflow solutions incorporating HDFS operations and all types of Hadoop jobs
- Support for use with Spring Integration, which provides easy access to a wide range of existing systems through an extensible, event-driven pipes-and-filters architecture
- Powerful Hadoop configuration options and templating mechanism for client connections to Hadoop
- Declarative and programmatic support for Hadoop Tools, including FsShell and DistCp