Google Announces Open-Source Availability of Cloud Dataflow SDK

Dec 19, 2014

Google is making it easier for software developers to write and integrate applications with its Cloud Dataflow managed service for processing large data sets.

The company on Dec. 18 released a Java software development kit for Cloud Dataflow into the open-source community as part of what it described as an effort to spur application development around the technology.

Releasing the SDK as open source is also meant to help developers port Cloud Dataflow to other languages and other service execution environments, Google software engineer Sam McVeety said in a blog post.

“Reusable programming patterns are a key enabler of developer efficiency,” McVeety wrote. “The Cloud Dataflow SDK introduces a unified model for batch and stream data processing” that developers can take advantage of in innovative new ways, he said.

“We look forward to collaboratively building a system that enables distributed data processing for users from all backgrounds,” McVeety said.

Google announced Cloud Dataflow at the Google I/O conference in June as a managed service to help enterprises ingest and analyze massive data sets both in real time and in batch mode.

The company has described Cloud Dataflow as technology that builds on MapReduce and more recent technologies like Flume and MillWheel, all of which Google has used internally to analyze massive data stores.

By combining elements of all these technologies, Google hopes to deliver a data processing service that will give companies the flexibility to do batch analysis on large data sets as well as near real-time analysis on data as it streams into the database. It will also let companies ingest and stage data for consumption by other analytics tools and services such as Google’s own BigQuery.
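As a rough illustration of what a single model for batch and streaming work can look like, the sketch below defines one word-counting step in plain Python and applies it to a stored data set and to a simulated live feed. It does not use the Cloud Dataflow SDK or reflect its API; the function names and sample records are hypothetical and chosen only for this example.

```python
from collections import Counter
from typing import Iterable, Iterator


def count_words(lines: Iterable[str]) -> Iterator[Counter]:
    """Yield a running word count after each input line.

    The same function accepts a finite list (batch) or a generator
    that produces lines as they arrive (stream).
    """
    totals: Counter = Counter()
    for line in lines:
        totals.update(line.lower().split())
        yield Counter(totals)  # snapshot, so earlier results stay unchanged


def batch_result(lines: Iterable[str]) -> Counter:
    """Batch mode: consume all input and keep only the final count."""
    result: Counter = Counter()
    for result in count_words(lines):
        pass
    return result


def simulated_stream() -> Iterator[str]:
    """Hypothetical stand-in for a live source such as a log feed."""
    yield "error disk full"
    yield "warning disk slow"
    yield "error network down"


# Batch: the whole data set is available up front.
stored = ["error disk full", "warning disk slow", "error network down"]
final_counts = batch_result(stored)

# Streaming: results are refreshed as each record arrives.
running_counts = list(count_words(simulated_stream()))
```

The processing logic is written once in `count_words`; only the way the input is supplied and the results are consumed differs between the two modes.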

Such capabilities are considered crucial for companies looking to extract business value from big data. The proliferation of cloud services, mobile devices and sensor technologies has allowed businesses to gather increasingly large volumes of data from myriad sources. The challenge has been to organize and manage that data in a way that yields business value.

Amazon, one of the biggest cloud service providers, already offers a managed service called Kinesis that is similar to the one Google plans to launch with Cloud Dataflow. Amazon bills Kinesis as a service for real-time processing of streaming data at massive scale. It is designed to help companies capture, store and analyze terabytes' worth of data pulled in from online transactions, Web logs, social media feeds and mobile devices.

With Cloud Dataflow, Google hopes to offer developers and businesses similar capabilities. “The value of data lies in analysis—and the intelligence one generates from it,” McVeety noted in his blog post.

“Turning data into intelligence can be very challenging as data sets become large and distributed across disparate storage systems. Add to that the increasing demand for real-time analytics, and the barriers to extracting value from data sets become a huge challenge for developers,” he said.
