Google Pitches Its Dataflow Technology to Apache Software Foundation
Google wants the Apache Software Foundation to take on its Dataflow data processing technology as an incubation project.
Google has proposed that its Dataflow technology for writing programs for large-scale data processing jobs be considered for inclusion as an Apache Software Foundation Incubator project. The goal is to foster more collaborative development and governance around the technology so that it can be used to build data pipelines that are portable across multiple execution engines, both on-premises and in the cloud.
As part of the proposal, Google wants its Dataflow programming model, Dataflow Software Development Kit and associated "runners" to be bundled under a single ASF incubating project. Supporting Google in its proposal is a slew of other technology companies, including PayPal, Cloudera, Talend and Data Artisans.
For any code to be considered for inclusion in the Apache Software Foundation, it first has to go through a mandatory incubation period, during which several issues, including those pertaining to copyright licenses and future direction, are decided.
"We believe this proposal is a step towards the ability to define one data pipeline for multiple processing needs, without tradeoffs, which can be run in a number of runtimes, on-premise, in the cloud, or locally," Google Software Engineer Frances Perry and Product Manager James Malone wrote on Jan. 20.
Cloud Dataflow, Google's managed data processing service based on the technology, will continue as usual and will not be affected by the proposal to move the SDK, programming model and other components to the ASF, Perry and Malone said.