
    Google Updates Cloud Platform, Delivers Dataflow Beta

    Written by Darryl K. Taft
    Published April 16, 2015

      Google announced it has updated its cloud platform to make handling big data in the cloud much easier for the everyday programmer or data analyst.

      The announcement includes a series of new services and improvements to existing ones, including a beta release of Google Cloud Dataflow and key enhancements to Google BigQuery.

      Over the last 10 years, Google has created and relied on a lot of the big data innovations in use today.

      “We pride ourselves on being among the key innovators in big data,” Tom Kershaw, director of product management for Google Cloud Platform, told eWEEK. “We created things like MapReduce, Flume and a bunch of technologies to deal with the volumes of data that we see in the Internet world that we just never saw before.”

      The amount of data that organizations are dealing with is exploding through point-of-sale devices, mobile devices, the Internet of Things (IoT), log files and more. To survive in that world, organizations have to harness that data quickly and transform it into intelligence. Many tools allow that, but the problem is that they are just too complicated, Kershaw said. “They are very difficult to use. Stringing together map reductions can be very hard,” he said. “And for the average startup or the average Java developer or the average data analyst in a large company, these tools have remained out of reach.”

      Enter Google with new solutions to simplify things. Google Cloud Dataflow, now in beta, is a tool that lets you create big data applications using simple programming languages and simple SDKs, Kershaw said. In a blog post from last year’s Google I/O event, Greg DeMichillie, another director of product management for the Google Cloud Platform, said, “Cloud Dataflow is a fully managed service for creating data pipelines that ingest, transform and analyze data in both batch and streaming modes. Cloud Dataflow is a successor to MapReduce, and is based on our internal technologies like Flume and MillWheel.”

      Cloud Dataflow provides unified programming primitives for both batch and stream-based data analysis. The SDK allows the Cloud Dataflow programming model to be widely used, so that developers can benefit from the productivity of writing simple and extensible data processing pipelines which can describe both stream and batch processing tasks.
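
The idea of one set of primitives for batch and streaming can be illustrated with a minimal sketch in plain Python. This is not the Cloud Dataflow SDK; the record shape, `sales_per_store` and `live_feed` are hypothetical names chosen for illustration. The point is only that processing logic written against "something iterable" does not need to know whether its input is finite or unbounded.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple

# Hypothetical record shape: (store_id, sale_amount).
Event = Tuple[str, float]


def sales_per_store(events: Iterable[Event]) -> Iterator[dict]:
    """Yield the running total per store after every event.

    The function only iterates over its input, so it runs unchanged on a
    finite list (batch) or on a generator that keeps producing (stream).
    """
    totals: Counter = Counter()
    for store_id, amount in events:
        totals[store_id] += amount
        yield dict(totals)


def live_feed() -> Iterator[Event]:
    """Stand-in for a streaming source; a real one would not end."""
    yield ("store-1", 5.0)
    yield ("store-2", 7.5)
    yield ("store-1", 2.5)


historical = [("store-1", 5.0), ("store-2", 7.5), ("store-1", 2.5)]

# Batch: consume everything and keep only the final totals.
batch_result = list(sales_per_store(historical))[-1]

# Streaming: a result is available after each event arrives.
stream_results = list(sales_per_store(live_feed()))
```

A managed service adds what this sketch leaves out, such as distribution across machines and handling of late or out-of-order data.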

      “Cloud Dataflow makes it easy for you to get actionable insights from your data while lowering operational costs without the hassles of deploying, maintaining or scaling infrastructure,” DeMichillie said. “You can use Cloud Dataflow for use cases like ETL [Extract, Transform, Load], batch data processing and streaming analytics, and it will automatically optimize, deploy and manage the code and resources required.”

      In a blog post, William Vambenepe, product manager for big data at Google, said that nothing stands between you and the satisfaction of seeing your processing logic applied in streaming or batch mode via a fully managed processing service.

      “Just write a program, submit it, and Cloud Dataflow will do the rest,” he said. “No clusters to manage – Cloud Dataflow will start the needed resources, auto-scale them (within the bounds you choose), and terminate them as soon as the work is done.”

      Kershaw said users should view Cloud Dataflow as a Python tool with which you can identify data from all kinds of sources, specify which data to use, prepare that data by anonymizing it or removing what you don’t care about, and then run high-scale analytics against that information.
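
The select, prepare and analyze steps Kershaw describes might look roughly like the following in plain Python. This is a hedged sketch, not Dataflow code: the event fields, `anonymize` and `is_purchase` are assumptions made up for the example, and hashing an identifier is strictly pseudonymization rather than full anonymization.

```python
import hashlib
from statistics import mean

# Hypothetical raw events; field names are illustrative only.
raw_events = [
    {"user": "alice@example.com", "action": "purchase", "amount": 30.0},
    {"user": "bob@example.com", "action": "page_view", "amount": 0.0},
    {"user": "alice@example.com", "action": "purchase", "amount": 10.0},
]


def anonymize(event: dict) -> dict:
    """Return a copy of the event with the user replaced by a one-way hash.

    Note: a truncated SHA-256 digest hides the raw value but is
    pseudonymization, not a strong anonymity guarantee.
    """
    digest = hashlib.sha256(event["user"].encode("utf-8")).hexdigest()[:12]
    return {**event, "user": digest}


def is_purchase(event: dict) -> bool:
    """Keep only the events the analysis cares about."""
    return event["action"] == "purchase"


# Specify which data, then prepare it.
prepared = [anonymize(e) for e in raw_events if is_purchase(e)]

# Run a (very small) analysis over the prepared data.
average_purchase = mean(e["amount"] for e in prepared)
```

The same user hashes to the same token, so per-user analysis still works after the identifier is removed.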

      “Think about it as a Java- and Python-based toolkit for writing complex database analytics applications really easily,” Kershaw said. “The other thing Dataflow will do that we think is really going to change the game is it allows you to use the same programming language and the same application for both streaming and batch information. Most big data has been batch analytics of historical data such as looking at point-of-sale data for the month of February for the last five years. What that’s missing is the real-time data that’s current now. Dataflow allows you to do streaming and batch on the same runtime and the same analysis. You can unify historical and real-time information in the same simple program.”


      Kershaw also noted that Google’s goal with Cloud Dataflow was to create an environment where any programmer or any analyst could take the power of big data and be able to transform their business quickly and easily.

      “The intent is if you can do basic Java programming you can now write big data applications,” he said. “A few years ago that was not possible. It was impossible for the average programmer to be able to deal with the complexity of stringing together map reductions.”

      Making big data easier along with the natural advantages of the cloud will drive a transformation in how people approach these problems. In that regard, big data and the cloud are natural bedfellows. Doing big data in the cloud helps organizations be more productive when building applications, with faster and better insights without having to worry about the underlying infrastructure.

      “It’s very difficult to do an on-prem model where you have to buy, set up and run machines to suit the growing needs of your big data environment,” Kershaw said. “So the on-demand compute model and big data just go together hand in hand. There’s the operations piece, there’s ability to scale and run different workloads and there’s the issue of security and collaboration and how you can take information and share it across the organization. We think the cloud collaboration model and the new tools we’re delivering to make big data easy are going to change the game in how organizations use big data. So it’s no longer just the realm of the data scientist; it’s really going to be accessible to any developer anywhere at any time.”

      Meanwhile, Google also updated BigQuery.

      BigQuery is a large-scale analytics engine that lets you run through massive volumes of data using a SQL front end. It is Google’s flagship product for integrating large-scale analytics with off-the-shelf business tools. Google also announced the availability of BigQuery in Europe, so users can store their data in Google Cloud Platform’s European data centers. In addition, it announced support for data residency: users can specify the continent on which they want their data stored, and Google will make sure it stays there.

      Google also enhanced BigQuery’s ingestion capability, so it can now ingest 100,000 rows per second per table. And the company introduced row-level permissions, a new security feature that helps with how you store information.

      “BigQuery is the ideal platform for storing, analyzing, and sharing structured data,” Vambenepe said. “It also supports repeated records and querying inside JSON objects for loosely structured data.”
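
To make "repeated records" and "querying inside JSON objects" concrete, here is a small plain-Python sketch of the two data shapes. It does not use BigQuery or its SQL dialect; the rows and field names (`items`, `payload` and so on) are invented for illustration.

```python
import json

# Hypothetical rows: "items" is a repeated record (a list of sub-records),
# and "payload" is a JSON string holding loosely structured data.
rows = [
    {
        "order_id": 1,
        "items": [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}],
        "payload": json.dumps({"device": {"os": "android"}}),
    },
    {
        "order_id": 2,
        "items": [{"sku": "A", "qty": 5}],
        "payload": json.dumps({"device": {"os": "ios"}}),
    },
]

# Flatten the repeated record: one output tuple per (order, item) pair.
flattened = [
    (row["order_id"], item["sku"], item["qty"])
    for row in rows
    for item in row["items"]
]

# Aggregate over the flattened items, as a grouped query would.
qty_by_sku: dict = {}
for _, sku, qty in flattened:
    qty_by_sku[sku] = qty_by_sku.get(sku, 0) + qty

# Reach inside the JSON object to pull out one nested field per row.
os_names = [json.loads(row["payload"])["device"]["os"] for row in rows]
```

In an analytics engine the flattening and the JSON lookup would be expressed in the query language instead of in application code.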

      Meanwhile, Google Cloud Pub/Sub is designed to provide scalable, reliable, and fast event delivery as a fully managed service, he said.

      “Along with BigQuery streaming ingestion and Dataflow stream processing, it completes the platform’s end-to-end support for low-latency data processing,” Vambenepe added. “Whether you’re processing customer actions, application logs, or IoT events, Google Cloud Platform allows you to process them in real time, the cloud way. Leave Google in charge of all the scaling and administration tasks so you can focus on what needs to happen, not how.”
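
The publish/subscribe pattern behind a service like Cloud Pub/Sub can be sketched in a few lines of plain Python. This is an in-memory toy, not the Google Cloud Pub/Sub client API: `InMemoryBroker`, its methods and the topic name are all hypothetical, and it offers none of the durability or scaling a managed service provides.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]


class InMemoryBroker:
    """Toy broker: publishers and subscribers only share a topic name."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a callback to receive every message on the topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver the message to each subscriber; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = InMemoryBroker()
log_sink: List[dict] = []
click_count = {"n": 0}


def count_click(message: dict) -> None:
    """Second, independent consumer of the same events."""
    click_count["n"] += 1


# Two independent consumers of the same hypothetical topic.
broker.subscribe("customer-actions", log_sink.append)
broker.subscribe("customer-actions", count_click)

delivered = broker.publish("customer-actions", {"user": "u1", "action": "click"})
undelivered = broker.publish("no-subscribers", {"ignored": True})
```

The publisher never references its consumers directly, which is what lets new downstream processing be added without changing the code that emits events.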

      Darryl K. Taft
      Darryl K. Taft covers the development tools and developer-related issues beat from his office in Baltimore. He has more than 10 years of experience in the business and is always looking for the next scoop. Taft is a member of the Association for Computing Machinery (ACM) and was named 'one of the most active middleware reporters in the world' by The Middleware Co. He also has his own card in the 'Who's Who in Enterprise Java' deck.
