Google Cloud Platform is expanding its Hadoop capabilities with new connectors that let users run Hadoop jobs directly against data in Google BigQuery and Google Cloud Datastore. Also bolstering the cloud platform is a new version of Google App Engine that includes scalability and performance improvements.
The new Hadoop connectors were announced by Pratul Dublish, a cloud platform product manager, in an April 16 post on the Google Cloud Platform Blog.
“Today, we are making it easier for you to run Hadoop jobs directly against your data in Google BigQuery and Google Cloud Datastore with the Preview release of Google BigQuery connector and Google Cloud Datastore connector for Hadoop,” wrote Dublish. “The Google BigQuery and Google Cloud Datastore connectors implement Hadoop’s InputFormat and OutputFormat interfaces for accessing data. These two connectors complement the existing Google Cloud Storage connector for Hadoop, which implements the Hadoop Distributed File System interface for accessing data in Google Cloud Storage.”
The connectors can be installed and configured automatically when users deploy a Hadoop cluster with the bdutil command and include some extra code, wrote Dublish.
“These three connectors allow you to directly access data stored in Google Cloud Platform’s storage services from Hadoop and other Big Data open source software that use Hadoop’s IO abstractions,” the post states. “As a result, your valuable data is available simultaneously to multiple Big Data clusters and other services, without duplications. This should dramatically simplify the operational model for your Big Data processing on Google Cloud Platform.”
Several MapReduce code samples are available to help users make the connections. There is one sample for the BigQuery connector, one for the Datastore connector, and one that uses the Datastore connector to read data and the BigQuery connector to publish results, wrote Dublish.
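The pattern behind that last sample, reading records through one input format and publishing results through a separate output format, can be illustrated with a toy model. The Python sketch below is not Google's sample code and does not use the real connectors or Hadoop APIs. Every class and function name in it is hypothetical, and the in-memory source and sink merely stand in for the roles played by the Datastore and BigQuery connectors.

```python
from abc import ABC, abstractmethod
from collections import Counter
from typing import Dict, Iterator, List, Tuple

Record = Tuple[str, str]  # (key, text) pair handed to the job


class InputFormat(ABC):
    """Toy stand-in for the idea of Hadoop's InputFormat: supplies records."""

    @abstractmethod
    def read(self) -> Iterator[Record]:
        ...


class OutputFormat(ABC):
    """Toy stand-in for the idea of Hadoop's OutputFormat: accepts results."""

    @abstractmethod
    def write(self, key: str, value: int) -> None:
        ...


class InMemoryDatastoreInput(InputFormat):
    """Hypothetical source playing the role of the Datastore connector."""

    def __init__(self, entities: List[Record]) -> None:
        self._entities = entities

    def read(self) -> Iterator[Record]:
        return iter(self._entities)


class InMemoryTableOutput(OutputFormat):
    """Hypothetical sink playing the role of the BigQuery connector."""

    def __init__(self) -> None:
        self.rows: Dict[str, int] = {}

    def write(self, key: str, value: int) -> None:
        self.rows[key] = value


def run_word_count(source: InputFormat, sink: OutputFormat) -> None:
    """Split each record's text into words, sum the counts, write the totals."""
    counts: Counter = Counter()
    for _key, text in source.read():
        counts.update(text.lower().split())
    for word, total in sorted(counts.items()):
        sink.write(word, total)


# The job only sees the two interfaces, so either end could be swapped out.
source = InMemoryDatastoreInput([("e1", "big data"), ("e2", "Big clusters")])
sink = InMemoryTableOutput()
run_word_count(source, sink)
print(sink.rows)
```

Because the job depends only on the two abstract interfaces, the storage system on either side can change without touching the processing logic, which is the property the connectors rely on.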
In a related announcement, Google unveiled the latest version of Google App Engine, Version 1.9.3, in an April post on the Google Cloud Platform Blog. The new version includes stability and scalability improvements to help users with their core projects, according to Google. “We know that you rely on App Engine for critical applications, and with the significant growth we’ve experienced over the past couple years we wanted to take a step back and spend a few release cycles with a laser focus on the core functionality that impacts your service and end users,” the post states. “As a result, new features and functionality may take a back seat to these improvements.”
Google frequently updates its Google Cloud Platform and adds new services for users and developers.
Earlier in April, Google announced that it is now making its Google Cloud Platform services available in the Asia Pacific region as it moves to expand the reach of its cloud services to more developers around the world. That means that developers there will have access to Google’s latest cloud technology, including Andromeda—the code name for Google’s network virtualization stack—as well as transparent maintenance with live migration and automatic restart for Compute Engine.
Also earlier in April, Google unveiled new lower pricing for Google Cloud Platform users through “Sustained Use Discounts” that the company made available to users who run large projects on virtual machines. Under the new pricing scheme, users will save more as they use more virtual machines in the Google Cloud.
In March, Google introduced a new Google APIs Client Library for .NET and improved documentation for using the third-party configuration-management tools Puppet, Chef, Salt and Ansible, according to an eWEEK report. The new Google APIs Client Library for .NET is an open-source effort, hosted on NuGet, that lets developers building on the Microsoft .NET Framework integrate their desktop or Windows Phone applications with Google’s services. The library includes more than 50 Google APIs for Windows developers.
Also released in March was a new Google paper, “Compute Engine Management with Puppet, Chef, Salt, and Ansible,” which provides guidance for Google Cloud Platform developers who want to use those configuration-management tools.
In October 2013, Google replaced its old Google API Console with a new, expanded and redesigned Google Cloud Console to help developers organize and use the more than 60 APIs offered by Google. The company also released several technical papers to help cloud developers learn more about the development tools it offers through its Google Compute Engine services. The papers, “Overview of Google Compute Engine for Cloud Developers” and “Building High Availability Applications on Google Compute Engine,” offer insights and details about how the platform can be used and developed for business users.