Google Boosts Data Science Presence With Kaggle Purchase

The data science platform used by the world's largest community of data scientists and machine learning engineers is now part of Google.

Google has made no secret of its plans to leverage data science and machine learning approaches to make its products and cloud services smarter and more intuitive to use.

This week, the company expanded its presence in the emerging fields through the acquisition of Kaggle, a platform used by an estimated 800,000 data scientists and machine learning enthusiasts for analyzing public data sets and building machine learning models.

Numerous organizations use the platform to crowdsource machine learning tasks via online programming contests that are open to almost anyone with an interest in the space.

Among the contests currently listed on the platform is one with $1 million in rewards for those who come up with the most innovative ways to apply data science to improving the detection of lung cancer. Another offers $100,000 in rewards for developing machine learning algorithms for video classification purposes on Google Cloud and YouTube.

Google announced the Kaggle acquisition at its Google Cloud Next '17 conference in San Francisco this week. The company did not disclose terms of the deal or when the acquisition was completed.

Fei-Fei Li, chief scientist of Google's cloud AI and machine learning group, described the acquisition as a major step towards making AI tools and best practices available to a broader community of enthusiasts and practitioners. "With Kaggle joining the Google Cloud team, we can accelerate this mission," Li said in a blog post this week.

Google will continue Kaggle's mission around machine learning training and deployment services while giving the community the ability to store and query large data sets, she noted.

Mikhail Naumov, co-founder of DigitalGenius, a company that applies AI approaches to customer service, said Google's purchase of Kaggle is confirmation of the increasing importance of data scientists and deep learning engineers.

"Getting access to this talented community is a major advantage, as the engineers make decisions around which platforms and frameworks to use when building their products," Naumov said in a statement. "Clearly, adoption of Google's TensorFlow and Google Cloud Platform will grow as a result of this acquisition," he said.

Jeong-Yoon Lee, chief data scientist at cloud analytics services company Conversion Logic, described Google’s purchase as giving the company better access to the largest data science community in the world.

"Many at Kaggle are either at school or at early stage in their career. If Google Cloud can lock them in their cloud platform, the return will be huge," Lee said.

In addition to the Kaggle announcement, Google also used Next '17 to introduce a new machine learning application programming interface dubbed the Video Intelligence API.

The API is designed to make videos searchable on a frame-by-frame basis and can be used to annotate videos stored in Google's Cloud Storage systems. The API lets organizations search for specific items across their entire video catalog and aggregate search results by video and specific frame location, according to a Google description of the technology.

There are several use cases for the Video Intelligence API, according to the company. Media organizations with a lot of archived video content and little metadata on it can use the API to get a better sense of the content in their archives and find new ways to monetize it.

Similarly, the new API will give digital content publishing providers a way to improve video content recommendation and ranking for customers, Google said.

Google this week also updated its Vision API for image recognition and announced a new lab for machine learning training and education.

Jaikumar Vijayan

Vijayan is an award-winning independent journalist and tech content creation specialist covering data security and privacy, business intelligence, big data and data analytics.