Google has begun applying knowledge from its work in deep neural network (DNN) technology to improve the quality of thumbnails on YouTube.
The goal is to see whether recent DNN advances in areas like image and video processing can be applied to improve Google’s automatic YouTube thumbnail generator so people can find videos more easily.
Weilong Yang and Min-hsuan Tsai, members of Google’s video content analysis team and the YouTube Creator team, respectively, described the company’s efforts involving DNN and thumbnails in a Google Research blog post on Oct. 9.
The blog offers a look at some of the fascinating work going on behind the scenes at Google to improve the quality of thumbnails—something that people use heavily when viewing video content on YouTube.
As the researchers note in the blog, judging the quality of videos can be very subjective, especially when choosing frames for inclusion in a thumbnail.
To teach its automatic thumbnail generator to distinguish high-quality video frames from poor-quality ones, Google researchers compiled a collection of what they considered high-quality thumbnails uploaded by users on YouTube.
The characteristics they looked for included how well framed the thumbnails were, whether the subjects were in proper focus and whether the thumbnails were centered on a specific subject. They then compiled a similar collection of poor-quality thumbnails that shared a set of common flaws. The high-quality thumbnails were labeled as positive examples and the poor-quality ones as negative examples, framing visual quality modeling as a binary classification problem.
The DNN system that automatically generates the thumbnails was then "trained" on this data set to distinguish the high-quality frames from the poor-quality ones.
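The training setup described above can be sketched in code. Google's actual system is a deep neural network over image pixels; as a hedged stand-in, the sketch below trains a tiny logistic-regression classifier on hypothetical, hand-picked quality features (the feature names and values are illustrative assumptions, not part of Google's pipeline).

```python
import math

def train_quality_model(examples, labels, lr=0.5, epochs=200):
    """Fit logistic-regression weights on (feature_vector, 0/1 label) pairs.

    Positive labels (1) mark good thumbnails, negative labels (0) mark
    poor ones -- the binary classification framing from the blog post.
    """
    n_features = len(examples[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # predicted probability of "good"
            err = p - y                      # gradient of the log-loss
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def quality_score(model, x):
    """Score a frame's feature vector between 0 (poor) and 1 (good)."""
    w, b = model
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Toy features (sharpness, centering) -- illustrative only.
positives = [[0.9, 0.8], [0.8, 0.9]]   # well-framed, in-focus thumbnails
negatives = [[0.1, 0.2], [0.2, 0.1]]   # blurry, poorly centered frames
model = train_quality_model(positives + negatives, [1, 1, 0, 0])
```

After training, `quality_score` ranks a sharp, centered frame above a blurry one, which is the only property the downstream frame-selection step needs.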
Using this approach, video that is uploaded to YouTube is sampled at one frame per second and evaluated using the visual quality model. Each frame is then assigned a quality score. The frames with the highest scores are selected and enhanced before being rendered into YouTube thumbnails, Yang and Tsai said.
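The sample-score-select pipeline just described can be sketched as follows. The scoring function is a hypothetical stand-in for the trained DNN, and the frame representation is a toy assumption made for illustration.

```python
def sample_one_fps(frames, fps):
    """Keep one frame per second of video (every `fps`-th frame)."""
    return frames[::fps]

def pick_thumbnail_candidates(frames, fps, score_fn, top_k=3):
    """Sample at 1 fps, score each frame, return the top-scoring frames."""
    sampled = sample_one_fps(frames, fps)
    scored = [(score_fn(f), f) for f in sampled]   # quality score per frame
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [f for _, f in scored[:top_k]]          # best frames first

# Toy example: each "frame" is a (frame_index, quality) pair, and the
# stand-in model simply reads the quality field.
frames = [(i, (i * 37) % 100 / 100.0) for i in range(90)]  # 3 s at 30 fps
best = pick_thumbnail_candidates(frames, fps=30, score_fn=lambda f: f[1])
```

In the real system the selected frames are then enhanced before being rendered as thumbnails; that step is omitted here.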
“Compared to the previous automatically generated thumbnails, the DNN-powered model is able to select frames with much better quality,” the two researchers noted in the blog post.
Individuals who were asked to evaluate the quality of YouTube thumbnails consistently preferred those generated by the DNN system over the output of the previous thumbnailer. In side-by-side comparisons, they chose the new thumbnails 65 percent of the time.
The results are important because strong thumbnails help people find content more easily on YouTube, the researchers said. “Better thumbnails lead to more clicks and views for video creators.”
What was left unsaid is that more views for video creators also means more opportunities for Google to put revenue-generating ads in front of users.
Google’s work with DNNs is part of a broader Large Scale Deep Learning initiative designed to help the company build more intelligent computing systems. The goal is to equip computers of the future with speech and vision capabilities, language understanding and even user behavior prediction by applying advanced machine-learning approaches to massive data sets.