SAN JOSE, Calif.—Nvidia’s overarching theme throughout the GPU Technology Conference here this week has been deep learning, the idea that with the right technology and right algorithms, machines can learn from their experience, and adapt their behavior.
During his keynote address March 17, Nvidia co-founder and CEO Jen-Hsun Huang folded everything he announced—from a new high-powered GPU to software and hardware tools for researchers and scientists to the details he shared about the upcoming Pascal architecture—into the message that all of it will be leveraged to advance the research and development of deep-learning neural networks.
“The topic of deep learning is probably as exciting an issue as any in this industry,” Huang said during his talk.
It was against this backdrop that Jeff Dean, a senior fellow in Google’s Knowledge Group, took the stage at GTC March 18 to talk about the search giant’s extensive work in deep learning—a branch of machine learning—over the past several years. Google, with its massive stores of data on everything from search queries to Street View imagery, seems like a company that naturally would be interested in the field.
In a fast-paced hour-long keynote, Dean talked about the advancements that Google and other tech companies—such as Microsoft and IBM—are making in the field, the promise that deep learning holds for everything from autonomous cars to medical research, and the challenges that lie ahead as the research continues.
The foundations are in place: technologies like GPUs, with their massive parallel-processing capabilities, and the development of “nice, simple, general algorithms that can [enable neural networks to] learn from the raw data,” Dean said.
“The good news is that there’s plenty of data in the world, most of it on the Internet,” he added.
There are texts, videos and still images, search queries, maps, and data from social networks. All of this data can be used to help neural networks learn and adapt their behaviors to what they have learned.
Self-driving cars have been a constant conversational topic at the GTC event, and offer a good example of what has been done already and what still needs to be done. The advanced driver assistance systems (ADAS) in cars right now can detect when a collision is about to happen and apply the brakes, or determine when the car is drifting into another lane and alert the driver.
However, self-driving cars will need additional capabilities before they are ready for everyday use. They need to be able to recognize whether an oncoming vehicle in the opposite lane is a truck or a school bus, and then react accordingly (for example, knowing whether the school bus is picking up or dropping off students, and stopping because the red lights are blinking). Or they need to be able to recognize that a car parked on the side of the road with its driver’s-side door open could mean a person is about to step out.
Much work around deep learning has involved image recognition—not only determining whether a photo in question is of a cat or a tree log, but how to describe the photo in a sentence (for example, a small child holding a teddy bear). There’s also work being done around voice recognition, understanding relationships between words to understand what is meant, not just what is said.
Dean said getting those relationships right and understanding context are important. When a neural network is talking about a car, “you want to get the sense of ‘car-ness,’ not that it’s spelled C-A-R,” he said.
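One common way researchers capture that sense of “car-ness” is with word embeddings: each word becomes a vector, and words with related meanings end up pointing in similar directions, regardless of spelling. The sketch below uses tiny made-up vectors (real embeddings such as word2vec are learned from large text corpora) to illustrate the idea.

```python
import math

# Hypothetical toy vectors for illustration only; real embeddings are
# learned automatically from billions of words of text.
embeddings = {
    "car":   [0.9, 0.1, 0.0],
    "truck": [0.8, 0.2, 0.1],
    "spell": [0.0, 0.1, 0.9],
}

def cosine(u, v):
    """Cosine similarity: near 1.0 means similar meaning, near 0.0 unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# "car" and "truck" share meaning, so their vectors nearly align;
# "spell" relates only to the letters C-A-R, and its vector does not.
print(cosine(embeddings["car"], embeddings["truck"]))  # high
print(cosine(embeddings["car"], embeddings["spell"]))  # low
```

The payoff is that similarity becomes arithmetic: the network compares directions in vector space rather than strings of letters.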
Google is researching these and other issues, and there have been some significant gains, he said.
By way of example, he noted that in 2011, the best neural network at the annual ImageNet competition had an error rate of 25.7 percent. By 2013, the rate had dropped to 11.7 percent and in 2014, the rate was 6.7 percent.
In 2015, the Chinese Web services provider Baidu said it had reached an error rate of 6 percent. Earlier this year, Microsoft published a paper saying it had hit a 4.9 percent error rate. Earlier this month, Google published its own paper outlining an error rate of 4.8 percent.
“This is indicative of the kind of progress we’re making,” Dean said.
He also noted projects that were run using classic Atari games—such as Space Invaders—to determine how well neural networks learn without any pre-programmed data in front of them. When a neural network was set up to play the game, the initial results were as expected—the network was beaten pretty quickly. However, it learned each time it played, and after several hundred games, the neural network was essentially unbeatable.
“Eventually, it just can’t get killed,” Dean said. “It becomes superhuman.”
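The engine behind that result is reinforcement learning: the system gets only a score as feedback and gradually learns which actions lead to reward. The Atari work (DeepMind’s DQN) did this with a deep network reading raw pixels; the sketch below shows the same learn-from-reward loop in its simplest tabular form, on a made-up five-cell corridor task rather than a real game.

```python
import random

# Stand-in task: walk right along a 5-cell corridor to reach a reward at the
# end. Nothing is pre-programmed about the solution; only reward is given.
N_STATES, ACTIONS = 5, (-1, +1)   # actions: step left, step right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma = 0.5, 0.9           # learning rate, discount factor

random.seed(0)
for episode in range(300):        # "several hundred times", as in the talk
    s = 0
    while s != N_STATES - 1:
        a = random.choice(ACTIONS)                    # explore at random
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0        # reward only at the goal
        # Q-learning update: nudge estimate toward reward + best future value.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

# After training, acting greedily on Q walks straight to the goal.
s, steps = 0, 0
while s != N_STATES - 1 and steps < 10:   # cap guards against an untrained policy
    s = min(max(s + max(ACTIONS, key=lambda a: Q[(s, a)]), 0), N_STATES - 1)
    steps += 1
```

At no point is the solution spelled out; the table of values is shaped entirely by the rewards the agent stumbles into, which is why early play is poor and later play is near-perfect.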
The networks in many ways learn as people do, by what they experience and by recognizing errors and correcting them. However, Dean said that while neural networks are inspired by the human brain, they are not made to mimic the brain. For example, he noted that when the brain tackles a problem, it only “fires” 10 times—essentially the thought process goes through only 10 different layers of highly parallel processing.
Neural networks often can have many times more layers of processing.
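“Layers” here are literal: a network’s answer is computed by passing a signal through a fixed sequence of transformations, one after another. The sketch below stacks ten layers (echoing the roughly 10 sequential steps Dean attributes to the brain) with hypothetical random weights, just to show the sequential structure; a real network would have trained weights and often far more layers.

```python
import math
import random

random.seed(1)

def layer(inputs, weights):
    """One layer: weighted sums of the inputs, then a nonlinearity (tanh)."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in weights]

# A hypothetical 10-layer network, 4 units wide, with random (untrained) weights.
width, depth = 4, 10
weights = [[[random.uniform(-1, 1) for _ in range(width)] for _ in range(width)]
           for _ in range(depth)]

activations = [0.5, -0.2, 0.1, 0.9]        # stand-in input signal
for w in weights:                          # the signal passes through the
    activations = layer(activations, w)    # layers strictly in sequence
```

Within each layer the arithmetic is highly parallel, which is what makes GPUs such a natural fit, but the layers themselves must run in order.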