It generally takes a lot of reps (repetitions) for a human to become really good at something. Playing the violin, swinging at (and hitting) a hard-thrown baseball and dropping back to complete a long pass in football are examples of this.
Similarly, a machine needs a lot of reps before it can learn from a data set and reliably recall what it has learned when someone needs the information. A lot of people don’t realize this.
San Francisco-based startup Figure Eight specializes in exactly this kind of practice: teaching artificial intelligence engines to perform optimally, in this case through video reps.
Figure Eight, which describes its product as a “human-in-the-loop machine learning platform,” on Aug. 14 launched its ML-assisted Video Object Tracking solution to accelerate the creation of training data for customers in key industries such as automotive and transportation, consumer goods and retail, media and entertainment, and security and surveillance.
Up to Now, It’s Been a Tedious Process
Until now, annotating video to create training data has been painstakingly slow, tedious and expensive, requiring every object in every frame to be labeled by a human. Depending on the frame rate, the number of objects in each frame, and the complexity of the video, it can take hundreds or thousands of hours to annotate just one hour of video.
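To see how the hours pile up, here is a back-of-the-envelope sketch in Python. The frame rate, object count and seconds per label are illustrative assumptions, not figures from Figure Eight, and the function name is made up for this example.

```python
def manual_annotation_hours(video_hours, fps, objects_per_frame, seconds_per_label):
    """Estimate human hours needed to hand-label every object in every frame."""
    frames = video_hours * 3600 * fps          # total frames in the video
    labels = frames * objects_per_frame        # one label per object per frame
    return labels * seconds_per_label / 3600   # convert labeling seconds to hours


# Assumed inputs: one hour of 30 fps video, 5 objects per frame,
# 5 seconds for a person to draw and tag each box.
hours = manual_annotation_hours(video_hours=1, fps=30,
                                objects_per_frame=5, seconds_per_label=5)
print(f"{hours:.0f} hours of manual labeling")
```

Under those assumptions, a single hour of video works out to 750 hours of hand labeling. A busier scene or a slower annotator pushes the total into the thousands.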
You may have seen the example of a rolling robotic device that moves up and down rows of plants in a field, recognizes weeds, and pulls them out, or of a robotic arm that recognizes labels on packages in a warehouse and knows how to grab each package and move it to the correct conveyor belt. Having all this grunt work done by non-humans makes production systems that much more efficient for enterprises.
Figure Eight took its name from the continuous feedback loop between humans and machines, CTO Robert Munro told eWEEK.
“Probably 90 percent of real-world artificial intelligence being used in the world is being powered by humans,” Munro said. “And this is across any use case you can imagine. The reason a self-driving car recognizes a pedestrian on the side of the road is because humans have told it so in many hours of video—what a lane is, what a pedestrian is.
“The same is true that your phone understands when you tell it to search for directions, because thousands of humans have given it those same directions thousands of times before.”
Data Sets Comprise the Body of ML Knowledge
It’s the aggregate data sets in cloud storage systems and in individual devices that make up the body of knowledge an AI draws on to perform its functions. Figure Eight does its training by running video loops again and again until the AI engine recognizes everything in the video it should know. Figure Eight then shares the knowledge throughout its cloud service.
The Machine Learning assisted Video Object Tracking package enables users to annotate an object within a video frame and then have machine-learning predictions persist that annotation across frames within the video, Munro said. Human annotators can review the machine predictions and adjust them where necessary, delivering video annotations that are highly accurate yet produced up to 100 times faster than with human-only solutions, in which every object in every frame must be annotated by hand.
In other words, a human annotates an object once, and the model carries that annotation forward through the frames that follow, so a person no longer has to draw it again in every frame.
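The workflow can be sketched in a few lines of Python. This is a toy illustration only: it fills in the frames between human-labeled keyframes with simple linear interpolation, a stand-in for the deep-learning predictions Figure Eight describes and not the company's actual method. The names `Box`, `interpolate` and `propagate` are hypothetical.

```python
from dataclasses import dataclass, astuple
from typing import Dict


@dataclass(frozen=True)
class Box:
    """A bounding box: top-left corner plus width and height, in pixels."""
    x: float
    y: float
    w: float
    h: float


def interpolate(a: Box, b: Box, t: float) -> Box:
    """Blend box a toward box b; t=0 returns a, t=1 returns b."""
    return Box(*(p + (q - p) * t for p, q in zip(astuple(a), astuple(b))))


def propagate(keyframes: Dict[int, Box]) -> Dict[int, Box]:
    """Fill in a box for every frame between the human-labeled keyframes."""
    frames = sorted(keyframes)
    if not frames:
        return {}
    tracks = {}
    for start, end in zip(frames, frames[1:]):
        for f in range(start, end):
            t = (f - start) / (end - start)
            tracks[f] = interpolate(keyframes[start], keyframes[end], t)
    tracks[frames[-1]] = keyframes[frames[-1]]  # last keyframe is kept as labeled
    return tracks


# A human labels only frames 0 and 10; the machine fills in the rest.
keyframes = {0: Box(0, 0, 10, 10), 10: Box(100, 50, 10, 10)}
tracks = propagate(keyframes)

# Human review: the reviewer corrects frame 5, and the correction
# becomes a new keyframe that reshapes the frames around it.
keyframes[5] = Box(60, 20, 10, 10)
tracks = propagate(keyframes)
```

Here two human labels, plus one correction, yield boxes for 11 frames. The human's job shifts from drawing every box to checking and adjusting the machine's guesses, which is where the time savings come from.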
“The only viable solution to creating high quality training data at scale is to combine the best of machine learning and human intelligence,” Munro said. “We’ve spent the last year integrating a deep-learning ensemble model into the Figure Eight platform so we can apply billions of compute cycles to the billions of human judgments previously generated for computer vision and natural language processing use cases.
“The result today is that our customers can now create training data up to a hundred times faster than previously possible. By applying machine learning to the quality-control process, we also annotate with more accuracy than purely manual processes, giving our customers the best of both worlds: scale and accuracy.”
ML Isn’t Some Magical Code
Hopefully you now know more about how machine learning works and that it’s not just some magical secret code that makes it all go.
Figure Eight’s Machine Learning assisted Video Object Tracking is available now for all customers as part of the Figure Eight platform.