The explosion of data from social media, video, audio, wearables and the Internet of Things (IoT) has innovators focused on new ways to make data more meaningful and valuable.
Machine learning is helping, with software engineers continually creating and improving algorithms that automatically analyze data to identify patterns and predict outcomes.
Machine learning involves studying computer algorithms that provide computer programs with the ability to learn, discover, predict and improve automatically using large amounts of data without explicit programming, said Dave Schubmehl, research director for Cognitive Systems and Content Analytics at IDC.
“Machine learning starts with data—the more you have, the better your results are likely to be,” said David Chappell, in a white paper on Microsoft’s Azure Machine Learning solution. “Because we live in the big data era, machine learning has become much more popular in the last few years. Having lots of data to work with in many different areas lets the techniques of machine learning be applied to a broader set of problems. Since the process starts with data, choosing the right data to work with is critically important.”
Herain Oberoi, director of product management at Microsoft, described an existing use case where an organization is studying the problem of customer churn by doing an in-depth analysis based on Web logs and click streams.
“They store their Web log information in their big data system, do some processing on it, run a model against it and then predict from a customer’s browsing and clickstream history whether they’re likely to churn or not,” he said. “That’s an example of an existing use case that machine learning makes better.”
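To make the churn use case concrete, here is a minimal sketch in Python of what "run a model against it" could look like. It is not the organization's actual pipeline, which the article does not describe. The session data, the two features (pages viewed and days since last visit), and the function names are all assumptions invented for illustration, and the model is a plain logistic regression fitted by gradient descent.

```python
import math

# Hypothetical, hand-made browsing summaries (invented for illustration):
# (pages_viewed, days_since_last_visit) -> 1 if the customer churned, 0 if not.
SESSIONS = [
    ((12, 1), 0), ((9, 2), 0), ((15, 1), 0), ((8, 3), 0),
    ((2, 20), 1), ((1, 30), 1), ((3, 14), 1), ((2, 25), 1),
]


def scale(features):
    """Bring both features into a roughly 0..1 range so gradient descent behaves."""
    pages, days = features
    return (pages / 15.0, days / 30.0)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=2000, lr=0.5):
    """Fit a two-feature logistic regression with batch gradient descent."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for features, label in data:
            x = scale(features)
            error = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - label
            grad_w[0] += error * x[0]
            grad_w[1] += error * x[1]
            grad_b += error
        w[0] -= lr * grad_w[0] / n
        w[1] -= lr * grad_w[1] / n
        b -= lr * grad_b / n
    return w, b


def churn_probability(model, features):
    """Return the model's estimated probability that a customer will churn."""
    w, b = model
    x = scale(features)
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)


model = train(SESSIONS)
active = churn_probability(model, (10, 2))   # browses a lot, visited recently
lapsed = churn_probability(model, (1, 28))   # barely browses, long absent
print(f"active customer: {active:.2f}, lapsed customer: {lapsed:.2f}")
```

On this toy data the model assigns a low churn probability to the active customer and a high one to the lapsed customer. A production system would draw its features from the stored Web logs and would need far more data than eight rows.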
What’s happening, Oberoi said, is “a convergence between the cloud, big data analytics and machine learning to enable a new set of use cases and [to] simplify an existing set of use cases.”
Machine Learning Is Not New
Although machine learning libraries have been around for decades and have been offered as part of many statistical packages, including IBM’s SPSS, SAS and many others, the use of machine learning by enterprises has not been widespread until recently. That’s because the analytical algorithms require a lot of data and a lot of compute power, Schubmehl said.
However, many leading technology firms, such as Google, Facebook, Amazon, Baidu, Yahoo and Walmart Labs, have been using machine learning tools over the last few years to improve analytics applications in areas such as image recognition, programmatic advertising, and product and content recommendations.
“Enterprises haven’t been as quick to adopt machine learning and are now doing so as part of their big data efforts,” Schubmehl said. “Probably the biggest use of machine learning to date is for data categorization, discovery and cleansing.”
Machine learning in general is one—albeit highly important and necessary—component in a new generation of “smart” applications that have cognitive capabilities—applications that are able to recognize patterns in data, documents or even images.
Other technologies, in addition to various types of machine learning, include content acquisition/aggregation, text analytics, speech analytics, knowledge graphs and question and answer systems.
Companies like IBM, Cognitive Scale and TCS are including machine learning capabilities in their cognitive systems platforms. They allow developers and enterprises to build applications capable of learning and improving their performance over time.
According to Mike Gualtieri, a principal analyst at Forrester Research, slightly fewer than half of enterprises report using predictive analytics. However, verticals such as retail, travel, financial services, law enforcement and others have been quick to try their hands at machine learning and predictive analytics to solve problems, reach bigger audiences and catch criminals.
Machine Learning Shaping the New World of Cognitive Computing
“Predictive analytics use a combination of statistical and machine learning algorithms to build predictive models,” Gualtieri said. “There is tremendous interest in using machine learning but most enterprises are still trying to understand what the heck it means.”
For now, he said, “machine learning is mostly for data scientists, but a few vendors are trying to democratize machine learning for business people and developers. IBM’s Watson question answering system and Skytree are examples.”
Getting “Cognitive” With Watson
There is perhaps no more prominent example of cognitive computing than IBM’s Watson, where machine learning is but one component of the system’s overall cognitive package.
The original system underlying Watson was built on one API covering five technologies: natural language processing, machine learning, question analysis, feature engineering and ontology analysis. The Watson platform and ecosystem have since grown, and continue to grow, with more than 25 APIs and services underpinned by more than 50 technologies in four key areas: language, speech, vision and data insights.
“The breakthrough that we had when we formulated Watson for Jeopardy was to avoid depending on rules and ontologies,” said Rob High, IBM’s CTO of Watson. “We completely divorced ourselves of that and looked at the problem entirely as a signal processing problem,” High said. Instead Watson works by measuring signal patterns—the patterns of signals and linguistics—and using the results to predict the meaning of that pattern, a process that depends heavily on machine learning, he explained.
Cognitive computing is in turn rapidly becoming a cornerstone in IBM’s future. Last month, IBM CEO Ginni Rometty announced that IBM is becoming a cognitive business.
High said this means that IBM’s focus “will be fundamentally based on tapping into the vast quantities of information out there that previously could not be processed with traditional computing techniques, but rather could only really be understood using techniques that come closer to recognizing the attention that humans would normally ascribe to that information.”
“Developers and organizations are just beginning to understand and utilize the power of machine learning,” said IDC’s Schubmehl. “A number of organizations have decided to use machine learning inside of cognitive system platforms like IBM Watson.”
High said there are about 80,000 developers now using Watson cognitive computing APIs to build apps featuring cognitive technology.
Doug Schaedler, CEO of Inno360, an IBM Watson ecosystem partner, told eWEEK his company is currently using seven Watson APIs and is planning to use seven more to support its cognitive solutions aimed at researchers working on specialized projects.
The seven Watson APIs Inno360 is using include: Keyword Extraction, Concept Tagging, Taxonomy, Sentiment Analysis, Relationship Extraction, Linked Data and Entity Extraction.
Schaedler said Inno360 came into being because Procter & Gamble needed software to better connect its thousands of researchers and make them more effective. Inno360 started developing a system to serve that need by integrating Watson APIs in July 2014 and now relies on it. “Watson provides deep analysis and cognitive capability in the background,” he said. “It helps the whole enterprise get smarter.”
That element of adding “smarts” to systems and applications is essential to machine learning. “Pretty much any application that you’re building can get some intelligence,” Microsoft’s Oberoi said. “Most apps can benefit from that kind of intelligence introduced by machine learning.”
Microsoft is investing in machine learning in a number of different ways. The company uses machine learning in most of its products—from Xbox to Bing and even in Outlook, Oberoi said.
At the broadest level, the company has released its Cortana Analytics Suite, an end-to-end big data and advanced analytics platform. The company also is integrating that suite with its Cortana personal digital assistant, which uses machine learning.
In conjunction with Cortana Analytics Suite, Microsoft also offers its Azure Machine Learning Studio, a collaborative, drag-and-drop tool to build, test and deploy predictive analytics applications.
Machine Learning Studio publishes models as Web services that can easily be linked to custom apps or business intelligence (BI) tools based on Excel spreadsheets. The technology provides an interactive, visual work space to easily build, test, and iterate on a predictive analysis model.
The predictive analytics software solution can be embedded in applications or used along with the cloud-based Microsoft Power BI analysis solution.
Azure Machine Learning Studio is a component of Azure Machine Learning (Azure ML), which is a cloud service that helps users implement the machine learning process. It runs on Microsoft Azure and can work with very large amounts of data and be accessed from anywhere in the world.
“The idea of machine learning has been around for quite a while,” Chappell said. “Because we now have so much more data, machine learning has become useful in more areas. Yet unless the technology of machine learning gets more accessible, we won’t be able to use our big data to derive better solutions to problems, and thus build better applications,” he said.
“A primary goal of Azure ML is to address this challenge by making machine learning easier to use,” said Chappell. While data scientists are still required to make these systems work, the Azure ML cloud service can help less-specialized people play a bigger role in bringing machine learning into the mainstream, he said.
“Going forward, expect data-derived models to become more common components in new applications,” Chappell said.
Many organizations are using Azure ML to dip their toes into the machine learning waters. For instance, Northwestern University’s Kellogg School of Management uses Azure ML to introduce MBA students to predictive analytics.
Professor Florian Zettelmeyer, director of the Program on Data Analytics at Kellogg, said he uses Azure ML with his students because its drag-and-drop interface and library of sample experiments and algorithms offer a smoother transition to advanced analytics than requiring students to become technology specialists and learn to code.
In another example, Microsoft, based in Redmond, WA, is working with the nearby Tacoma Public School District to use Azure ML to provide predictive analytics on which students may be at risk of failing a course or dropping out. The Microsoft Data and Decision Sciences Group (DDSG) created a proof-of-concept (POC) data model centered on Azure Machine Learning.
Still Early Days
So while machine learning shows an enormous amount of promise for helping enterprises become smarter, it is still early days.
Gualtieri warns against thinking of machine learning as a singular approach to analyzing data. There are dozens of specialized classes of algorithms that focus on specific problem domains, he said. For example, some machine learning algorithms are designed to analyze images or video to identify objects or predict emotional state from facial expressions.
Others are used to make personalized product recommendations for customers. Search engines use machine learning algorithms to continuously improve search results. What makes machine learning algorithms unique is that they are designed to identify patterns or make predictions by analyzing historical data that is representative of the domain, he said.
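As a rough illustration of identifying patterns in historical data, the Python sketch below builds product recommendations from co-purchase counts. It is a deliberately simple stand-in, not the method any vendor named here uses. The purchase histories and the function names are hypothetical, invented for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase histories, invented for illustration.
BASKETS = [
    {"tent", "sleeping bag", "stove"},
    {"tent", "sleeping bag"},
    {"tent", "lantern"},
    {"stove", "fuel"},
    {"sleeping bag", "tent", "lantern"},
]


def build_cooccurrence(baskets):
    """Count how often each ordered pair of items appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def recommend(counts, item, k=2):
    """Return up to k items most often bought together with `item`."""
    scores = {other: n for (first, other), n in counts.items() if first == item}
    # Sort by descending count, then by name so ties resolve deterministically.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in ranked[:k]]


counts = build_cooccurrence(BASKETS)
top = recommend(counts, "tent")
print(top)
```

The recommendations come entirely from the historical baskets: feed in different histories and the suggestions change, with no rule about tents or lanterns ever written by hand.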
Also, “Don’t get tripped up on the ‘learning’ part of machine learning,” Gualtieri said. “Learning means that the algorithms analyze sets of data to look for patterns and/or correlations that result in insights. Those insights can become deeper and more accurate as new data sets are analyzed by the algorithms.”
For his part, IBM’s High notes that 80 percent of the world’s data is in forms that traditional computing systems have not been able to interpret properly for their meaning, which only cognitive computing can effectively discern.
That is why machine learning “is germane to every industry that is today subject to qualitative information—sometimes referred to as unstructured information—which includes stuff that humans write down to express their ideas, vocal expressions of human language and visual expressions of human language,” he said.
“We need a system that is capable of tapping into all the data that today we essentially ignore or let pass us by because we don’t have the time for it.”