IBM recently named Bob Picciano as senior vice president of its newly formed Big Data and Analytics Group. He is charged with leading a new business unit focused on continuing to drive leading-edge technologies into the marketplace to transform industries and professions. Picciano sat down with eWEEK senior editor Darryl K. Taft for a look into IBM’s new data-driven unit.
Tell us a bit about this new business unit you head up, the IBM Big Data and Analytics Group?
First off, Ginni [Rometty, IBM’s CEO] sees a set of transforming forces and she is positioning the IBM company to take advantage of the momentum in the marketplace. At the beginning of the year, she said there are three things that are happening and we’ve been investing in those things for some time. It’s all about data–data-driven transformation, it’s about cloud and it’s about systems of engagement.
In many ways cloud is the how, engagement is the why and data is the what. In fact, one of the things I’ve noticed in the last nine months or so–and I’ve been doing data since 1987–is now everything is all about top line growth. Maybe three or four years ago working with data was more about how I manage things more efficiently. Now it’s all about top line growth. Clients are all about building growth agendas, whether it’s on understanding the information that they currently have or combining it with outside information to be able to attract or retain clients, or whether it’s creating a new business model about the data they manage, or using it to improve the economics of IT as opposed to trying to squeeze cost out of the data layers themselves.
And the fast-moving, high-variety nature of data and information means you have to have more robust, more integrated analytics to make sense of all the information–structured, unstructured, polystructured.
So Ginni organized the company, specifically the Software Group, around those transformative tenets. And basically set the rest of the company as ecosystems around those bases. I have the responsibility for the Information and Analytics Group, which includes Watson Foundations, which is the platform we use to curate information for big data problem solving, but also to generate it and feed it to our cognitive capability in Watson. Mike Rhodin has the Watson Group and that’s about that next era of computing. And Robert LeBlanc has the cloud and software solutions organization.
So it’s a bit of a remix of how we were positioned before. And included in my group are all of the business analytics capabilities, all portfolios around risk management and performance management, business intelligence, predictive analytics, prescriptive analytics, next-generation visualizations, the enterprise content team–which is all the unstructured information, including advanced case management capabilities and content navigation. I also have the portfolio of the information management group, which I was managing up until January of this year.
What are your plans for your new role? What do you plan to focus on?
The first thing is to make sure the marketplace understands our capabilities in being able to curate information for advanced analytics and problem solving. Last year we steadily introduced into the market several new capabilities to allow people to have more confidence in using big data technology to get their arms around the information they have inside of their business and then apply analytics to all that information. We came out with capabilities like the privacy suite for Hadoop, which allows people to apply data lineage and provenance and also advanced security to Hadoop data. We made several innovations in our BigInsights and our InfoSphere Streams and our data exploration platforms. We introduced new, more agile capabilities to manage information in data marts and data warehouses, which is an important part of everyone’s analytical system of record.
But I see more and more that there are really three things working together to help people curate their information: it’s an analytical system of record, it’s a Hadoop environment such as our BigInsights portfolio, and it’s being able to manage the information in motion to seize a critical business moment. So we’ve now put this together into the Watson Foundations package and have announced that to the market and to our partners for them to be able to curate information and apply analytics–sometimes directly pushed down into the data, other times as part of a data supply chain–and then help our clients build that on-ramp to using cognitive computing.
IBM: 8 Questions With the Company’s New Big Data Honcho
So I would say the first order of business is making sure people understand the diversity of capabilities we have in this Watson Foundations package, because it is substantial. It really does have the industry’s only complete end-to-end big data package, including the integration and governance all the way through the decision management and predictive analytics. We also need to help people recognize the value of not just the robustness of the platform, but also the simplicity with which you can deliver new analytical insights on top of information.
A great example is Blue Cross Blue Shield of Tennessee, which just announced some new performance results on top of our BLU Acceleration capability. They said they were managing actuarial data in a typical data warehouse database cluster and those workloads were taking two-plus days to run. They loaded that same information into our BLU Acceleration and those actuarial jobs ran in less than a minute. The powerful thing is not just the speed, but that they didn’t have to model the data, they didn’t have to worry about indexing, they didn’t have to worry about reorganizing the data or about the schema definition. They essentially load it as if it’s a big spreadsheet and then run those queries against it and the thing absolutely flies.
What role does IBM Research play in the company’s big data strategy?
They play a huge role. Many of the innovations that we have put into the platform just in the last year are collaborations with our research teams. BLU Acceleration started about four years ago with a project that was running at Almaden Research, which is an adjunct to our database team. And it was looking at new ways to compress data but also leave it actionable–applying in-memory columnar capabilities, being able to do various data skipping and data reduction techniques all together in a way that improved the efficiency of navigating the information as you applied multiple techniques… one after another after another. That’s one of the reasons BLU Acceleration is so efficient, both in terms of compression but also being able to navigate large amounts of information.
BigInsights is another one where there are several technologies that emanated from research, including the capabilities around BigSheets on top of a Hadoop distribution. There are many great examples where this plays. In fact, part of the research contribution is we have been able to apply big data technologies to other workloads like security analytics. So our capabilities around our security portfolio now include big data techniques for being able to look for long term, persistent threats and correlation of events to try to look at whether or not unrelated things are really sophisticated, coordinated attacks.
Research still plays a large role in innovating and growing our organic portfolio. We’re also applying innovations to things we have brought into the portfolio through acquisition.
How have acquisitions helped with IBM’s big data strategy?
I would say they have helped in two ways. First off, some of the approaches that we have taken are transformative to certain roles in the organization. If we were talking 10 years ago, the primary role for information management was a DBA or a chief technology officer. Now it’s people like a chief marketing officer or a chief sales officer–someone who is trying to look for bigger trends in the information they possess or the information they can garner from outside sources.
So as you make that transition from going into a technically specific field to now a profession inside of a company you want to serve, you have to have solutions that make big data digestible to those parties, and the analytics has to be digestible for them. It really has to be consumable in a way that looks like the way they think about managing the business process of their profession. Acquisitions have played a key role in helping us round out these solution assets. The ones that came into the business analytics portfolio are around integrated risk and financial reporting and sales performance management, even employee effectiveness with our Kenexa capabilities.
The acquisitions also have allowed us to improve the solution relevancy for what I would classify as reference architectures in solving certain problem domains. The Now Factory was a good example of accelerating a big data approach to next-generation telco mediation by putting in a capability that would take raw call records and convert them to call data records that then could be loaded into a PureData for Analytics Netezza warehouse or into Streams or a Hadoop platform. Then they also had a set of applications that serve the network operator or the service representative or the chief sales officer to look for potential client churn or fraud or network quality aberrations that they would need to address. So acquisitions have helped us round out the reference architectures and helped us project to the business professionals.
How does IBM monetize big data and how do you see that evolving?
Today we are primarily a supplier of big data insight and analytic capabilities to clients we serve, to allow them to build their own solutions around what we’ve categorized as six entry points of big data solution areas. Those entry points are: Creating new business models, customer insight, improving IT economics, optimizing operations and reducing fraud, managing risk, and transforming financial processes. That’s where clients are using the majority of these big data insight information capabilities today.
The most interesting one is helping them create new business models where they’re monetizing information they may have had on hand for a long time, but they didn’t know to correlate that with other things that may be happening in a social domain or that could be combined or correlated with other information that they possess.
That’s a today statement in terms of how we help our clients utilize those capabilities. But when we talk about capabilities like Watson we really are talking about delivering those insights as a service to our clients. In that type of a model, being able to curate the information necessary to help our clients get the highest quality insight is growing in terms of its relevance. Whether it’s engagement advice or whether it’s helping clinicians identify probable causes or possible solution paths, those are things we are selling back as a cloud-based service over time.
So I think we’re in one of those periods where we’re seeing some of the approaches transition from traditional Software as a Service and traditional on-prem models to really pioneering some Information-as-a-Service models.
So you are early in the game for this Information-as-a-Service play?
In the latter phase, we are. And it comes through a couple of different, interesting paths. Certainly through insights like our Kenexa solution. Kenexa has a really relevant corpus of information that it has gathered over many years of behavioral science and survey science. And now, just at the beginning of this year, Craig Hayman [IBM general manager of Industry Cloud Solutions] announced that we are applying big data techniques to what Kenexa delivers. It always delivered Software as a Service, but now we can go much deeper in the insights that we can unleash to the chief human resources officers of enterprises we want to serve.
The same is true around our capabilities in security with what we do to deliver our QRadar solutions in our security portfolio. There are also examples of this in Coremetrics in Smarter Commerce. So Watson isn’t the only one, but I think over the course of the last two years we’ve applied many more big data techniques to these other more vertical solutions to give a big data advantage to the data sets that they’ve always been on top of.
Where would you say IBM stands competitively in the world of big data and analytics?
I think we’re No. 1. I think if you look at it by a revenue metric, if you look at it by who is recognized as the leaders in this space measured by people like Wikibon or if you look at how many favorable mentions are delivered into the marketplace, we would be No. 1 by any of those measures.
Our big data capabilities have been growing at very high rates. These are smaller businesses compared to some of our well-established data management portfolios, but these are new and growing extremely fast. I think 2013 was an important year of establishing real relevance and a strong set of customer outcomes around our big data portfolio. I would say it’s in part because of our ability to embrace Hadoop as we have and to create a great offering around BigInsights, which is our enterprise Hadoop distribution.
We’ve also unleashed ways for our clients to engage with IBM to just learn about this space–through things like the Big Data University, or our Big Data Stampede engagement offering that allows clients to experiment with the technology or techniques and learn about how it can be applied to any of these entry point areas to deliver an enterprise benefit for their organizations.
I would also say that capabilities like our Data Explorer have also ensured that our clients can drive really quick wins. Not everything is established with a Hadoop distribution that takes a long time to build. Things can be done very quickly to allow us to help our clients navigate to where that relevant information is and how to apply it in context to other information for problem solving. These are really important in customer service examples and areas where data is scattered around the enterprise.
One of the other important things, probably the most important thing around big data, is really a space where IBM leads by leaps and bounds and it’s what I call the data-in-motion space or streaming analytics. This is our InfoSphere Streams technology that allows the application of predictive analytics directly in the transaction path. It’s a combination of complex event processing and predictive analytics, which is transformative to things like next-generation telco mediation.
You may have seen it first-hand in some of the examples we’ve talked about in health care over the years like Project Artemis or just in the fall Dr. Tim Buchman came out of Emory Hospital and talked about his work to transform patient monitoring and the intensive care unit’s use of InfoSphere Streams. But it’s relevant in connected car, in car-to-car and car-to-infrastructure connectivity. It’s transformative in oil exploration, helping our clients monitor yields off devices. And the applications will continue to expand as the Internet of things drives more and more volume and increases the importance of being able to take the signal out of the noise of all that the Internet of things will be generating.
Can you go a little deeper with where cognitive computing comes into play and what we can expect to see in the future?
When I talk about the advanced analytics space, we really possess a very robust range of analytics capabilities. We have things that are in what I would call the descriptive or business intelligence aspect, such as visualizations or what’s on the glass. Then you move into decision management, discovery and exploitation, then you move into predictive and prescriptive modeling–understanding what could happen, why is this happening, what action could I take.
All of these things work on tabulating computing models and understanding the path statistics and being able to correlate into quintiles off of structured information. We do a superb job of helping our clients do this. We introduced new offerings like Predictive Maintenance and Quality to help people identify when they need to take prescriptive action as opposed to reactive action. These are all very strong fields, but there is a leap that you go through at some point to say “what is the record of mankind in this field and how do I correlate everything that I’ve learned to identify what’s best?” That’s where cognitive computing comes in. Because the record of man is typically in unstructured textual information; it’s not typically tabular. It does have tabular markers in it, but the nuance around the tabular information is important as is deriving context from that information.
So when you put these things together you’re able to combine the speed and agility of putting predictive and prescriptive analytics in the business process, along with also being able to pull in the business record of everything our human journey has created as relevant information of learning and then being applied to a set of actions. The most compelling examples, and the ones I think are most viscerally understood, are the ones around health care. But that’s not all; there also is wealth management, simple consumer or citizen service, legal applications and also recipes. Watson can help you cook. It can create recipes and things that allow it to understand what’s being described in a recipe that is separate from the structured information. The sequence and the chemistry is defined in the language not necessarily the ingredients and measures. So it’s really when you need to make that jump to figure out what all have I learned and how do I apply that to what would be the best outcome of a situation.
In those sorts of systems where the information is so varied, you have to train the system to understand the patterns and teach itself and create new algorithms from the information it possesses. I think one of the most relevant things that Watson does is it will give you a conversation of additional information it needs to give you a high confidence in its answer. So it will tell you, “based on what I see I know X, Y and Z, but if you answered these other questions I could give you a much more confident answer.” I think that’s something that as productive as our current professions are, they’re also under duress. So having a bit of extra guidance and that kind of assistance is very relevant.