IBM: 8 Questions With the Company's New Big Data Honcho
So I would say the first order of business is making sure people understand the diversity of capabilities we have in this Watson Foundations package, because it is substantial. It really is the industry’s only complete end-to-end big data package, spanning integration and governance all the way through decision management and predictive analytics. We also need to help people recognize the value of not just the robustness of the platform, but also the simplicity with which you can deliver new analytical insights on top of information.
A great example is Blue Cross Blue Shield of Tennessee, which just announced some new performance results on top of our BLU Acceleration capability. They said they were managing actuarial data in a typical data warehouse database cluster, and those workloads were taking two-plus days to run. They loaded that same information into BLU Acceleration, and those actuarial jobs ran in less than a minute. The powerful thing is not just the speed, but that they didn’t have to model the data, worry about indexing, reorganize the data or define a schema. They essentially load it as if it’s a big spreadsheet and then run those queries against it, and the thing absolutely flies.
What role does IBM Research play in the company’s big data strategy?
They play a huge role. Many of the innovations that we have put into the platform just in the last year are collaborations with our research teams. BLU Acceleration started about four years ago with a project running at Almaden Research, which is an adjunct to our database team. It was looking at new ways to compress data while leaving it actionable: applying in-memory columnar capabilities along with various data-skipping and data-reduction techniques, all together, in a way that improved the efficiency of navigating the information as you applied multiple techniques, one after another after another.
That’s one of the reasons BLU Acceleration is so efficient, both in compression and in navigating large amounts of information. Research still plays a large role in innovating and growing our organic portfolio. We’re also applying innovations to things we have brought into the portfolio through acquisition.
How have acquisitions helped with IBM’s big data strategy?
I would say they have helped in two ways. First off, some of the approaches that we have taken are transformative to certain roles in the organization. If we were talking 10 years ago, the primary role for information management was a DBA or a chief technology officer. Now it’s people like a chief marketing officer or a chief sales officer, someone who is trying to look for bigger trends in the information they possess or the information they can garner from outside sources. So as you make that transition from serving a technically specific field to serving a profession inside a company, you have to have solutions that make big data digestible to those parties, and the analytics have to be digestible for them. It really has to be consumable in a way that looks like the way they think about managing the business process of their profession. Acquisitions have played a key role in helping us round out these solution assets. The ones that came into the business analytics portfolio are around integrated risk, financial reporting and sales performance management, and even employee effectiveness with our Kenexa capabilities. The acquisitions also have allowed us to improve the solution relevance for what I would classify as reference architectures in solving certain problem domains.
The Now Factory was a good example of accelerating a big data approach to next-generation telco mediation. It put in a capability that would take raw call records and convert them to call data records, which could then be loaded into a PureData for Analytics Netezza warehouse, into Streams or into a Hadoop platform. They also had a set of applications that help the network operator, the service representative or the chief sales officer look for potential client churn, fraud or network quality aberrations that they would need to address. So acquisitions have helped us round out the reference architectures and helped us project to the business professionals.
BigInsights is another one where there are several technologies that emanated from research, including the capabilities around BigSheets on top of a Hadoop distribution. There are many great examples where this plays out. In fact, part of the research contribution is that we have been able to apply big data technologies to other workloads like security analytics. So our security portfolio now includes big data techniques for detecting long-term, persistent threats and for correlating events to determine whether seemingly unrelated things are really sophisticated, coordinated attacks.