NEW YORK—Here you have a healthy appetite. Since the Informix deal in 2001, IBM has acquired a whopping 16 companies in its Information On Demand initiative.
But that ain't nuthin'.
It capped off the buying spree on Feb. 16 by announcing a three-year, $1 billion, 25,000-body effort to turn Information On Demand into packaged products.
In other words, it's taking this stuff seriously. eWEEK Database Editor Lisa Vaas caught up with Steve Mills, senior vice president of IBM's Software Group, to find out how exactly IBM plans to tackle the problem of quickly serving up relevant information out of the roomful of noise that's coming from burgeoning data stores.
The problem of analyzing burgeoning business data stores isn't new. What's new here in what IBM is announcing today?
The capability that exists today in terms of technology allows solving problems that couldn't be solved easily a few years ago. The cost of solving problems, the challenge of the quantity of data and [the speed with which it can be accessed, analyzed and delivered], along with the growing speed of hardware, now allows things to be [achieved] that couldn't before.
The ability to do it rapidly and in real time and to provide answers back [quickly] starts to allow [scenarios such as] real-time crime fighting capability. That's a speed and performance issue.
You have squad cars with wireless connections. They're requesting information, having to sift through large numbers of records, often millions of records, sifting through information that gives you probable matches to people who might be associated with a crime. That kind of real-time crime fighting wasn't possible years ago. The bandwidth and the software sophistication wasn't [available] years ago.
In a panel presentation, Scott Vanderhoef, county executive of New York state's Rockland County, also talked about setting up, with IBM, Verify NY, a new county-specific data-mining computer system to weed out fraud in the state's Medicaid program. What are some of the examples of noise (irrelevant data) you have to deal with in that scenario?
[With the Medicaid project], the tough part is trying to figure out if a problem was from a [healthcare] provider or an individual citizen. The problem might seem like they're overclaiming, or double submitting.
Given IBM's investments, both in acquisitions and in today's announcement, the company must see this as a huge opportunity.
We've been investing quite heavily over the last five years in getting access to data in all forms, in all places. How do you bring it together? How do you apply sophisticated algorithms to information?
There's a big business opportunity associated with this. Because technology and know-how have come together to allow us to do things we couldn't do years ago.
The $1 billion investment on software, the 15,000 [IBM experts] on the service side, with another 10,000 [personnel] coming, the reason for that is we in fact see these things coming together. The need is there. We've been building know-how in different domains.
Whether it's government entitlement programs, discovery in various industry segments, healthcare getting a huge amount of attention, common healthcare records, being able to do manipulation for outcome-related treatments so you understand what different therapies will result in, so you can compare data, so you can understand what to prescribe for someone. All this will be information-based. It will be huge amounts of information to be correlated, related, manipulated.
The next wave of innovation is going to come through various aspects of information understanding, manipulation and decision making, in real time.
Servers and On Demand
You're announcing today two new server products. How do those fit into the Information On Demand initiative?
On the information integration side we have our [new WebSphere Information Server] capability, focused on all different types of data, whether it's a different form of numerical data, relational data, file data, classic information, textual information, access to text, being able to read, to manipulate text, to find data buried in text, being able to deal with voice, video.
Dealing with a wide range of data types. Being able to link to all different data types, being able to map them, to extract data from sources, to cleanse them, to scrub out duplicates and redundancies, all comes together around WebSphere server technology, where we've combined core capabilities we've acquired and built over the last five years: Venetica, CrossAccess, Ascential, [etc.]. Acquiring different elements necessary [to the Information On Demand initiative].
The second side of this has to do with manipulation and discovery and deeper reconciliation within context. It's search technology; that's what [WebSphere Content Discovery Server] is all about: being able to search various types of data.
Then in sophisticated analytics, such as the entity analytics capability that deals with human relationships, that understands your record, my record, how they might relate to other records, who's who and are we who we say we are.
In this current world of identity theft, and where human identity tends to move around, it becomes a critical issue to all sorts of applications.
You're talking up a lot of initiatives in a lot of industries, but when are we going to see packaged solutions come out of all this investment?
As we see the problem, this is less about core technology and more about applying it in a business context easily and quickly. Projects being discussed today, I'd characterize them as the beginning of projects. In some cases they're first-of-kind initiatives to understand what is a problem and how do we apply technology to solve that problem.
In financial services, we do sophisticated time-based analysis, complex analytics that deal with derivative understanding. How do you create new financial instruments? How do you model markets? We're beginning to hone down to a set of preverified packages.
[For example], Verify NY: The idea was to understand the problem and what technologies you could apply. Could you deal with data, reconcile data, find patterns in data, make it useful to administrators, and deal with citizens submitting a claim and the provider submitting a claim? That whole process of bringing it all together was done as a first-of-kind project effort.
Now what we're doing is hardening this and making it repeatable. It's not about understanding mathematics; it's about making it repeatable and making it available in the market.
Some see the payback of saying we'll invest with you, IBM, we'll open our doors [to work with IBM to define the problem].
For many businesses, they're looking for us to come forward with something much more prepackaged. In coming years it will be us coming up with technology packaged in something industry-specific, whether it's in real-time crime fighting, in financial data giving a common view of the customer (multichannel), in manufacturing and distribution, product data. How do I deal with suppliers? Do I have redundancy? Is the supply chain clogged with duplicate material?
To find lots of data in different formats, you find similar algorithms needed for sorting the data, but an applied use to deal with individuals in that area … in Medicaid, [for example,] you're dealing with government employees. They don't have to learn algorithms themselves.
We're prepared today to offer to the state of New York [a solution] because its counties pretty much administer Medicaid in the same way. The next step is to try to understand how applicable that is to other states. Trying to get a front-end environment so it can be used in various use scenarios.
It's making the front end adaptive. This is a classic portal-type interaction environment. You can manipulate portlets, move things around, change structure for look and feel. It's very adaptive. We have to go through the process of improving adaptability.
In every area of entitlement programs, you find the same iterative problem taking place. Once you know the domain and the way data is supposed to look and how it's not supposed to look, you can adapt the capability to match the set of requirements.
You'll see us do adaptation, packaging. We'll dedicate people to specific solutions. That's what scale-out is all about.