It's a food-processing company.
Like lots of companies in these days of consolidation, it worried about melding two merged companies' collections of databases, legacy applications and data sets.
With some 7,000 employees, six processing plants and a tidy collection of farms, getting to a centralized data repository meant gluing together separate ledgers, separate payrolls, separate inventories—really, the whole nine yards, with the added worry of cleansing data and deciding whether business processes would need to change.
The company got there. The entire suite of products in IBM's newly acquired Ascential line is now chugging away, doing data cleansing, transformation, staging and loading beneath a data store that runs between 20GB and 30GB and will hit about 200GB by the time they're through.
A data integration specialist who requested anonymity said the company is doing it the IBM way: federated, with data staying put on Microsoft Corp.'s SQL Server databases. And the company is, yes, getting what enterprises keep saying they want: one version of the truth.
The company is a happy customer. But is its story in fact a reflection of the data-integration nirvana that IBM and other companies are hyping?
In that version, integrating the processes of IT and integrating business processes no longer belong to siloed products.
In their place, we have platforms, like IBM's WebSphere group of products, that promise to do it all: ETL (extraction, transformation and loading), EAI (enterprise application integration), EII (enterprise information integration), data quality and data profiling.
The food-processing company's situation, like that of other Ascential customers, is not, in fact, indicative of this nirvana.
Rather, they are satisfied customers on IBM's IT-process side of things.
Some such customers are looking wistfully toward the promised land of mega-data-integration, but they don't believe that they'll get there anytime soon, and they don't even think that we're all using the same definition of the things we need to get there.
“When I say metadata, and this is perhaps specific to the [data] warehousing construct, it's not just data lineage,” said Danny Siegel, senior manager of Finance Business Technology at the pharmaceutical giant Pfizer Inc., which is another Ascential customer.
When Siegel refers to “data lineage,” he speaks about the IT-process side of data integration, where metadata describes things such as where data originally resided, who touched it when, and other things relating to its journey from its database source toward its end destination in the data warehouse.
What he does mean when he says “metadata” is the business side of things: the business logic that explains the end-stage data in plain English, in plain tell-it-to-the-executives speak.
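To make Siegel's distinction concrete, here is a minimal sketch of a single warehouse metric carrying both kinds of metadata. It is an illustration only, not anything drawn from Ascential's or IBM's tools: the class names, field names and the sample "net_sales" metric are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LineageStep:
    """Technical metadata: one hop in a value's journey to the warehouse."""
    source: str      # where the data resided before this step (hypothetical names)
    operation: str   # what was done to it: extract, cleanse, load, ...
    actor: str       # the job or person that touched it
    timestamp: str   # when it happened, as an ISO 8601 string


@dataclass
class MetricMetadata:
    """A warehouse metric described from both the IT side and the business side."""
    name: str
    lineage: List[LineageStep] = field(default_factory=list)
    business_definition: str = ""  # the plain-English explanation

    def for_analyst(self) -> str:
        # The data-lineage view: a chain of operations and sources.
        return " -> ".join(f"{s.operation}({s.source})" for s in self.lineage)

    def for_executive(self) -> str:
        # The business view: what the number means, with no database-speak.
        return f"{self.name}: {self.business_definition}"


# Hypothetical example metric, invented for illustration.
net_sales = MetricMetadata(
    name="net_sales",
    lineage=[
        LineageStep("erp.orders", "extract", "nightly_job", "2005-06-01T02:00:00"),
        LineageStep("staging.orders", "cleanse", "nightly_job", "2005-06-01T02:30:00"),
    ],
    business_definition="Gross sales minus returns and discounts, per quarter.",
)
```

Calling `net_sales.for_analyst()` yields the lineage chain that only an analyst would want to read, while `net_sales.for_executive()` yields the sentence an executive could act on. The gap Siegel describes is that tools tend to populate the first field automatically and leave the second to people.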
“When we say metadata, we're describing business process and logic,” Siegel said. “That's what the problem is, for us: You have a metric at the end of the road in a warehouse. It gets there somehow. There's a technical aspect to it. There's a business aspect to it. The technology can't be communicated to anybody but an analyst. How do you make it intelligible to … executives?”
IBM Is on Its Way to Becoming the Market Leader
Following its acquisition of Ascential, IBM is well on its way to becoming the leader in the data integration market. “[The acquisition] underscores a broader market trend that eschews siloed data management activities (e.g., ETL, EAI, EII, data quality, data profiling) in favor of an integrated information management strategy,” writes Mark Beyer, an analyst with Gartner Inc. “Once this acquisition is fully integrated … IBM will emerge as a leading provider of such a platform.”
That's because Ascential is further along the road toward melding technical metadata with plain-English, business-process metadata.
But, customers and analysts say, even Ascential hasn't yet gotten over the gobbledygook stage, and nobody knows quite how far its next-generation tool set, “Hawk,” will get, either.
“In my experience, with the metadata tools Ascential has, that's great for the guy who has to maintain the processes,” Siegel said. “For business users, it's gobbledygook. It's database-speak. It's talking about an ETL process. You're not talking about a business process.
“There's a distinction between what [IBM] calls metadata and what they call metadata,” he said. “I'm not saying the latter isn't valuable. I'm saying their suite of tools do not address that. I don't know anybody who does now.”
Will IBM's “Hawk” portfolio address the disconnect? Siegel thinks not.
“It sounds good,” he said. “But come in to an enterprise as big as Pfizer. It's not going to happen. Not quickly. I suppose it's possible. With the right amount of time and money, anything can be done.”
Much of the problem, to Siegel's mind, lies in the need to apply metadata logic on the fly. “Some [of the metadata] is esoteric and business-specific and hard to apply,” he said. “That said, I think [Ascential's DataStage] toolset is excellent at doing those kinds of things.”
Still, DataStage isn't addressing events in real time, Siegel said. “You can make DataStage real time, but that's only great for fairly small things, to move things around. It's great for automation, for usability, but you couldn't call it quote-unquote at run time. Somebody goes into a Web page and wants to summarize a gig of data. You're not going to do that at runtime. Nobody's going to find their way around the slowness of a drive.”
There will always be a place where you have to do some caching so somebody can find the data in a reasonable amount of time, Siegel said. Whether it's an internal table or what have you, there will always be a middle piece with large data sets.
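The “middle piece” Siegel describes can be sketched in a few lines: scan the large data set once in a batch step, keep the totals in a small summary table, and answer Web-page requests from that table rather than from the raw rows. This is a simplified illustration under stated assumptions, not how DataStage works; the row layout, the `build_summary` function and the `SummaryCache` class are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical fact-table row: (region, amount).
Row = Tuple[str, float]


def build_summary(rows: Iterable[Row]) -> Dict[str, float]:
    """Batch step: scan the large data set once and total it by region."""
    totals: Dict[str, float] = defaultdict(float)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)


class SummaryCache:
    """The middle piece: a small precomputed table that serves requests."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self._summary = build_summary(rows)

    def total_for(self, region: str) -> float:
        # Request time: a dictionary lookup, with no scan of the raw data.
        return self._summary.get(region, 0.0)

    def refresh(self, rows: Iterable[Row]) -> None:
        # Rebuild on a schedule (for example, after the nightly load).
        self._summary = build_summary(rows)


cache = SummaryCache([("east", 10.0), ("west", 5.0), ("east", 2.5)])
```

The trade-off is the usual one for caching: lookups such as `cache.total_for("east")` are fast, but the answer is only as fresh as the last call to `refresh`.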
That said, Ascential's DataStage has been a pioneer in making things that were once batch-driven become service-oriented, Siegel said.
Hence, he's planning to do what many data-integration aficionados are planning to do: cross his fingers about IBM staying on track to integrate Hawk, remind IBM that Ascential overinvested in development, vote for IBM keeping development staff and management in place, and hope for the best.