Garbage In, Garbage Out of Control

Opinion: You think it's bad now, the incorrect data that's floating around at insurers, government agencies and data brokers like ChoicePoint? Just wait until they link all those databases up.

I just read Baselinemag.com's excellent story on the rising threat posed by bad data that's stored in myriad databases across the land: registries of motor vehicles, insurance firms, marketing companies and other commercial sources, as well as public records such as court documents and licenses.

My thoughts? Be afraid. Be very afraid.

Know this: Personal identity information is full of inaccuracies, typos and outdated information that can, at the merely annoying end of the spectrum, plague innocent citizens as they get turned down for insurance or have credit applications denied.

Things turn truly Orwellian, however, when you get into the scenarios outlined by Baseline, in which innocent victims of dirty data suffer much more traumatically.

Case in point: Steven Calderon got tossed into jail to rot for a week in January 2002 for felonies he didn't commit, including rape and child molestation.

The problem? Police, and Calderon's employer, Fry's Electronics, believed data aggregated and supplied by ChoicePoint, rather than the evidence in front of their eyes, which would have told them that Calderon didn't match the perp's height, weight, driver's license number or fingerprints.

I wish I could tell you that this was a database problem and that technology vendors are all over the problem of cleansing this data.


Granted, ETL (extraction, transformation and loading) vendors are all about fixing the mess that passes for data in these discrete databases. But whatever achievements we get from that camp will still leave us struggling fiercely against the tide when it comes to the urge to merge these soiled little buckets.

Because aggregation is happening all over the place, linking these databases together regardless of the power and range it gives to the propagation of dirty data.

You get it on the technology front, of course, with admittedly splendid analytics applications coming from companies such as SAP.

When I spoke recently with Roman Bukary, leader of SAP's xApps and Analytic Applications product marketing, he told me that this is what it's all about: going from standard analytic reports to composite analytic applications.

What does that mean? It means business users can initiate and take action on workflow applications inside analytic applications. In other words, with the upcoming merging of technologies such as Microsoft's and SAP's in the Mendocino product, you'll be able to flit around in Office and decide to give somebody a pay raise without having to leave to go fiddle with the SAP HR module.

It means that analytics is filtering down to the masses, just as it has been for a long time and just as it should if it is to mean anything to a business. It means that SAP, for example, is partnering with Macromedia to make analytics so sexy and alluring that pie charts will spin into position in saturated four-color Flash rendition.
