    Garbage In, Garbage Out of Control

    Written by

    Lisa Vaas
    Published June 15, 2005

      I just read Baselinemag.com's excellent story on the rising threat posed by bad data that's stored in myriad databases across the land: registries of motor vehicles, insurance firms, marketing companies and other commercial sources, as well as public records such as court documents and licenses.

      My thoughts? Be afraid. Be very afraid.

      Know this: Personal identity information is full of inaccuracies, typos and outdated information that can, at the merely annoying end of the spectrum, plague innocent citizens as they get turned down for insurance or have credit applications denied.

      Things turn truly Orwellian, however, when you get into scenarios outlined by Baselinemag.com, in which innocent victims of dirty data suffer much more traumatically.

      Case in point: Steven Calderon got tossed into jail to rot for a week in January 2002 for felonies he didn't commit, including rape and child molestation.

      The problem? Police, and Calderon's employer, Fry's Electronics, believed data aggregated and supplied by ChoicePoint, rather than the evidence in front of their eyes, which would have told them that Calderon didn't match the perp's height, weight, driver's license number or fingerprints.
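
      To make that point concrete, here is a minimal sketch of the cross-check that would have caught the mismatch: compare the aggregated record against what is observed in person, field by field, instead of trusting a name match. The record type, field names, tolerance and sample values are all hypothetical, invented for illustration; nothing here reflects how ChoicePoint or any police system actually works.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PersonRecord:
    """Hypothetical identity record; the fields are illustrative only."""
    name: str
    height_cm: int
    weight_kg: int
    license_number: str


def mismatched_fields(aggregated: PersonRecord, observed: PersonRecord,
                      tolerance: int = 5) -> List[str]:
    """Return the names of the fields on which the two records disagree."""
    mismatches = []
    # Physical measurements drift a little, so allow a small tolerance.
    if abs(aggregated.height_cm - observed.height_cm) > tolerance:
        mismatches.append("height_cm")
    if abs(aggregated.weight_kg - observed.weight_kg) > tolerance:
        mismatches.append("weight_kg")
    # Identifiers must match exactly.
    if aggregated.license_number != observed.license_number:
        mismatches.append("license_number")
    return mismatches


def is_plausible_match(aggregated: PersonRecord, observed: PersonRecord) -> bool:
    """A shared name is not enough: every other field must agree too."""
    same_name = aggregated.name == observed.name
    return same_name and not mismatched_fields(aggregated, observed)


# Made-up values: same name, but nothing else lines up.
on_file = PersonRecord("J. Doe", 185, 95, "A1234567")
in_person = PersonRecord("J. Doe", 170, 70, "B7654321")
print(mismatched_fields(on_file, in_person))
print(is_plausible_match(on_file, in_person))
```

      The design choice is simply to report which fields disagree rather than return a bare yes or no, so that whoever acts on the record sees the conflicting evidence.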

      I wish I could tell you that this was a database problem and that technology vendors are all over the problem of cleansing this data.

      Ha!

      Granted, ETL (extraction, transformation and loading) vendors are all about fixing the mess that passes for data in these discrete databases. But whatever achievements we get from that camp will still leave us struggling fiercely against the tide when it comes to the urge to merge these soiled little buckets.

      Because aggregation is happening all over the place, linking these databases together regardless of the power and range it gives to the propagation of dirty data.

      You get it on the technology front, of course, with admittedly splendid analytics applications coming from companies such as SAP.

      When I spoke recently with Roman Bukary, leader of SAP's xApps and Analytic Applications product marketing, he told me that this is what it's all about: going from standard analytic reports to composite analytic applications.

      What does that mean? It means business users can initiate and take action on workflow applications inside analytic applications. In other words, with the upcoming merging of technologies such as Microsoft's and SAP's in the Mendocino product, you'll be able to be flitting around in Office and decide to give somebody a pay raise without having to leave to go fiddle with the SAP HR module.

      It means that analytics is filtering down to the masses, just as it has been for a long time and just as it should to mean anything to a business. It means that SAP, for example, is partnering with Macromedia to make analytics so sexy and alluring that pie charts will spin into position in saturated four-color Flash rendition.

      It's Not the Amount of Information

      This is all great. I love the applications. When I saw Mendocino demonstrated at Sapphire, I wanted it. For what, who cares? It just looks like so much fun to play with all that raw enterprise power, all from the comfort of Office.

      But then I had a conversation with a computer scientist whom I met last week at IDC's forum on business intelligence, and I sobered up.

      Dietrich Falkenthal is interested in visualization technology, of which many companies besides SAP gave gorgeous demonstrations at IDC's gig.

      The key, Falkenthal said, is not the amount of information that can be collected from sensors, user inputs or other data sources, but how to make it useful, especially in tactical environments. Visualization technology is important to medical services, law enforcement and the military, for example, because they have a limited time to make decisions.

      But what can't be done with current technology is to come up with automated tools to intelligently handle complex real-time data. You can present data in gorgeous spinning pie charts, but if it's the wrong data presented, the wrong conclusions can be reached, Falkenthal pointed out.

      “Tools are needed to process a lot of data and take some burden off users. Essentially, to do a smart push of important data that the user doesn't yet know he or she needs. For the most part, it's still garbage in, garbage out, but visualization tools may help.”

      This is not data cleansing, where records are combed through to eliminate name-spelling variants, for example. This is about incorrect data correlations: something you most certainly don't want police, passport agents, medical professionals or anybody in the military to be acting on.
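
      For contrast, here is a rough sketch of the cleansing half of that distinction: collapsing name-spelling variants so they can be compared. It uses Python's standard `unicodedata` and `difflib` modules; the function names, the 0.85 threshold and the sample names are assumptions made for illustration, not anything a vendor mentioned here ships.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Strip accents, punctuation, case and extra whitespace from a name."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop combining marks so accented letters compare equal to plain ones.
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace anything that is not a letter with a space.
    letters = "".join(ch if ch.isalpha() else " " for ch in unaccented)
    return " ".join(letters.lower().split())


def likely_same_name(a: str, b: str, threshold: float = 0.85) -> bool:
    """Treat two names as spelling variants if they are similar enough.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    ratio = SequenceMatcher(None, normalize_name(a), normalize_name(b)).ratio()
    return ratio >= threshold


print(normalize_name("  Smith,   JON "))
print(likely_same_name("Jon Smith", "John Smith"))
print(likely_same_name("Jon Smith", "Ann Jones"))
```

      Note what this does not do: it says two names look alike, not that two records describe the same person. Treating a fuzzy name match as proof of identity is exactly the kind of incorrect correlation described above.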

      Research in this area is new, but Falkenthal pointed me to universities such as MIT's Engineering Systems Division or to companies and research labs that are thinking about these issues.

      Meanwhile, though, technology is forging ahead, synching up data sources. To compound the problem, nutso legislation is being passed.

      The Real ID Act will usher in the nation's first national ID system, with little regard for the government's ability to deploy the technology in ways that would prevent citizens from being preyed on by identity thieves and with no regard for the fact that it relies on data from sources, such as state RMVs, that are increasingly targets for identity theft. And which, of course, contain typos, outdated information, etc.

      The bill dictates that all states collect personal information from citizens before allowing them to obtain a driver's license, including, at minimum, name, date of birth, gender, driver's license or identification card number, digital photograph, address, and signature.

      Collection of this particular information is not new. Linkage of states' databases is. The bill specifies that states link what are at present discrete databases, creating, in effect, one nationwide database with personal information pertaining to all citizens.

      Don't Trust Companies to Protect Your Data

      ChoicePoint doesn't take responsibility for aggregating and propagating filthy data. ChoicePoint says it's the data sources (RMVs, courts, etc.) that are responsible for the data. If it's from the government, it must be good stuff, the thinking goes.

      Do you trust the government to have the right information on you?

      Do you trust the government to protect your data from thieves?

      If you answered yes to either question, you're naive.

      Back when the Real ID Act was on the brink of passing, I chatted with Marc Rotenberg, executive director of the Electronic Privacy Information Center in Washington. He pointed out that the problem is not that database information can't be encrypted; it's that the government has proven untrustworthy in doing so.

      Look at the metric of FISMA, the Federal Information Security Management Act. It's legislation that mandates that government agencies be graded on their ability to protect data. The Department of Homeland Security has gotten four Fs in a row. If they're not securing data, do we really want to trust state RMVs?

      Your information is already in these databases. Do you want it in one or two databases, or 50? Do you want every potentially crummy, unencrypted piece of data to be linked to every other potentially crummy, unencrypted piece of data?

      I know I'm mixing the topics: we've got dirty data, and we've got unencrypted, unprotected data. But both problems wind up with the same result: people getting thrown into jail for other people's crimes. People getting stopped at the airport because they have Arabic names that look like terrorists' names. Innocent people being unfairly persecuted.

      What's the answer? I wouldn't advise looking to technology to solve the problem. I would go back to the wise stance of paranoia and being a fierce watchdog over who gets your information and what they plan to do with it.

      My favorite spot for how-tos in limiting the spread of personal information is Junkbusters. There, you'll be told how to get companies to stop renting or sharing your name; how to get off lists sold by companies that profit off your information (that means you'll be corresponding with, oh joy, ChoicePoint et al.); how to browse the Web without leaving a trail of personal information behind you in the form of cookies; and more.

      Is it easy? Oh, no. Believe me, I've been through Junkbusters' 12-step program for recovering personal data leakers. One little change in address, and presto! You're back on the list of data leakage.

      But it is satisfying, deeply satisfying, to get your personal information as expunged as possible from as many of these dirty data buckets as possible, and I highly recommend it. I really like the idea that I hamper the profits of those who broker my personal information with no remuneration to myself, and who do so with casual disregard for propagating garbage.

      Lisa Vaas is Ziff Davis Internet's news editor in charge of operations. She is also the editor of eWEEK.com's Database and Business Intelligence topic center. She has been with eWEEK and eWEEK.com since 1995, most recently covering enterprise applications and database technology. She can be reached at lisa_vaas@ziffdavis.com.

      Lisa Vaas is News Editor/Operations for eWEEK.com and also serves as editor of the Database topic center. She has focused on customer relationship management technology, IT salaries and careers, effects of the H1-B visa on the technology workforce, wireless technology, security, and, most recently, databases and the technologies that touch upon them. Her articles have appeared in eWEEK's print edition, on eWEEK.com, and in the startup IT magazine PC Connection.
