There are a lot of people the U.S. government doesn't want to let into the country.
The problem is, how does the government compare notes with, say, a cruise line, to ensure that suspected terrorists and the like don't get on board? More specifically, how does it share such data without allowing it to fall into the wrong hands?
Vendors such as IBM are exploring the issue, which is a hot-button topic. After all, whenever you talk about collecting sensitive personal information nowadays, you run the risk of exposing data.
Not only does that feed the rash of identity theft we've been suffering under ever more acutely since the rise of the Internet, it also gives rise to scenarios spawned by the government's aforementioned need to share data with, for example, cruise lines, airlines and power plants.
One such scenario was pointed out to me recently by Jeff Jonas, IBM distinguished engineer and chief scientist for IBM Entity Analytics. Jonas and Harriet Pearson, chief privacy officer for IBM, recently took part in IBM's first-ever podcast on information privacy.
Jonas got on the phone with me to discuss what IBM's doing with technology acquisitions and original research to tackle this issue, and in our chat he pointed to an example of how data sharing can go badly awry.
Soon after Sept. 11, when the FBI was sending around its terrorist watch list (a list of people the FBI wanted to talk to, not necessarily terrorists per se), it wound up in a score of unlikely places, as the Wall Street Journal reported in 2002. Those places included car rental companies, casinos, trucking companies, power plant operators and chemical plants.
The data spread far and wide—far beyond the reach of the FBI to keep it current or correct, and most certainly in front of the eyes of a population that had no business having access to the sensitive information.
That's not good. Obviously, the government doesn't want the wrong people to see these lists. If a terrorist sees that his name is on a list, it's useful information, and he can easily change his name.
Then, of course, there are the innocent bystanders whose civil liberties, privacy and identity are violated, including the three Atta brothers in Texas, unconnected to the alleged hijacking leader, exonerated but having to chase down their names to get them off watch list copies posted on Internet sites in at least five countries.
Likewise, cruise lines don't want to send their passenger lists up to Washington every day, either. So how do you compare data sets to come up with matches without actually exposing all your sensitive data?
Even more to the point for enterprises is a related question posed by Pearson during the podcast: How do you share enough information to keep business (or the economy) flowing, while balancing privacy and security of personally identifiable information?
After all, whether you're visiting the doctor, heading to the Department of Motor Vehicles or buying a book on Amazon, you're exposing sensitive data.
And as Jonas said in the podcast, consumers really hate being surprised in this area.
“When the consumer suddenly realizes that their data is flowing in a way that they had not anticipated or it's been revealed in a way that they would never have expected, then with that comes serious consequences,” he said. “… Consequences that affect companies' brands when they're surprising consumers.”
As it is, companies are already sharing a great quantity of data, and the quantity is only going to increase. But at what price? Such data originates in a system of record. From there it gets repackaged, integrated with other data and perhaps shared again. What you wind up with is what Jonas refers to as cascading, or a waterfall of data, where it's next to impossible to keep data tethered and current.
Throw government involvement into the mix, and issues around privacy and the exposure of private data fast turn into civil liberties dilemmas.
Tips for Corporations
Jonas had some good tips for corporations when it comes to forging a strategy for handling personal data, and I'm passing them on here:
- Do an inventory: know what data you are using and managing and where it is going.
- Get senior people to formulate a vision of what kind of company you are, how you want to market and how you want to be respected.
- Determine what laws you have to comply with.
- Keep your strategy to yourselves: don't tell customers or clients until you figure out whether you're in a position to execute quickly to close the gap between your desired state and your current reality.
- When you have a good idea of what you need to do to get to where you want to be, you can share the vision with customers and clients.
- Work to close the gaps.
When it comes to privacy/security policy goals, anonymizing data is a good one to work toward. IBM has been working on new technologies that allow for deep correlation on data while it remains in an encrypted or anonymized form.
The new technique amounts to comparing data after it's been shredded. The result is that two entities can compare two sets of records without the data ever being unencrypted. The product that's come out of this work, DB2 Anonymous Resolution, was released last May.
It won't tell a government that John Smith is coming into port on the QEII, nor will it tell a cruise line that a suspected terrorist by the name of Billy the Kid is onboard.
What it will do is point them to matches that each party will have to look up for itself, to find that Records 1, 2 and 3 correlate with the data sets against which they're comparing their own watch lists.
It does so without exposing phone numbers, credit card numbers, names or anything beyond pointers to which records the given entities have in common.
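To make the idea concrete, here is a minimal sketch in Python of how two parties might find records in common by comparing only one-way digests. It is not IBM's method and says nothing about how DB2 Anonymous Resolution works internally; the keyed hash (HMAC-SHA256), the shared key, the record IDs and the sample people are all assumptions made up for illustration.

```python
import hashlib
import hmac


def normalize(name, birth_date):
    """Canonicalize a record so trivial formatting differences do not block a match."""
    clean_name = " ".join(name.lower().split())
    return f"{clean_name}|{birth_date.strip()}".encode("utf-8")


def anonymize(records, shared_key):
    """Return {digest: local record ID}; names and dates never leave this function."""
    return {
        hmac.new(shared_key, normalize(name, dob), hashlib.sha256).hexdigest(): record_id
        for record_id, (name, dob) in records.items()
    }


def find_matches(digests_a, digests_b):
    """Compare digests only and report pairs of record IDs that the two sets share."""
    common = digests_a.keys() & digests_b.keys()
    return sorted((digests_a[d], digests_b[d]) for d in common)


# Hypothetical data: a government watch list and a cruise line passenger list.
watch_list = {
    "W1": ("Billy the Kid", "1859-11-23"),
    "W2": ("Jane Roe", "1970-01-01"),
}
passengers = {
    "P1": ("John Smith", "1965-05-05"),
    "P2": ("billy  the kid", "1859-11-23"),
}

# Assumption: both parties agreed on this secret key out of band.
key = b"hypothetical-shared-secret"

matches = find_matches(anonymize(watch_list, key), anonymize(passengers, key))
print(matches)  # [('W1', 'P2')]
```

Each side learns only that its own record W1 or P2 has a counterpart, and has to look that record up in its own system to see who it is. A sketch like this only catches exact matches after normalization, and digests of low-entropy data such as names can be guessed by anyone who holds the key, so a real system would need considerably more than this.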
The product is in use now by one U.S. government entity and one foreign government entity.
It sounds like promising technology, and that's good, because in order to guard against horrific events such as 9/11, we need to get governments and corporations swapping information.
Heck, we need to get government able to swap data with itself. As Jonas pointed out in our chat, most people would find it shocking that in one government building, you can walk out into the corridor, head down three doors and find a system that isn't connected to the system where you started out.
“If one group is working on money laundering, and three doors down another group is working on anti-drug efforts, each system has its own set of secrets,” he said.
They don't know when they have three people in common. Of course, they're people. They talk to each other. They can pick up the phone and run down their lists. But it's kind of like Go Fish. It's highly unproductive to read a whole list to you.
In more quotidian terms, businesses need to compare data sets without exposing sensitive information, as well. Think of retailers who are plagued by shoplifting rings. Think of ChoicePoint and the data verification services it provides.
Then think of the record-setting fines imposed on ChoicePoint for its breach of 163,000 consumers' personal financial data: $15 million in fines to the FTC, including the largest civil penalty in FTC history, along with 20 years of independent security audits every other year.
Yes, with consequences like that, we definitely need some new ideas for end-to-end encryption, so kudos to IBM for going there.
Lisa Vaas is eWEEK's news editor in charge of operations. She is also the editor of the Database and Business Intelligence topic center. She has been with eWEEK since 1995, most recently covering enterprise applications and database technology. She can be reached at lisa_vaas[email protected]