Xerox Demos Intelligent Redaction

It uses natural-language processing and security techniques to identify offending data.

Xerox is developing new software capabilities that could help automate the redaction process, making it easier, faster and more comprehensive than the keyword-oriented products enterprises currently use to hide information in widely accessible documents.

The software, dubbed Intelligent Redaction, is still in the research phase, with no product availability or price announced, but it was demonstrated Oct. 15 at Xerox Research Center Europe's Technology Day in Grenoble, France. Developers from Palo Alto Research Center, a Xerox subsidiary, participated.

The redaction software combines PARC expertise in natural-language processing, used in place of keyword search, with security techniques to automate much of the process and make the search for sensitive content more comprehensive.



The detection tool uses content analysis to identify portions of the document that match the criteria for redaction, then generates a fresh, redacted document (currently a PDF) with the information either removed or hidden using encryption. The document can then hide redacted information from everyone except authenticated users and can provide "selective redaction," or multiple user views. For example, an appraiser would need to see a property address, but not an applicant's earnings, whereas a mortgage broker requires different information, PARC officials said. The viewing tool could one day be available as a plug-in.
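The idea of multiple user views can be illustrated with a minimal Python sketch. This is not Xerox's or PARC's implementation: the `Span` type, the `ROLE_CAN_SEE` table, the category labels and the sample data are all hypothetical, and the encryption and authentication steps described above are left out entirely. It only shows how one tagged document might yield a different view for each role.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    text: str
    label: str  # hypothetical category; "public" means never redacted


# Hypothetical permissions: which categories each role may see.
ROLE_CAN_SEE = {
    "appraiser": {"address"},
    "broker": {"address", "income"},
}


def render_view(spans, role, mask="[REDACTED]"):
    """Return the document text as the given role would see it."""
    allowed = ROLE_CAN_SEE.get(role, set())  # unknown roles see public text only
    parts = []
    for span in spans:
        if span.label == "public" or span.label in allowed:
            parts.append(span.text)
        else:
            parts.append(mask)
    return "".join(parts)


# Made-up sample document, already tagged by category.
doc = [
    Span("Property: ", "public"),
    Span("12 Elm St", "address"),
    Span("; income: ", "public"),
    Span("$85,000", "income"),
]

print(render_view(doc, "appraiser"))
print(render_view(doc, "broker"))
```

In this sketch the appraiser's view masks the income figure while the broker's view shows both fields. A real system would presumably encrypt the hidden spans rather than simply omit them from the output.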

The redacting user can set redaction rules to identify offending data and specify that redaction be done at the word, phrase, sentence, paragraph or page level; for example, if a paragraph contains sensitive information, the entire paragraph would be redacted. The system will highlight all portions of a document that meet a given redaction rule or setting for the user to review. Collaborative sharing allows users and groups to build a "dictionary" of commonly redacted data, said Jessica Staddon, area manager, Security and Privacy Research at PARC.
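The effect of choosing a redaction level can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the PARC software: it substitutes a plain regular expression for the natural-language analysis the article describes, and the `redact` function, its parameters and the sample text are all hypothetical. Only the word, sentence and paragraph levels are shown.

```python
import re


def redact(text, pattern, level="word", mask="[REDACTED]"):
    """Mask matches of `pattern` at the chosen granularity.

    A regex stands in for real content analysis (an assumption).
    Paragraphs are assumed to be separated by one blank line.
    """
    if level not in ("word", "sentence", "paragraph"):
        raise ValueError(f"unsupported level: {level}")
    rx = re.compile(pattern)
    if level == "word":
        # Mask only the matching text itself.
        return rx.sub(mask, text)
    redacted_paragraphs = []
    for para in text.split("\n\n"):
        if level == "paragraph":
            # One match anywhere hides the whole paragraph.
            redacted_paragraphs.append(mask if rx.search(para) else para)
        else:
            # Naive sentence split on end punctuation followed by whitespace.
            sentences = re.split(r"(?<=[.!?])\s+", para)
            redacted_paragraphs.append(
                " ".join(mask if rx.search(s) else s for s in sentences)
            )
    return "\n\n".join(redacted_paragraphs)


# Made-up sample text and a made-up rule matching an ID-number shape.
sample = "Call Bob. SSN is 123-45-6789. Thanks.\n\nSecond paragraph."
rule = r"\d{3}-\d{2}-\d{4}"

for level in ("word", "sentence", "paragraph"):
    print(level, "->", repr(redact(sample, rule, level)))
```

The same rule hides progressively more text as the level widens, which is why the review step described above matters: a paragraph-level setting can remove far more than the offending phrase.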

The health care, legal and financial services industries, among others, rely heavily on redaction to share information without breaching compliance regulations or leaking intellectual property. Health care users may wish to excise information permanently, whereas those in financial services may want "reversible" redaction and multiple levels or views of redacted content, Staddon pointed out. "For example, an appraiser would need to see a property address, but not an applicant's earnings, while some other party might need to see different information."

