In only the past few years, a whole new subcategory of IT has grown up around storing, analyzing, monitoring, protecting and reporting on so-called big data stores. These are large caches of data that include archived conventional business data along with new streams of social and machine data that previously weren’t taken into account.
We are now officially adding another new service category to that list: data governance for big data. After all, big data is of no use to anyone unless it is under tight control, especially in regulated industries. This new sector is being claimed early on by Fremont, Calif.-based Dataguise, which specializes in big data-centric control and access.
eWEEK has learned exclusively that on June 23, Dataguise will launch at the Gartner Security and Risk Management Summit in National Harbor, Md., what it describes as the industry’s first specifically architected solution for big data governance.
Discovers Data from Silos, Automates Governance
The Dataguise Data Governance suite enables enterprises to declare policies, discover sensitive data, redact required terms, view and track entitlements, and audit access to sensitive data, and then to automate all of these tasks across transactional databases, data warehouses, file shares, Apache Hadoop and other big data stores.
For companies with large, spread-out IT systems (remote offices, etc.), trying to get everything under one roof to do effective analysis has been a headache and a half for a long time.
“What Dataguise can do is marry both the structured (database) and unstructured (file system) data in an IT system, analyze it better and figure out new business information about what people are buying, when they are buying, where and why fraud is happening and so on,” Dataguise co-founder and CEO Manmeet Singh told eWEEK.
Enterprises need to use everything at once, including current database data, new unstructured data (email, the Internet, social networks, images, documents) and archived data, in order to get an accurate picture of their business, Singh said.
“They need to kill their silos and bring in everything they can analyze,” Singh said. “They also need to protect their sensitive information. This we do with our masking technique.”
Automatically Redact Required Terms Within Documents
Data owners can use Dataguise (a play on “data disguise”) to protect information such as Social Security numbers, email addresses, credit card numbers, postal addresses and other sensitive details by having the software “black out” certain terms in files as needed, similar to what government agencies and law firms do when they redact classified words from documents.
This is Dataguise’s secret sauce. “When the data comes into a system, the policies have already been defined at a higher level,” Singh said. “Using these set policies, we can, in flight, do the masking or encrypting of the data as required. We go into each file–unlike any other vendor–do all the discovery of the 1 percent of the data that needs to be masked, and make available the 99 percent that a company can use to do its business.”
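To make the “black out” idea concrete, here is a minimal Python sketch of pattern-based redaction. It is not Dataguise’s implementation, which the company has not detailed here; the `PATTERNS` table and the `redact` function are hypothetical names, and the regular expressions are simplified assumptions that a production detector would need to extend and validate.

```python
import re

# Hypothetical, simplified patterns for illustration only. Real detectors
# need validation (for example, Luhn checks on card numbers) and far
# broader coverage of formats.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "CARD": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace each sensitive match with a labeled placeholder.

    Everything that does not match a pattern passes through unchanged,
    so the bulk of the document stays usable.
    """
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text


sample = "Contact jane.doe@example.com, SSN 123-45-6789, card 4111 1111 1111 1111."
print(redact(sample))
```

Only the matched terms are replaced, which mirrors the point Singh makes about masking a small share of the data while leaving the rest available.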
Dataguise supports a range of platforms that include Oracle, IBM DB2, SQL Server, Teradata, Cloudera, Hortonworks, MapR, and Pivotal HD. The suite of functions works with DgSecure, Dataguise’s flagship platform for data privacy, protection and security for sensitive data across the enterprise.
Dataguise for Data Governance features include:
—Policy Quickstart: Select predefined policies for PCI, PII and HIPAA; or click-to-create custom policies with no coding or scripting;
—Sensitive Data Discovery: Automatically find and track sensitive data in the enterprise, whether at rest or in motion, in structured or unstructured format, across heterogeneous data platforms;
—Entitlements: View and track entitlements down to the user and data element level;
—Auditing: View automated reports and dashboards to track who accessed what sensitive data.
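As a rough illustration of how policy declaration, discovery and auditing can fit together, the Python sketch below declares a policy as a set of detectors, scans documents for matches and then filters an access log. All of the names (`Policy`, `PII_POLICY`, `discover`, `audit`) and the sample data are hypothetical and do not reflect the product’s actual interface.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Policy:
    name: str        # hypothetical policy label, e.g. "PII"
    detectors: dict  # element name -> compiled regular expression


# Assumed, simplified detectors for two kinds of sensitive elements.
PII_POLICY = Policy(
    name="PII",
    detectors={
        "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
        "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    },
)


def discover(policy, documents):
    """Return {document name: Counter of sensitive element types found}."""
    findings = {}
    for doc_name, text in documents.items():
        counts = Counter()
        for element, pattern in policy.detectors.items():
            matches = len(pattern.findall(text))
            if matches:
                counts[element] = matches
        if counts:
            findings[doc_name] = counts
    return findings


def audit(findings, access_log):
    """List (user, document) pairs where the document holds sensitive data."""
    return [(user, doc) for user, doc in access_log if doc in findings]


documents = {
    "hr/notes.txt": "Employee SSN 123-45-6789, reach at a.b@example.org",
    "readme.txt": "Nothing sensitive here.",
}
access_log = [("alice", "hr/notes.txt"), ("bob", "readme.txt")]

findings = discover(PII_POLICY, documents)
print(findings)
print(audit(findings, access_log))
```

Keeping the policy as plain data, separate from the scanning code, is one way to get the “no coding or scripting” effect the feature list describes: new policies become new entries rather than new programs.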
Dataguise, founded in 2007, has most of its customers in the financial services, health care, retail and government sectors. Its products are designed to reduce the risk of data breaches and to help customers remain compliant with regulations and standards that protect personally identifiable information (PII) and related sensitive data, such as the Health Insurance Portability and Accountability Act (HIPAA), the Health Information Technology for Economic and Clinical Health (HITECH) Act, and the Payment Card Industry Data Security Standard (PCI DSS).