How to Accelerate and Streamline Data Classification Projects
Organizations
can quickly become overwhelmed with managing and protecting all of the
unstructured data in their possession. Unstructured data includes all
of the documents, spreadsheets, presentations and more that are stored
on shared file servers, network-attached storage (NAS) devices,
SharePoint sites, etc. It accounts for roughly 80 percent of business
data and grows by more than 50 percent per year, making it hard to keep
pace with this key business resource.
To deal with unstructured data,
many organizations initiate data classification projects in the hopes
of identifying their most sensitive data, fixing any problems and
implementing proper controls. Regrettably, both business and technical
challenges prevent data classification deployments from reaching their
full potential.
From a business perspective, a lack
of actionable results is the primary challenge. Data classification
solutions produce a list of files with sensitive content, but what the
files mean to the business and what to do with them are not obvious
from that list. On the technical side, data classification solutions
scan every file looking for relevant content and are, consequently,
slow to deliver results. On subsequent searches, these solutions must
look at all files again, making it virtually impossible to keep pace
with data growth and change.
The following are five measures
that organizations can take to accelerate the pace of producing
actionable data classification results:
Measure No. 1: Determine who owns the data
Data owners are critical to
managing unstructured data. They understand the importance
of data assets to the business and are, therefore, integral to the
process of classifying this data. They can help determine who should
and should not have access and what type of protections the data should
have, and they can point out when the data is no longer relevant to the
business. When it comes to sensitive data, owners can help determine
whether data is at risk and what remediation steps are required.
Identifying owners is not easy,
though. The locations of data and the names of data folders,
directories or sites often provide little indication of true data
ownership, and file system metadata about data ownership goes stale
quickly. The most common methods for identifying data owners, phone
calls and e-mail messages, are neither efficient nor effective.
The best way to track data owners
is to have an automated, repeatable process in place. One of the most
effective ways to determine data owners is to track who is accessing
the data. Over time, the top users of data will become obvious, and
these users will be able to tell the organization who owns the data.
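
As a rough illustration of tracking access to surface likely owners, the sketch below counts accesses per user for each folder and returns the most active users. It assumes access events are already available as (user, folder) pairs from some audit log; the function name `likely_owners`, the sample users and the folder paths are hypothetical and not taken from any particular product.

```python
from collections import Counter, defaultdict


def likely_owners(access_events, top_n=3):
    """Return the top_n most active users for each folder.

    access_events: iterable of (user, folder) pairs, one per file access.
    The result is a list of people to ask about ownership, not a
    definitive list of owners.
    """
    counts = defaultdict(Counter)
    for user, folder in access_events:
        counts[folder][user] += 1
    return {
        folder: [user for user, _ in per_user.most_common(top_n)]
        for folder, per_user in counts.items()
    }


# Hypothetical sample events, standing in for a real audit log.
events = [
    ("alice", "/finance/budgets"),
    ("alice", "/finance/budgets"),
    ("alice", "/finance/budgets"),
    ("bob", "/finance/budgets"),
    ("carol", "/hr/reviews"),
    ("carol", "/hr/reviews"),
    ("bob", "/hr/reviews"),
]

candidates = likely_owners(events, top_n=1)
print(candidates)
```

The most active users are only candidates: in line with the text above, they are the people to contact to confirm who actually owns the data.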