Measure No. 2: Document what data is of interest
Documenting the key words, phrases and patterns that are of interest to a business requires both investigative work and an understanding of what's driving the need to find data. The natural starting point is to work with data owners and security and risk managers to identify and document what data is of interest to an organization. In many organizations, regulatory compliance is a driver. Regulations often specify which data is sensitive and what measures are required to protect it. Intellectual property (IP), customer data and employee information are other common types of information requiring special attention.
Establishing levels of sensitivity based on the type of content your organization needs to manage and protect adds structure to this task. Industry practice suggests a good rule of thumb: limit the hierarchy to four levels. More than that becomes difficult and impractical to manage. A reasonable starting set is Secret, Confidential, Private and Public.
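As an illustration, the documented patterns and the four-level hierarchy can be captured together in a small catalog. The sketch below is a minimal example, not a prescribed format: the rule names, the regular expressions, the "Project Bluebird" code name, the EMP- identifier format and the level assigned to each rule are all hypothetical placeholders for what data owners and risk managers would actually supply.

```python
import re
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    """The four example levels from the text; a higher value is more sensitive."""
    PUBLIC = 1
    PRIVATE = 2
    CONFIDENTIAL = 3
    SECRET = 4


@dataclass(frozen=True)
class Rule:
    name: str             # what the rule looks for
    pattern: re.Pattern   # key word, phrase or pattern of interest
    level: Sensitivity    # level assigned when the rule matches
    driver: str           # why the data matters to the business


# Hypothetical entries for illustration only.
CATALOG = [
    Rule("payment card number", re.compile(r"\b(?:\d[ -]?){13,16}\b"),
         Sensitivity.CONFIDENTIAL, "regulatory compliance"),
    Rule("project code name", re.compile(r"\bproject\s+bluebird\b", re.IGNORECASE),
         Sensitivity.SECRET, "intellectual property"),
    Rule("employee ID", re.compile(r"\bEMP-\d{6}\b"),
         Sensitivity.PRIVATE, "employee information"),
]


def classify(text: str) -> Sensitivity:
    """Return the highest level among matching rules, or PUBLIC if none match."""
    levels = [rule.level for rule in CATALOG if rule.pattern.search(text)]
    return max(levels, default=Sensitivity.PUBLIC)
```

Recording the business driver next to each pattern keeps the reason for a rule visible when the catalog is reviewed later. Treating unmatched content as Public is a simplification made for this sketch; an organization may prefer a more cautious default.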
Measure No. 3: Focus and accelerate with metadata
Metadata (data about your data, such as file sizes, types and locations) should be used to focus and accelerate your data classification projects. Metadata adds another dimension to the search process, effectively providing a shortlist of where to look and what to expect.
For example, if you want to identify credit card data that is at risk, you can use permissions metadata to find files that are accessible to too many people, and then look inside only those files for credit card data. Any sensitive data found in overly accessible files has a clear remediation path: fix the access permissions so that access follows least privilege (that is, business need-to-know). The following are examples of metadata and how it can be used to focus and accelerate data classification:
1. Data access permissions
A careful analysis of file, folder and site permissions will tell organizations who can access their sensitive data and which data is overly accessible.
2. Data access activity
Data access activity provides important information, such as which folders are used most frequently and which are not used at all. It also indicates which data was recently added or modified. That intelligence is especially useful for reducing the time spent searching: after the initial classification scan, subsequent scans can be restricted to the data that still needs to be classified (that is, data added or modified since the last scan). Activity records also show which users and groups have actually been accessing the sensitive data.
3. Data ownership
Data ownership information helps limit searches to data owned by specific people. So, if organizations are working with individuals to help them get control over their sensitive data, this piece of metadata will narrow sensitive data searches to just the relevant data.
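The three kinds of metadata above can be combined into a shortlist before any file content is read. The following is a minimal sketch under stated assumptions: the FileMeta record, its field names and the max_readers threshold are hypothetical, and in practice the values would come from whatever permissions and audit tooling an organization uses. The Luhn checksum used to weed out digit runs that cannot be card numbers is a common technique added here for illustration; it is not described in the text above.

```python
import re
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FileMeta:
    path: str
    owner: str           # ownership metadata
    readers: int         # number of accounts with read access (permissions metadata)
    modified: datetime   # last modification time (activity metadata)


def shortlist(files, max_readers, last_scan=None, owner=None):
    """Keep only the files worth opening, judged by metadata alone."""
    picked = []
    for f in files:
        if f.readers <= max_readers:
            continue  # not overly accessible
        if last_scan is not None and f.modified <= last_scan:
            continue  # unchanged since the previous scan
        if owner is not None and f.owner != owner:
            continue  # outside the requested ownership scope
        picked.append(f)
    return picked


# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_ok(number: str) -> bool:
    """Luhn checksum: double every second digit counting from the right."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def has_card_number(text: str) -> bool:
    """True if the text contains a digit run that passes the Luhn check."""
    return any(luhn_ok(m.group()) for m in CARD_RE.finditer(text))
```

A scan would call shortlist first and run has_card_number only on the contents of the files it returns. Any hit is then, by construction, sensitive data in an overly accessible file, which is the case with the clear remediation path described earlier.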