Determine the data privacy protection strategy during the planning phase of a deployment, preferably before moving any data into Hadoop. Doing so reduces the risk of damaging compliance exposure for the company and avoids unpredictable changes to the rollout schedule.
3. Consider Privacy Concerns
Identify what data elements are defined as sensitive within your organization. Consider company privacy policies, pertinent industry regulations and governmental regulations.
4. Check for Exposure
Determine the compliance exposure risk based on the information collected.
5. Be Aware of Sensitive Data
Discover whether sensitive data is already embedded in the environment, and whether it has been or will be assembled in Hadoop.
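One way to approach discovery is to scan a sample of records for values that look sensitive. The sketch below is a minimal illustration, assuming two hypothetical patterns (email addresses and the US Social Security number format). The `SENSITIVE_PATTERNS` table and the `find_sensitive_fields` helper are illustrative names, not part of any Hadoop tool. A real scan would draw its patterns from the organization's own definition of sensitive data.

```python
import re

# Hypothetical patterns; replace with the elements your organization
# defines as sensitive.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def find_sensitive_fields(records):
    """Return {field_name: set of pattern names} for fields whose values match."""
    findings = {}
    for record in records:
        for field, value in record.items():
            for name, pattern in SENSITIVE_PATTERNS.items():
                if pattern.search(str(value)):
                    findings.setdefault(field, set()).add(name)
    return findings

# Made-up sample records, used only to exercise the helper.
sample = [
    {"id": 1, "contact": "joe@example.com", "note": "SSN 123-45-6789 on file"},
    {"id": 2, "contact": "n/a", "note": "no issues"},
]
report = find_sensitive_fields(sample)
```

Pattern matching of this kind only flags candidates. Free-text fields and unusual formats still need manual review.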
6. Real or Desensitized?
Determine whether business analytics needs require access to real data or whether desensitized data can be used. Then choose the appropriate remediation technique: masking or encryption. If in doubt, remember that masking provides the most secure remediation, while encryption provides the most flexibility should future needs evolve.
7. Support the Relevant Techniques
Ensure that the data protection solutions under consideration support both masking and encryption remediation techniques, especially if the goal is to keep both masked and unmasked versions of sensitive data in separate Hadoop directories.
8. Be Consistent Across the Board
Ensure the data protection technology used implements consistent masking across all data files—Joe becomes Dave in all files—to preserve the accuracy of data analysis across all data aggregation dimensions.
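Consistent masking can be sketched with a keyed hash, so that the same input always maps to the same token regardless of which file it appears in. The example below is one possible approach and is not tied to any particular product. The key literal, the `mask_value` helper, and the two sample data sets are assumptions made for illustration.

```python
import hashlib
import hmac

# Assumption: the secret key is managed outside the cluster; this literal
# is only a placeholder.
MASKING_KEY = b"replace-with-a-managed-secret"

def mask_value(value, key=MASKING_KEY):
    """Map a value to a stable pseudonym: equal inputs always give equal tokens."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "user_" + digest[:12]

# Two hypothetical files that both mention Joe.
orders = [{"customer": "Joe", "total": 40}, {"customer": "Ann", "total": 15}]
tickets = [{"customer": "Joe", "issue": "late delivery"}]

masked_orders = [{**row, "customer": mask_value(row["customer"])} for row in orders]
masked_tickets = [{**row, "customer": mask_value(row["customer"])} for row in tickets]
```

Because both files are masked with the same key, Joe's orders and tickets still join on the masked `customer` value. Truncating the digest to 12 hex characters keeps tokens short but raises the chance of collisions on very large data sets, so a longer token may be preferable in practice.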
9. Tailored or Off the Rack?
Determine whether tailored protection is required for specific data sets, and consider dividing Hadoop directories into smaller groups where security can be managed as a unit.
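Managing directories as security units can be pictured as a table that maps directory prefixes to a protection policy, with the most specific prefix taking precedence. The following is a minimal sketch. The paths, the policy names, and the `policy_for` helper are hypothetical and do not reflect any Hadoop configuration format.

```python
# Hypothetical directory groups; each prefix is managed as one security unit.
POLICY_BY_PREFIX = {
    "/data/finance": "encrypt",
    "/data/finance/reports/public": "none",
    "/data/marketing": "mask",
}

def policy_for(path, default="unclassified"):
    """Pick the policy of the longest directory prefix that contains the path."""
    best_prefix, best_policy = "", default
    for prefix, policy in POLICY_BY_PREFIX.items():
        # Match whole path components only, so /data/financeX is not
        # treated as part of /data/finance.
        matches = path == prefix or path.startswith(prefix + "/")
        if matches and len(prefix) > len(best_prefix):
            best_prefix, best_policy = prefix, policy
    return best_policy
```

Paths that fall outside every group return the `unclassified` default, which makes gaps in the directory layout visible instead of silently leaving data unprotected.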
10. Make Sure Everything Fits
Ensure the selected encryption solution interoperates with the company’s access-control technology and that both allow users with different credentials to have the appropriate, selective access to data in the Hadoop cluster.
11. Make Decryption Available
When encryption is required, ensure that the proper technology (Java, Pig, etc.) is deployed to allow seamless decryption and fast access to data.