    How to Accelerate and Streamline Data Classification Projects

    Written by

    Raphael Reich
    Published January 18, 2010

      Organizations can quickly become overwhelmed with managing and protecting all of the unstructured data in their possession. Unstructured data includes all of the documents, spreadsheets, presentations and more that are stored on shared file servers, network-attached storage (NAS) devices, SharePoint sites, etc. It accounts for roughly 80 percent of business data. In addition to being the majority of business data, unstructured data grows in excess of 50 percent per year, making it hard to keep pace with this key business resource.

      To deal with unstructured data, many organizations initiate data classification projects in the hopes of identifying their most sensitive data, fixing any problems and implementing proper controls. Regrettably, there are both business and technical challenges that prevent data classification deployments from reaching their full potential.

      From a business perspective, a lack of actionable results is the primary challenge. Data classification solutions produce a list of files with sensitive content, but the question of what the files mean to the business and what to do with them is not inherently obvious. On the technical side, the issue is that data classification solutions scan every file looking for relevant content and are, consequently, slow to deliver results. And on subsequent searches, these solutions must look at all files again, making it virtually impossible to keep pace with data growth and change.

      The following are five measures that organizations can take to accelerate the pace of producing actionable data classification results:

      Measure No. 1: Determine who owns the data

      Data owners are critical to managing unstructured data. They understand the importance of data assets to the business and are, therefore, integral to the process of classifying this data. They can help determine who should and should not have access and what type of protections the data should have, and they can point out when the data is no longer relevant to the business. When it comes to sensitive data, owners can help determine whether data is at risk and what remediation steps are required.

      Identifying owners is not easy, though. The locations of data and the names of data folders, directories or sites often provide little indication of true data ownership, and file system metadata about data ownership goes stale quickly. The most common methods for identifying data owners, phone calls and e-mail messages, are neither efficient nor effective.

      The best way to track data owners is to have an automated, repeatable process in place. One of the most effective ways to determine data owners is to track who is accessing the data. Over time, the top users of the data will become obvious, and these users will be able to tell organizations who owns the data.
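As a rough illustration of tracking access to surface likely owners, the sketch below counts access events per folder and lists the most active users. The event format (pairs of user and folder), the function name `top_users_by_folder` and the sample data are assumptions made for this example; a real deployment would read events from whatever auditing facility the file server provides.

```python
from collections import Counter, defaultdict


def top_users_by_folder(events, n=3):
    """Return the n most active users for each folder.

    events: iterable of (user, folder) pairs taken from a
    hypothetical access log; the format is an assumption.
    """
    counts = defaultdict(Counter)
    for user, folder in events:
        counts[folder][user] += 1
    return {
        folder: [user for user, _ in counter.most_common(n)]
        for folder, counter in counts.items()
    }


# Made-up sample events for illustration only.
events = [
    ("alice", "/finance/budgets"),
    ("alice", "/finance/budgets"),
    ("bob", "/finance/budgets"),
    ("carol", "/hr/reviews"),
    ("alice", "/finance/budgets"),
    ("carol", "/hr/reviews"),
    ("bob", "/hr/reviews"),
]

likely_contacts = top_users_by_folder(events, n=1)
print(likely_contacts)
```

The top users are candidates to ask about ownership, not confirmed owners; the final determination still comes from a conversation with them.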

      Measure No. 2: Document what data is of interest

      Documenting the key words, phrases and patterns that are of interest to a business requires both investigative work and an understanding of what’s driving the need to find data. The natural starting point is to work with data owners and security and risk managers to identify and document what data is of interest to an organization. In many organizations, regulatory compliance is a driver. Regulations often specify which data is sensitive and what measures are required to protect it. Intellectual property (IP), customer data and employee information are other common types of information requiring special attention.

      Establishing different levels of sensitivity, based on the type of content your organization needs to manage and protect, will help provide additional structure to this task. A good rule of thumb is to constrain an organization’s hierarchy to four levels; with more than that, the hierarchy becomes difficult and impractical to manage. An example of four levels to begin with is Secret data, Confidential data, Private data and Public data.
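One way to document the words and patterns of interest is as a small, reviewable rule table that maps each pattern to one of the four levels. The sketch below is only an illustration: the specific patterns, the relative ordering of Private and Confidential, and the names `Sensitivity`, `RULES` and `classify` are assumptions, and real rules would come from data owners and risk managers.

```python
import re
from enum import IntEnum


class Sensitivity(IntEnum):
    # Four levels, ordered from least to most sensitive.
    # Ranking Private below Confidential is an assumption.
    PUBLIC = 0
    PRIVATE = 1
    CONFIDENTIAL = 2
    SECRET = 3


# Hypothetical rules; each pairs a pattern with a level.
RULES = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), Sensitivity.CONFIDENTIAL),  # SSN-like number
    (re.compile(r"\bmerger\b", re.IGNORECASE), Sensitivity.SECRET),
    (re.compile(r"\bsalary\b", re.IGNORECASE), Sensitivity.PRIVATE),
]


def classify(text):
    """Return the highest sensitivity level whose pattern matches text."""
    level = Sensitivity.PUBLIC
    for pattern, rule_level in RULES:
        if pattern.search(text):
            level = max(level, rule_level)
    return level


print(classify("Employee 123-45-6789 salary update").name)
```

Keeping the rules in one table makes it easy for non-developers to review what is being searched for and why.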

      Measure No. 3: Focus and accelerate with metadata

      Metadata (data about your data, such as file sizes, types and locations) should be used to focus and accelerate your data classification projects. Metadata adds another dimension to the search process, effectively providing a shortlist of where to look and what to expect.

      For example, if you want to identify credit card data that is at risk, you can use permissions metadata to find files that are accessible by too many people. You can then look inside those files for credit card data. In fact, any sensitive data found in overly accessible files has a clear remediation path: fix the access permissions to the data so that they are based on least privilege (that is, business need-to-know). The following are examples of metadata and how it can be used to focus and accelerate data classification:

      1. Data access permissions

      A careful analysis of file, folder and site permissions will tell organizations who can access their sensitive data and which data is overly accessible.

      2. Data access activity

      Data access activity provides important information such as which folders are the most frequently used and which folders are not being used at all. It also indicates which data was recently added or modified. That intelligence is tremendously useful, for example, in reducing the time spent searching. After the initial classification scan has occurred, subsequent searches can be restricted to just the data that still needs to be classified (that is, data added or modified since the last scan). For specific users or groups, organizations can determine what data they have been accessing to see who has actually been using the sensitive data.

      3. Data ownership

      Data ownership information helps limit searches to data owned by specific people. So, if organizations are working with individuals to help them get control over their sensitive data, this piece of metadata will narrow sensitive data searches to just the relevant data.
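The permissions-first approach from the credit card example above can be sketched as a two-step filter: shortlist files readable by broad groups, then look for card numbers only inside that shortlist. Everything named here is an assumption for illustration, including the file record layout, the group names in `BROAD_GROUPS` and the helper functions. The Luhn checksum is a standard way to discard digit runs that cannot be valid card numbers.

```python
import re

# 13 to 16 digits, optionally separated by single spaces or dashes.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")

# Hypothetical names of groups considered "too many people".
BROAD_GROUPS = {"Everyone", "Domain Users"}


def luhn_ok(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text):
    """Return candidate card numbers in text that pass the Luhn check."""
    hits = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_ok(digits):
            hits.append(digits)
    return hits


def at_risk_files(files):
    """Scan only overly accessible files; return paths that contain card data."""
    shortlist = [f for f in files if f["groups"] & BROAD_GROUPS]
    return [f["path"] for f in shortlist if find_card_numbers(f["text"])]


# Made-up records; 4111 1111 1111 1111 is a widely used test number.
files = [
    {"path": "/share/public/orders.txt", "groups": {"Everyone"},
     "text": "Card: 4111 1111 1111 1111"},
    {"path": "/share/finance/ledger.txt", "groups": {"Finance"},
     "text": "Card: 4111 1111 1111 1111"},
    {"path": "/share/public/menu.txt", "groups": {"Everyone"},
     "text": "Lunch order 1234 5678 9012 3456"},
]

print(at_risk_files(files))
```

The finance ledger is never opened because its permissions are already narrow, which is where the time savings come from; a fuller version would also narrow the shortlist by owner or recent activity.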

      Measure No. 4: Communicate and remediate

      Finding sensitive data is obviously an important part of classification projects, but it’s not the final stage. After obtaining results, organizations need to get them into the hands of decision makers (typically data owners and Governance, Risk Management, and Compliance (GRC) teams) so that these people can understand the situation and begin formulating remediation strategies and plans.

      Data owners are typically in the best position to identify exactly what the content is, whether the data is stored in the right place, and who should and should not have access to it. They can also help build a remediation strategy and process, especially once they are armed with specific examples involving their own data. GRC staff can provide the overall oversight needed to ensure that data is being protected in accordance with the organization’s objectives. And, these teams can use result reports as the basis of documentation for audit requirements.

      Measure No. 5: Regularly recheck data

      Because data is constantly growing and changing, businesses should establish a process of periodically rechecking data to ensure an accurate view of sensitive data. Ideally, organizations should limit searches to newly added data, to determine if it contains sensitive information, and to existing data that has been modified, to determine if it has either gained or lost relevance to classification projects. Organizations should provide data owners and GRC staff with updated intelligence based on rescanning.
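A minimal sketch of limiting a rescan to new or changed files is shown below. It assumes file modification time is an adequate signal, which may not hold on every storage system, and the function name `files_changed_since` is made up for this example. The demonstration creates two temporary files and backdates one of them.

```python
import os
import tempfile
import time


def files_changed_since(root, last_scan_time):
    """Return sorted paths under root modified after last_scan_time (epoch seconds)."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.path.getmtime(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime > last_scan_time:
                changed.append(path)
    return sorted(changed)


# Demonstration with temporary files.
with tempfile.TemporaryDirectory() as root:
    old_path = os.path.join(root, "old.txt")
    new_path = os.path.join(root, "new.txt")
    for path in (old_path, new_path):
        with open(path, "w") as handle:
            handle.write("sample")
    now = time.time()
    os.utime(old_path, (now - 3600, now - 3600))  # pretend it is an hour old
    os.utime(new_path, (now, now))
    changed = files_changed_since(root, last_scan_time=now - 60)
    names = [os.path.basename(p) for p in changed]

print(names)
```

Only the files returned here would be handed to the content scanner, so the cost of each recheck tracks the amount of change rather than the total volume of data.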

      Final thoughts

      To find the important data among all of an organization’s unstructured data, a data classification solution is needed because there is simply too much data to process and keep pace with manually. While there are many solutions to choose from, a solution that leverages the power of metadata is critical for achieving actionable results. Without metadata, data classification projects can take far too long, and the results they produce typically don’t have the context required to remediate problems. Metadata can dramatically cut the time it takes to produce results and can help provide the context required for problem remediation.

      Raphael Reich is Senior Director of Marketing at Varonis Systems. Raphael brings over 16 years of product marketing and management experience to Varonis. Prior to joining Varonis, he held product marketing and management roles at Cisco, Check Point, Echelon and Network General. Raphael was also a software engineer at Digital Equipment Corporation. He holds a Bachelor’s degree in Computer Science from UC Santa Cruz and an MBA from UCLA. He can be reached at [email protected].
