Data loss prevention doesn't have to be an "all-up" approach. Sometimes, it's best to start with the simple things.
Data loss prevention isn't a new idea, but it's a concept
that's become increasingly important to IT as organizations recognize the
threat to their operations from leaks by disgruntled insiders or intrusions by hostile
outsiders. At least, that's the pitch that DLP (data loss prevention) vendors make.
But in most cases, notes Securosis analyst and CEO Rich
Mogull, organizations that deploy DLP find data
leaks are more likely to be caused by accident or by bad procedures rather than
malice. As he explained in an interview with eWEEK, when the causes of leaks
are explored, the "whoops" factor surfaces repeatedly. Someone transmits
unencrypted medical data in violation of HIPAA (Health Insurance Portability
and Accountability Act) rules, or a file containing credit card numbers is
moved into an unsecured area. This sort of thing, when discovered during an
audit, can be a career-killer; if it becomes a news story, it's damaging to the
reputation of the business itself.
Of course, that doesn't mean that companies not subject to
regulatory regimes, such as HIPAA, the Sarbanes-Oxley Act or others, can simply pass
on implementing DLP. As Mogull explained,
the risk of data loss isn't always visible: When data is stolen, or merely
mishandled, "you don't even have the base monitoring to know about the
How can an organization introduce DLP
in an effective manner, when the potential for leakage or loss is so pervasive?
Let's start with a conceptual discussion before moving on to specifics.
One can begin thinking about DLP
by treating data as being in one or more states: data in motion, data at rest
and data in use. But there's a danger in focusing on only one of these aspects,
because methods that work exceedingly well on, say, the network (data in
motion) may be of little or no use against threats that seek to obtain data in
use at an endpoint. A sound DLP strategy
will consider all three of these against the needs of an organization, whether
those needs are regulatory, operational or cultural. The poser, for both IT managers
and security specialists, is that no single product adequately addresses all three.
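To make that three-state model concrete, consider a minimal Python sketch, with invented control names, that maps each state to the class of control that might cover it:

```python
from enum import Enum

class DataState(Enum):
    IN_MOTION = "in motion"   # crossing the network
    AT_REST = "at rest"       # sitting in storage
    IN_USE = "in use"         # open at an endpoint

# Hypothetical mapping of states to kinds of controls a DLP
# strategy might apply; real products differ widely.
CONTROLS = {
    DataState.IN_MOTION: ["gateway inspection", "TLS proxying"],
    DataState.AT_REST: ["storage scanning", "access auditing"],
    DataState.IN_USE: ["endpoint agent", "removable-media policy"],
}

def controls_for(state: DataState) -> list[str]:
    """Return the controls that cover a given data state."""
    return CONTROLS[state]

for state in DataState:
    print(f"Data {state.value}: {', '.join(controls_for(state))}")
```

The exercise is trivial, but it makes the coverage argument visible: an empty row is an unprotected state.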
Or if one chooses to look at DLP
from another perspective, one can consider it from the standpoint of threat
vectors. In this view, the tripod's legs are email, the Web and the endpoint. Protecting
against the first two is fairly well understood and easily implemented. The
third is a little more complicated. Removable media can be blocked or screened,
but the near-ubiquity of phone-based cameras makes it possible to record
on-screen data, albeit in a clumsy and terribly obvious fashion.
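To illustrate the endpoint leg, here is a small sketch, assuming a Linux host, that lists the block devices the kernel flags as removable; an endpoint agent could use the same signal to decide what to block or screen:

```python
from pathlib import Path

def removable_block_devices() -> list[str]:
    """List block devices the Linux kernel flags as removable.

    Reads /sys/block/<dev>/removable, which is '1' for devices
    such as USB sticks.
    """
    devices = []
    for dev in Path("/sys/block").iterdir():
        flag = dev / "removable"
        if flag.exists() and flag.read_text().strip() == "1":
            devices.append(dev.name)
    return devices

if __name__ == "__main__":
    print("Removable devices:", removable_block_devices() or "none")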
The first step in implementing a DLP
strategy is data identification. Although it may be easy to specify the general
nature of the data to be protected, such as financial records, customer
information or product plans, it's not always that simple to assign a risk
value to an individual document. Mogull points out that one has to "understand
what to protect."
Context + Content
Perhaps it's best to regard the context of data and its content as two sides
of the same coin. Context can take the form of file metadata, email
headers or the application that's consuming the data. In more complex forms of
context analysis, a DLP process might look
at file formats or network protocols, or use network information from a DHCP
server and a directory service to identify who's consuming the data. This can
be expanded to take specific Web services or network destinations into account
or identify individual storage devices such as a USB drive.
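At the simpler end of context analysis, a sketch like the one below, using only the Python standard library on a Unix host, can pull a file's ownership, type and timestamps; the DHCP and directory lookups mentioned above would need infrastructure this example doesn't assume:

```python
import mimetypes
import os
import pwd  # Unix-only; used to resolve the owning user
from datetime import datetime, timezone

def file_context(path: str) -> dict:
    """Collect basic context signals for a file: owner, type, timestamps."""
    st = os.stat(path)
    guessed_type, _ = mimetypes.guess_type(path)
    return {
        "owner": pwd.getpwuid(st.st_uid).pw_name,
        "size_bytes": st.st_size,
        "mime_type": guessed_type or "unknown",
        "modified": datetime.fromtimestamp(st.st_mtime, tz=timezone.utc).isoformat(),
    }

print(file_context("/etc/hosts"))
```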
Content, as one might think, is pretty self-explanatory.
Being aware of the contents of data can often give one a good indication of
what kind of protection needs to be applied. Analyzing content is where things
get tricky, because one has to start with the context of data, and then examine
the contents. This might take a rules-based approach using regular expressions,
file matching, database fingerprinting or statistical analysis. This "content
awareness," as Securosis' Mogull puts it, is what defines true DLP.
Further stages of DLP implementation
are where things can become complicated. For example, addressing DLP
in email seems relatively straightforward, because of the nature of the medium.
Many products that address other email security threats also offer some DLP
functionality, in a fashion that Mogull refers to as "DLP Light." And if one
instead deploys a dedicated system that inserts another mail transfer agent to
provide a DLP layer,
it's unlikely to be noticed by users. The downside to such a solution is that
it may cover one's external email traffic well, but leave internal traffic unmonitored.
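The sketch below suggests, in miniature, what such a DLP layer in the mail path does: parse an outbound message with Python's standard email module and decide whether to relay or hold it. The policy check is a stand-in for whatever rules a real mail transfer agent hook would apply:

```python
from email import message_from_string
from email.message import Message

RAW_MESSAGE = """\
From: alice@example.com
To: bob@partner.example
Subject: Q3 numbers

Customer SSN list attached, per your request.
"""

def violates_policy(body: str) -> bool:
    """Stand-in content check; a real system would use the content
    engine's rules (regexes, fingerprints, classifiers)."""
    return "ssn" in body.lower()

def disposition(raw: str) -> str:
    """Decide whether an outbound message is relayed or held."""
    msg: Message = message_from_string(raw)
    body = msg.get_payload()
    if violates_policy(body):
        return f"HOLD for review: {msg['Subject']!r} to {msg['To']}"
    return "RELAY"

print(disposition(RAW_MESSAGE))
```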
A similar situation can be seen with network-based DLP.
DLP products will often work with existing
reverse proxy features of an Internet gateway to inspect SSL-encrypted traffic.
In a recent report from Palo Alto Networks, sampled organizations in the United
States showed that 20.7 percent of bandwidth
consisted of SSL, on port 443 or other ports. The same traffic analysis showed
that one or more implementations of the Tor onion router were running on 15 percent
of surveyed networks worldwide. The most a DLP
solution can do for such traffic is to flag it or block it altogether, without
actually identifying what the traffic consists of.
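A tiny sketch of that flag-or-block decision, operating on hypothetical flow records whose fields and rules are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Flow:
    dest_port: int
    decrypted_by_proxy: bool  # True if the reverse proxy could inspect it

def classify(flow: Flow) -> str:
    """Apply the only options available for opaque traffic: flag or block."""
    if flow.decrypted_by_proxy:
        return "inspect"   # content rules can run normally
    if flow.dest_port == 443:
        return "flag"      # SSL traffic the proxy couldn't open
    return "block"         # opaque traffic on a nonstandard port

for f in (Flow(443, True), Flow(443, False), Flow(9001, False)):
    print(f, "->", classify(f))
```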
In storage, Mogull and Securosis are observing less "DLP
Light" but better integration, thanks in part to the ability to tap into
databases and document management systems. Because of the nature of these
systems, real-time DLP monitoring is often limited to filter-like techniques
(categories, patterns and rules), because anything deeper can present an
obstacle to achieving optimal system performance.
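For data at rest, those filter-like techniques might look like the following sketch: a directory walk that applies made-up category patterns to file names and flags the matches:

```python
import fnmatch
import os

# Hypothetical category filters of the "patterns and rules" variety.
CATEGORY_PATTERNS = {
    "payroll": ["*payroll*", "*salaries*"],
    "customer": ["*customers*.csv", "*accounts*.xlsx"],
}

def scan_tree(root: str) -> list[tuple[str, str]]:
    """Walk a directory tree, flagging files whose names match a category."""
    findings = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            for category, patterns in CATEGORY_PATTERNS.items():
                if any(fnmatch.fnmatch(name.lower(), p) for p in patterns):
                    findings.append((os.path.join(dirpath, name), category))
    return findings

for path, category in scan_tree("."):
    print(f"{path}: matched {category} filter")
```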
Whatever one chooses as part of a DLP
strategy, it's important to make sure that it offers a clean user interface and
solid reporting tools. Although those may seem like obvious criteria, Mogull observed
that DLP tools are sometimes so
engineering-driven that their designers forget that the users of the tools, who
may not work in IT at all, need a simple and efficient way to address potential
problems. After all, there will be occasions when an immediate response is
needed, and the interface should be a help rather than a hindrance.
An area of DLP that isn't
often discussed is what to do when data appears to have been leaked. All too
often, these efforts, which are reactive by their very nature, take on the
aspects of a witch hunt. These can often do more damage than the actual data
loss, by virtue of their effect on the morale of the organization and its
customers and partners. That's why it's important to keep Mogull's point about
intent in mind, or to paraphrase a common saying, "don't assume malice when simple
carelessness will suffice."
One thing that should give DLP
implementers hope, according to Mogull, is that the market is starting to
mature, even as the technology remains ahead of adoption. Arguably, the hardest
thing for IT and security managers to cope with today is making room in their
budgets for tools that are appropriate for their organizations, whether that's
viewed from a threat perspective or from the available skill sets within the
company. Of course, that's one problem that almost never goes away. At least,
until it's too late.
P. J. Connolly began writing for IT publications in 1997 and has a lengthy track record in both news and reviews. Since then, he's built two test labs from scratch and earned a reputation as the nicest skeptic you'll ever meet. Before taking up journalism, P. J. was an IT manager and consultant in San Francisco with a knack for networking the Apple Macintosh, and his love for technology is exceeded only by his contempt for the flavor of the month. Speaking of which, you can follow P. J. on Twitter at pjc415, or drop him an email at email@example.com.