Faults in Measuring Security Effectiveness

Astute buyers of IDS/IPS technology ask the critical question: "How can I tell if product X is really better than product Y for my purposes?"

By Christian Stankevitz, CTO, NSS Labs

Throughout the history of intrusion prevention, the challenge facing IT implementers has been determining which IDS/IPS products are appropriate for a given environment. According to Christian Stankevitz, CTO of NSS Labs, every vendor claims that its product is better than the next in some material way.

The dizzying array of product features, numbers, anecdotal quotes, case studies and claims has left many buyers confused. Astute buyers of IDS/IPS technology ask the critical question: "How can I tell if product X is really better than product Y for my purposes?"

Independent testing should alleviate this confusion, assist consumers with product selection and help vendors improve their products. That means independent testers must provide accurate, realistic metrics to IT decision makers.

Accurately Measure

The only way to guarantee an accurate measurement of security effectiveness is to perform real-world testing, utilizing live exploits that attack actual hosts. In the early days of security product testing, this was done by launching a small number of exploits from "attacker" hosts against vulnerable devices, compromising the target whenever an exploit got through. This approach worked quite well for a while, but failed to scale as the number of combinations of exploits and vulnerable targets exploded. Every time a system was exploited, it had to be rebuilt.

To provide sufficient testing of coverage without building an expensive infrastructure of exploits and vulnerable applications, testers began using "capture/replay" tools, an approach that records an exploit once and then sends (or replays) it repeatedly. Capture/replay has become the mainstay of IDS/IPS security testing and is so widely accepted that many vendors and independent test labs use these tools exclusively.
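The record-once, replay-many idea can be sketched in a few lines. This is a minimal illustration only, not how any particular capture/replay product works: the `Capture` class, the `record` and `replay` helpers, the in-memory `wire` list and the placeholder byte strings are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Capture:
    """Bytes recorded from one exploit attempt, in the order they were sent."""
    packets: List[bytes] = field(default_factory=list)


def record(send: Callable[[bytes], None], capture: Capture) -> Callable[[bytes], None]:
    """Wrap a send function so every outgoing packet is also stored."""
    def recording_send(packet: bytes) -> None:
        capture.packets.append(packet)
        send(packet)
    return recording_send


def replay(capture: Capture, send: Callable[[bytes], None], times: int = 1) -> int:
    """Resend the recorded packets unchanged; return how many were sent."""
    sent = 0
    for _ in range(times):
        for packet in capture.packets:
            send(packet)
            sent += 1
    return sent


# Hypothetical stand-in for the network: a list that collects whatever is sent.
wire: List[bytes] = []
capture = Capture()

# One "live" run, recorded once. The byte strings are placeholders, not an exploit.
live_send = record(wire.append, capture)
live_send(b"HELLO")
live_send(b"PLACEHOLDER-REQUEST")

# Replay three times: the same bytes go out again, with no real target
# on the other end and no feedback about what the bytes would have done.
replayed = replay(capture, wire.append, times=3)
```

Note that `replay` only ever learns how many packets it sent. Nothing in the loop can observe whether a target would have been compromised.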

The Capture/Replay Problem

Unfortunately, using capture/replay technology to measure security effectiveness is a flawed approach. It cannot determine the real impact of partially mitigated exploits, and in many cases the captures fail to include meaningful payloads. Often, a meaningful payload is the difference between a functional exploit and harmless garbage. This presents a problem when testing any IDS/IPS product, because both sophisticated "protocol decoding" technology and some simpler exploit-based signatures can correctly determine that a replay with no payload is not a real attack.

Thus, it is impossible to determine which IDS/IPS device is functioning properly in a test and which is not, since neither reports the attack. Furthermore, the results of capture/replay tests are indeterminate because such tools cannot target real vulnerable servers to ascertain the impact of an attack. Just because an attack was partially blocked by an IPS doesn't mean that the targeted service didn't crash. Determining whether a product can mitigate the impact of a real threat is the only true measure of security effectiveness.
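One way to picture the argument is as a small decision table that combines what the device reported with what actually happened to the target. The sketch below is an illustration under stated assumptions: the `Outcome` names and the `classify` function are hypothetical, and `target_harmed=None` stands for a replay test, where there is no real target to inspect.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    MITIGATED = "device intervened and the target was unharmed"
    PARTIAL = "device intervened but the target was still harmed"
    MISSED = "device stayed silent and the target was harmed"
    INVALID_TEST = "device stayed silent and the target was unharmed"
    INDETERMINATE = "no real target, so the impact is unknown"


def classify(device_intervened: bool, target_harmed: Optional[bool]) -> Outcome:
    """Combine the device's report with the observed state of the target.

    target_harmed is None when the test had no live target to check,
    which is the situation a capture/replay test is in.
    """
    if target_harmed is None:
        return Outcome.INDETERMINATE
    if target_harmed:
        return Outcome.PARTIAL if device_intervened else Outcome.MISSED
    # Target survived: either the device stopped the attack, or the
    # exploit never worked and the test case itself is suspect.
    return Outcome.MITIGATED if device_intervened else Outcome.INVALID_TEST


# Replay test: the device alerted, but nothing can be said about impact.
replay_result = classify(device_intervened=True, target_harmed=None)

# Live test: the device alerted, yet the targeted service still crashed.
live_result = classify(device_intervened=True, target_harmed=True)
```

Only the live test can distinguish `PARTIAL` from `MITIGATED`. With no target to inspect, every run collapses into `INDETERMINATE`, whatever the device reports.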

The Product Effect

Unfortunately, since 2005 nearly everyone has relied heavily on a relatively small number of capture/replay tools, while market dynamics have emphasized speed and simplicity over detection accuracy. During 2007, several IDS/IPS engines were observed to miss or "leak" exploits that they once caught, even though the IDS/IPS purportedly had the necessary signatures. The problem was not limited to old exploits. Some recent, well-publicized exploits were found to be neither detected nor mitigated by security products that purported to have coverage for these threats.

Seeing an isolated exploit sneak past an IDS/IPS is nothing new, but it has become relatively common to see double-digit percentage increases in exploits passing through several products from multiple vendors. It appears that some security product vendors have adapted their signatures to detect the traffic generated by popular capture/replay tools. Thus, these vendors have in effect "de-tuned" their signatures and degraded their protective capabilities in order to pass common lab test procedures.
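To make "double-digit percentage increases" concrete, the leak rate can be computed per test run and compared across runs. The sketch below reads the phrase as an increase in percentage points, which is one possible interpretation. The function names are hypothetical and the counts are invented purely for illustration; they are not NSS Labs results.

```python
def leak_rate(leaked: int, total: int) -> float:
    """Percentage of exploits that passed the device undetected and unmitigated."""
    if total <= 0:
        raise ValueError("total must be a positive number of exploits")
    if not 0 <= leaked <= total:
        raise ValueError("leaked must be between 0 and total")
    return 100.0 * leaked / total


def leak_increase(before: float, after: float) -> float:
    """Change in leak rate between two test runs, in percentage points."""
    return after - before


# Invented figures for illustration only: the same 100 exploits run
# against the same product in an earlier and a later test cycle.
earlier = leak_rate(leaked=5, total=100)
later = leak_rate(leaked=18, total=100)
increase = leak_increase(earlier, later)
```

With these made-up counts, the leak rate rises from 5 percent to 18 percent, an increase of 13 percentage points.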


It is time for a complete overhaul of IDS/IPS testing methodology. Capture/replay technology is of marginal value in a testing environment and must once again be replaced with real exploits aimed at real vulnerable targets. Testing labs should provide environments with a large range of live host targets, complete with vulnerable applications, databases and operating systems. Live threats traversing live networks carrying live malicious code need to be used to measure actual product effectiveness and the real impact on these vulnerable systems. Real-world threat categorization and real-world testing methodologies must be implemented to get real-world results.

It is the responsibility of testing and certification organizations to provide this information in a manner that consumers can understand and apply to their specific needs.

Christian Stankevitz is the CTO of NSS Labs, the globally recognized leader in independent security performance testing and certification.