What do we do about malware? The long-term solution, at least for managed networks such as enterprises, may be whitelisting. But in the meantime, we’re still drowning in new variants every day. In the 2009 generation of its products, Symantec is trying a new approach: file reputation. It’s a little early to tell if it works well enough, but it seems to have potential.
The classic methods of malware scanning are generally agreed to be unsustainable. It’s not feasible for anti-malware companies to have a signature for every new variant, and yet we should expect the products to work even the first time a file appears on a customer’s system. For this reason heuristics are employed, but they have limits.
There is the behavior-blocking kind, where an IPS (intrusion prevention system) watches running software for potentially malicious behavior and blocks it. This means that the malware is already running on the system, and even if your IPS blocks it, you have to be suspicious of what happened before that. Plus, IPSes have some potential for false positives.
True heuristics, where the file is scanned for potentially malicious characteristics before loading, are even more susceptible to false positives. There’s a role for such analysis, but attempts to build heuristic products entirely without malware signatures have been failures.
The Norton 2009 products use all of these techniques and more. The company has added a form of whitelisting: in addition to signatures of bad files, it has signatures of good files, ones that are known to be good and therefore do not need to be scanned for malware. The average Windows system has quite a few of these, including Windows system files and files from well-known and trusted applications such as Office. These files don’t need to be scanned for malware, but they do need to be verified (Symantec uses an SHA-256 hash) as being the files in the whitelist.
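To make the whitelist check concrete, here is a minimal sketch in Python. It is not Symantec's implementation. The names KNOWN_GOOD_HASHES, sha256_of_file and needs_scan are hypothetical, and the only detail taken from the description above is that a file is verified by its SHA-256 hash before it is allowed to skip the scan.

```python
import hashlib

# Hypothetical whitelist: SHA-256 digests of files known to be good.
# A real product would ship and update a far larger signed list.
KNOWN_GOOD_HASHES = {
    hashlib.sha256(b"trusted file contents").hexdigest(),
}

def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large files are never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def needs_scan(path, known_good=KNOWN_GOOD_HASHES):
    """Return True unless the file's hash matches a whitelisted file."""
    return sha256_of_file(path) not in known_good
```

Matching on the hash rather than the file name is the point: a file that merely calls itself EXCEL.EXE, or a trusted file that has been altered by even one byte, produces a different digest and still gets scanned.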
Only the Unknown Files Get Scanned
As a result, only unknown files need to be scanned, but as I said there are logical limits to that. Not finding malware in a file doesn’t prove it’s not malware. So now Symantec has added the concept of file reputation.
Imagine an asymptotic curve representing the prevalence of a particular file in the world. Well-known and legitimate files, like EXCEL.EXE, will be on the high side of the curve on the left. Further down are less-common files and the very popular malware. Further down, on what Symantec calls the “long tail” of the curve, are files that appear very infrequently. They may be legitimate, or they may be a new strain of malware with only a dozen copies in the world. The vast majority of new malware instances are on this part of the curve.
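The prevalence curve is easy to picture with a toy example. The sketch below assumes a hypothetical feed of (machine, file hash) reports and simply counts how many distinct machines have each file. The function names and the cutoff of 12 machines (a nod to the "dozen copies" above) are illustrative choices, not anything Symantec has described.

```python
from collections import Counter

def prevalence_ranking(reports):
    """Rank files by how many distinct machines reported them.

    reports: iterable of (machine_id, file_hash) pairs.
    """
    # A machine reporting the same file twice counts only once.
    unique_reports = set(reports)
    counts = Counter(file_hash for _, file_hash in unique_reports)
    return counts.most_common()  # most prevalent first

def long_tail(ranking, max_machines=12):
    """Files seen on very few machines: the far end of the curve."""
    return [file_hash for file_hash, n in ranking if n <= max_machines]
```

Low prevalence alone proves nothing, since a rare file may be perfectly legitimate. It only marks where on the curve a file sits.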
To get reputation data on files as users encounter them, Symantec gathers data from customers who have opted in to what it calls Norton Community Watch, which sends extra data back to the company. Symantec’s not especially forthcoming about the algorithm for calculating reputation, but it’s not hard to figure out some of the factors that would go into it.
Let’s say a new file goes onto a system or a new Web site is visited. If no other checks, like blacklists, come up with anything and there is no existing reputation on the file, then an SHA-256 hash is calculated, and it and some other data are uploaded to the Norton cloud. Here, according to Symantec, if the file is not in the database already, mysterious statistical formulae work their magic and emit a reputation number. The initial analysis may also include a more detailed heuristic analysis than is done on the PC. As other things happen on the systems on which these files exist, the reputation may improve or suffer as a result.
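The order of checks described above can be sketched as follows. Everything here is an assumption made for illustration: the function names, the idea of a score between 0 and 1, and the placeholder cloud service that returns a neutral value. Symantec has not published how the real number is computed. For simplicity the sketch hashes the file up front and uses that hash as the key for every check.

```python
import hashlib

def fake_cloud_lookup(digest, metadata):
    """Stand-in for the cloud service (hypothetical).

    The real scoring is not public, so this returns a neutral 0.5.
    """
    return 0.5

def lookup_reputation(data, blacklist, local_reputation,
                      cloud_lookup=fake_cloud_lookup, metadata=None):
    """Return (source, score) for a file's contents.

    blacklist: set of SHA-256 hex digests known to be bad.
    local_reputation: dict mapping digests to scores already known.
    """
    digest = hashlib.sha256(data).hexdigest()
    if digest in blacklist:
        return "blacklist", None          # known bad, no score needed
    if digest in local_reputation:
        return "local", local_reputation[digest]
    # Unknown file: send the hash and extra data to the cloud.
    score = cloud_lookup(digest, metadata or {})
    local_reputation[digest] = score      # remember it for next time
    return "cloud", score
```

Calling lookup_reputation twice on the same unknown file shows the intent: the first call goes to the cloud and the second is answered locally, so only genuinely new files cost a round trip.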
This reputation system has actually been in effect for a year, and Symantec says 18 million users have it in their products. That’s a pretty big sample. Does it work? That’s something for testing to decide, and whatever its other problems, Symantec’s malware detection is always first-rate in any reputable test I’ve seen.
Symantec’s approach makes sense to me, but it makes me shudder over the complication of it all. So many ways to check files, so many different inputs into the process. A simple solution like absolute whitelisting would be appealing, were it possible.
Security Center Editor Larry Seltzer has worked in and written about the computer industry since 1983.