Malware Detection Goes Hybrid - Page 2

As a result, only unknown files need to be scanned, but, as I said, there are logical limits to that. Not finding malware in a file doesn't prove it's not malware. So now Symantec has added the concept of file reputation.

Imagine an asymptotic curve representing the prevalence of a particular file in the world. Well-known and legitimate files, like EXCEL.EXE, will be on the high side of the curve on the left. Further down are less-common files and the very popular malware. Further down, on what Symantec calls the "long tail" of the curve, are files that appear very infrequently. They may be legitimate, or they may be a new strain of malware with only a dozen copies in the world. The vast majority of new malware instances are on this part of the curve.
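To make the shape of that curve concrete, here is a minimal Python sketch that sorts files into bands by how many copies have been seen. The function name and both thresholds are hypothetical and chosen only for illustration; Symantec has not published the cutoffs it uses, if it uses fixed cutoffs at all.

```python
def prevalence_band(copies_seen: int,
                    head_min: int = 1_000_000,
                    tail_max: int = 100) -> str:
    """Place a file on the prevalence curve by its observed copy count.

    head_min and tail_max are made-up thresholds for illustration only.
    """
    if copies_seen < 0:
        raise ValueError("copies_seen must be non-negative")
    if copies_seen >= head_min:
        return "head"        # very common files, e.g. EXCEL.EXE
    if copies_seen <= tail_max:
        return "long tail"   # rare files: obscure tools or brand-new malware
    return "middle"          # less-common files and very popular malware
```

A rare but legitimate utility and a new malware strain with a dozen copies would both land in the "long tail" band, which is why prevalence alone cannot settle whether a file is safe.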

To get reputation data on files as users encounter them, Symantec gathers data from customers who have opted in to what it calls Norton Community Watch, which sends extra data back to the company. Symantec's not especially forthcoming about the algorithm for calculating reputation, but it's not hard to figure out some of the factors that would go into it.

Let's say a new file goes onto a system or a new Web site is visited. If no other checks, like blacklists, come up with anything and there is no existing reputation on the file, then an SHA-256 hash is calculated, and the hash and some other data are uploaded to the Norton cloud. Here, according to Symantec, if the file is not in the database already, mysterious statistical formulae work their magic and emit a reputation number. The initial analysis may also include a more detailed heuristic analysis than is done on the PC. As other things happen on the systems on which these files exist, the reputation may improve or suffer as a result.
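The client-side half of that flow can be sketched in a few lines of Python. This is only an illustration of the steps described above, not Symantec's implementation: `other_checks`, `local_reputation`, and `query_cloud` are hypothetical stand-ins for the blacklist checks, the existing reputation data, and the Norton cloud service. The sketch also assumes that existing reputation is keyed by the file's hash, so it computes the hash before that lookup.

```python
import hashlib
from typing import Callable, Dict, Optional, Tuple


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_file(path: str,
               other_checks: Callable[[str], bool],
               local_reputation: Dict[str, float],
               query_cloud: Callable[[str], float]) -> Tuple[str, Optional[float]]:
    """Walk a new file through the checks in order.

    other_checks: hypothetical callable, True if a blacklist or similar flags the file.
    local_reputation: hypothetical map of SHA-256 hex digest to a known score.
    query_cloud: hypothetical callable that submits the hash and returns a score.
    """
    # Step 1: the conventional checks run first.
    if other_checks(path):
        return ("flagged", None)

    # Step 2: look for an existing reputation (assumed to be keyed by hash).
    file_hash = sha256_of_file(path)
    if file_hash in local_reputation:
        return ("known", local_reputation[file_hash])

    # Step 3: nothing known, so ask the cloud and remember the answer.
    score = query_cloud(file_hash)
    local_reputation[file_hash] = score
    return ("new", score)
```

The real client uploads "some other data" along with the hash. The article does not say what that data is, so the sketch sends the hash alone.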

This reputation system has actually been in effect for a year, and Symantec says 18 million users have it in their products. That's a pretty big sample. Does it work? That's something for testing to decide, and whatever its other problems, Symantec's malware detection is always first-rate in any reputable test I've seen.

Symantec's approach makes sense to me, but it makes me shudder over the complication of it all. So many ways to check files, so many different inputs into the process. A simple solution like absolute whitelisting would be appealing, were it possible.

Security Center Editor Larry Seltzer has worked in and written about the computer industry since 1983.

For insights on security coverage around the Web, take a look at Security Center Editor Larry Seltzer's blog Cheap Hack.