Social media networks are growing at an astounding rate. Facebook reportedly has passed 160 million users worldwide. Other social networks are growing at fast rates too. We are surrounded by fun and useful Web 2.0 technologies that help us collaborate and create our own content. Unfortunately, this also means we face escalating security vulnerability risks. There has recently been an unsettling increase in the amount of malware on the Internet.
The very architecture of Web 2.0 tools that allows for greater interactivity also opens up new avenues for computers and networks to be attacked by malware.
Organizations are now using Web 2.0-based solutions and social media networks in their workplaces. Many companies now have eight or more of these applications in use on their networks. This trend of people using more Web 2.0 applications at work and at home has increased malware attacks and corporate data leaks, along with the costs to repair them.
Most collaborative and interactive Web applications require code to run inside a user’s browser. Online scripts using Flash and JavaScript are becoming part of the Internet user’s everyday life. Web vendors only need to look at the successes of Google Docs, Facebook and YouTube to see the value in embedded programming running inside a browser. As would be expected, this process continues to accelerate as processes and applications follow documents and other files into the Internet cloud.
The problem in all of this is that code can be easily manipulated to allow entry into computers or networks. With so much of the Web now relying on code run in the browser, you cannot simply turn the scripts off and still enjoy the utility of the Web. The browser is the new operating system. The escalating functionality of what users can do within their browsers means there is also an increasing number of ways that malware can enter computers and networks. As a house becomes a mansion and gains more windows to see out of, it also offers more ways for thieves to break in.
Where once Internet users had to beware of clicking suspicious links in e-mail or downloading unknown programs, malicious programs can now come in many more forms. They do not always require mistaken consent to infect a computer. Malicious code has been found operating in advertisements running on Flash, rich HTML in e-mails and in many forms of JavaScript functions.
To combat these types of threats, the industry is moving from signature-based anti-malware to behavior-based approaches. Let’s explore these in detail.
Signature-Based Anti-Malware Approaches
Signature-based detection is a malware detection approach that identifies a malware instance by the presence of at least one byte code pattern found in a database of signatures from known malicious programs. If a program contains a pattern that exists within the database, it is deemed malicious. This approach to malware detection is also called misuse detection. Although signature-based detection is currently the most commonly used technique for malware detection, it has the following two disadvantages (especially in a Web 2.0 environment):
Disadvantage No. 1: Susceptible to evasion
Since the signatures or patterns are derived from known malware, these detection schemes can be easily evaded by using program obfuscation such as packing and junk insertion. Even simple program obfuscations (such as inserting no-ops and code re-ordering) can create malware variants that can evade signature-based detectors. There is also strong evidence that hackers are already using these obfuscations to evade signature-based detectors.
Disadvantage No. 2: Cannot detect unknown malware
Since the signatures are constructed by examining known malware, signature-based detection can only detect “known malware.” In fact, signature-based detection is unable to detect even variants of known malware. Therefore, signature-based detectors provide very limited zero-day protection. Moreover, since a signature-based detector has to use a separate signature for each malware variant, the database of signatures also grows at an exponential rate.
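Both disadvantages can be seen in a toy scanner. The following is a minimal sketch, not how any particular product works: the signature names and byte patterns are made up for illustration, and the junk insertion simply splits the pattern with a filler byte (0x90, the x86 no-op opcode) without trying to keep the program runnable.

```python
# Hypothetical signature database: name -> byte pattern (values are invented).
SIGNATURES = {
    "Demo.Dropper": bytes.fromhex("deadbeef4142"),
    "Demo.Keylogger": bytes.fromhex("cafebabe0102"),
}


def scan(program: bytes) -> list[str]:
    """Return the names of all signatures whose pattern occurs in the program."""
    return sorted(name for name, pattern in SIGNATURES.items() if pattern in program)


def insert_junk(program: bytes, pattern: bytes, junk: bytes = b"\x90") -> bytes:
    """Simulate junk insertion by splitting one occurrence of pattern with filler."""
    idx = program.find(pattern)
    if idx < 0:
        return program  # pattern not present, nothing to obfuscate
    mid = idx + len(pattern) // 2
    return program[:mid] + junk + program[mid:]


# A pretend malicious program that embeds a known pattern.
original = b"\x00\x01" + SIGNATURES["Demo.Dropper"] + b"\x02\x03"
# The same program after one filler byte is inserted inside the pattern.
variant = insert_junk(original, SIGNATURES["Demo.Dropper"])

print(scan(original))  # ['Demo.Dropper']
print(scan(variant))   # []
```

The variant differs from the original by a single byte, yet the scanner no longer recognizes it. Catching it would require a new signature, which is why each variant adds to the database.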
Whitelisting: beyond signature-based
Signature-based approaches are no longer enough. But what are the alternatives? Let’s evaluate whitelisting and several types of behavior-based anti-malware.
Whitelisting is a popular way for people to actively manage the software that is installed on their computers. Whitelisting tools permit only approved software to install and run; software that is not explicitly on the control list is blocked. Whitelisting is a very promising way to protect computers, but it also creates a very rigid environment with strict rules about what software can be downloaded.
But whitelisting has three shortcomings. First, it can create an annoying computer experience. Users are subjected to pop-up warnings constantly. Second, whitelisting limits users’ ability to easily download and use new software. And third, whitelisted applications can be vulnerable. For example, if you whitelist a browser, then any malware that operates inside the browser will not be detected. In fact, a lot of malware injects itself into the browser.
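A whitelist check can be sketched in a few lines. This example assumes the whitelist is a set of SHA-256 hashes of approved binaries, which is one common design but not the only one; the binaries and the function name `may_run` are hypothetical stand-ins.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest used as the identity of a binary."""
    return hashlib.sha256(data).hexdigest()


def may_run(binary: bytes, whitelist: set[str]) -> bool:
    """Permit a binary to launch only if its hash is on the whitelist."""
    return sha256_hex(binary) in whitelist


# Pretend executables; real tools would hash the files on disk.
browser_binary = b"pretend browser executable"
unknown_tool = b"pretend new utility"

# The administrator has approved only the browser.
approved = {sha256_hex(browser_binary)}

print(may_run(browser_binary, approved))  # True
print(may_run(unknown_tool, approved))    # False
```

The check runs once, when a program is launched. Script code that later executes inside the approved browser is never presented to `may_run`, which is the third shortcoming described above.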
Behavior-Based Anti-Malware Approaches
Behavior-based approaches to malware detection monitor behaviors of a program to determine whether it is malicious or not. The behavior of a program that is typically monitored is the stream of system calls that the program issues to the operating system. Since behavior-based techniques monitor what a program does, they are not susceptible to the shortcomings of signature-based detection discussed earlier. Simply put, a behavior-based detector determines whether a program is malicious by inspecting what it does rather than what it says.
It is clear that the industry needs to move beyond signature-based detection. But how that will happen is still very much debated. Several types of behavior-based detection exist.
Anomaly detection
One major approach to behavior-based detection is anomaly detection. In this approach to malware detection, a profile of normal program behavior is constructed. Any deviations from that profile are flagged as anomalous and thus suspicious. Anomaly detection is analogous to credit card fraud detection. Credit card companies maintain “spending profiles” for their customers. Any significant deviation from these profiles is flagged as suspicious.
For example, if a credit card company notices a large expense in a shop in Europe, and the customer has not shopped in Europe in the last few years, they will flag that transaction as anomalous. Similarly, let’s say a program, during its normal execution, never writes to a certain sensitive directory. If the monitoring system notices writes to that sensitive directory from the program, the detection system will flag that behavior as anomalous. Anomaly detection has the following two shortcomings:
Shortcoming No. 1: It is susceptible to false positives
Normal behavior for complex programs is very complicated. For example, the set of behaviors of Internet Explorer is very complex. Therefore, it is very hard to construct a model of normal behavior for a complex program. An inadequate model of normal behavior can lead to false positives.
Shortcoming No. 2: It is susceptible to mimicry attacks
It has been demonstrated that anomaly detection-based techniques are susceptible to mimicry attacks. In a mimicry attack, an attacker transforms an attack into another, equally malicious attack that the model of normal program execution allows.
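One simple way to build a profile of normal behavior is to record every short window of consecutive system calls seen during normal runs and flag any window not seen before. The sketch below assumes that sliding-window model with a window of three calls; the traces and call names are invented, and the code looks only at call names, not at arguments or effects.

```python
def ngrams(trace, n=3):
    """Set of all length-n windows of consecutive calls in a trace."""
    return {tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)}


def build_profile(normal_traces, n=3):
    """Profile of normal behavior: every window observed in normal runs."""
    profile = set()
    for trace in normal_traces:
        profile |= ngrams(trace, n)
    return profile


def anomalies(trace, profile, n=3):
    """Windows in the trace that never occurred in the normal profile."""
    return sorted(ngrams(trace, n) - profile)


# Hypothetical system-call traces recorded while the program ran normally.
normal_traces = [
    ["open", "read", "close"],
    ["open", "read", "write", "close"],
]
profile = build_profile(normal_traces)

# A naive attack writes immediately after opening, which was never observed.
naive_attack = ["open", "write", "close"]
# A mimicry attack pads the same write with a harmless read so that every
# window matches the profile.
mimicry_attack = ["open", "read", "write", "close"]

print(anomalies(naive_attack, profile))    # [('open', 'write', 'close')]
print(anomalies(mimicry_attack, profile))  # []
```

The naive trace is flagged, while the padded trace performs the same write and passes unnoticed. A profile that is too small would also flag legitimate runs it had never seen, which is the false-positive problem in miniature.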
Specification-Based Monitoring
Specification-based monitoring is a type of behavior-based detection. In the specification-based approach to detection, all events from the program to the operating system are mediated by a specification or policy. The policy dictates what action should be taken for a sequence of events. Typically, the actions are allow, deny or log.
For example, we might have a policy for a browser which states that “any files downloaded from a Web site (not on a whitelist) cannot be automatically executed.” Under this policy, a file downloaded from a Web site that is not on the whitelist cannot be executed automatically. These kinds of policies can be very effective in addressing important infection vectors such as drive-by downloads.
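The browser policy above can be sketched as a small monitor that sees each event and answers allow, deny or log. This is an illustration under simplifying assumptions: the event shape, the class name `DownloadPolicy` and the host names are hypothetical, and the sketch denies every execution of an untrusted download rather than distinguishing automatic from user-initiated execution.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    DENY = "deny"
    LOG = "log"


@dataclass(frozen=True)
class Event:
    kind: str         # "download" or "execute"
    path: str         # file the event refers to
    origin: str = ""  # source site, used for downloads only


class DownloadPolicy:
    """Files downloaded from sites off the whitelist may not be executed."""

    def __init__(self, whitelist):
        self.whitelist = set(whitelist)
        self.untrusted = set()  # paths downloaded from non-whitelisted sites

    def decide(self, event: Event) -> Action:
        if event.kind == "download":
            if event.origin in self.whitelist:
                self.untrusted.discard(event.path)
                return Action.ALLOW
            # Remember the file and record the download, but let it proceed.
            self.untrusted.add(event.path)
            return Action.LOG
        if event.kind == "execute" and event.path in self.untrusted:
            return Action.DENY
        return Action.ALLOW


policy = DownloadPolicy({"updates.example.com"})
events = [
    Event("download", "/tmp/a.exe", "unknown.example.net"),
    Event("execute", "/tmp/a.exe"),
    Event("download", "/tmp/b.exe", "updates.example.com"),
    Event("execute", "/tmp/b.exe"),
]
decisions = [policy.decide(e) for e in events]
print([d.value for d in decisions])  # ['log', 'deny', 'allow', 'allow']
```

The monitor only enforces whatever rule `decide` encodes, so the rule can be changed or tuned without touching the code that intercepts events.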
Specification-based monitoring has the following two advantages over anomaly detection.
Advantage No. 1: It has flexibility
Specification-based monitoring decouples policy construction from enforcement. For example, one can imagine having a policy in a specification-based monitoring system that is derived using anomaly detection. Therefore, in an abstract sense, specification-based monitoring is more general than anomaly detection.
Advantage No. 2: It has low false positives
Since policies in a well-engineered, specification-based monitoring system can be easily tuned, such a system can achieve a very low false-positive rate.
Dr. Somesh Jha is co-founder and chief scientist of NovaShield. He has more than 19 years of research and development experience (both academic and industrial) in security and IT. He is currently a member of the faculty at the Department of Computer Science at the University of Wisconsin-Madison. He focuses primarily on computer system security and is a frequent speaker at security-related conferences and events across the United States.
Dr. Jha has been recognized through numerous awards including the NSF CAREER award, ACM SIGSOFT distinguished paper award and best paper award at ACSAC. He also conducted four years of advanced research during a postdoctoral fellowship at Carnegie Mellon University’s Computer Emergency Response Team (CERT). He serves on the editorial board of the Journal of Computer Security, and on the program committees for WORM05, RAID05, USENIX Security Symposium and WWW (Security and Privacy Track).
Dr. Jha completed his PhD in Computer Science at Carnegie Mellon University and his B.Tech in Electrical Engineering at IIT-Delhi, India. He can be reached at jha@cs.wisc.edu.