Getting the most out of mountains of log data can be trying, to say the least.
At a conference where many attendees focus on defeating security, independent researcher Alexandre Pinto wants to find ways to make defending enterprise networks both smarter and easier. At the upcoming Black Hat conference in Las Vegas, Pinto plans to discuss how machine-learning algorithms can help organizations get more value from their logs.
"The amount of security log data that is being accumulated today, be it for compliance or for incident response reasons, is bigger than ever," said Pinto. "Given a recent push on regulations such as PCI and HIPAA, even small and medium companies have a lot of data stored in log management solutions no one is looking at. So, there is a surplus of data and a shortage of professionals that are capable of analyzing this data and making sense of it."
SIEM (security information and event management) functionality relies too heavily on deterministic rules, he added. For example, a rule might state that if something happens in a network "X" number of times, it should be flagged as suspicious. The problem is that the "somethings" and the "Xs" differ between organizations and evolve over time, he said.
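To make the idea of a deterministic rule concrete, here is a minimal Python sketch of a fixed-threshold check. It is not taken from any SIEM product: the event format, the `flag_sources` helper, the `login_failure` event type and the threshold of 3 are all hypothetical, chosen only for illustration.

```python
from collections import Counter

def flag_sources(events, event_type, threshold):
    """Return sources that produced `event_type` at least `threshold` times.

    `events` is an iterable of (source, event_type) pairs. This record
    shape is an assumption for the sketch, not a real SIEM schema.
    """
    counts = Counter(src for src, kind in events if kind == event_type)
    return sorted(src for src, n in counts.items() if n >= threshold)

# Hypothetical sample data.
events = [
    ("10.0.0.5", "login_failure"),
    ("10.0.0.5", "login_failure"),
    ("10.0.0.5", "login_failure"),
    ("10.0.0.9", "login_failure"),
    ("10.0.0.9", "login_success"),
]

# The "something" is a failed login and "X" is 3.
flagged = flag_sources(events, "login_failure", threshold=3)
print(flagged)
```

Both the event type and the threshold are hard-coded by whoever writes the rule, which is the tuning burden described above: a value that suits one network may be too noisy or too lax on another.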
"But this is not exclusively a tool problem," he said. "I have seen really talented and experienced people be able to configure one of these systems to really perform well. But it usually takes a number of months or years and a couple of these SOC [security operations center] supermen to make this happen. I used to run teams like these in my previous position, and I understand the challenges involved."
After managing security consultants and security monitoring teams for years, he began researching ways to improve the experience for analysts. His answer: machine learning.
"The [Black Hat] talk is about a model I created to help classify malicious behavior from log data and help companies make decisions based on this trove of information they have available," Pinto explained. "It does not outperform a well-trained analyst. But it can greatly enhance the analyst's productivity and effectiveness by letting him focus on the small percentage of data that is much more likely to be malicious based on previous happenings on the network."
Machine learning is designed to infer relationships from large amounts of data, he added. The more data available, the better the predictions, which makes it a "good deal" for security.
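The article does not describe how Pinto's model works, so the following Python sketch is only a stand-in for the general idea of letting an analyst focus on the events most likely to be malicious based on past activity. It replaces a real machine-learning model with a simple smoothed frequency score, and the `train` and `prioritize` helpers, the network-block features and the sample history are all hypothetical.

```python
from collections import defaultdict

def train(history):
    """Learn a per-feature malicious rate from labeled past events.

    `history` is an iterable of (feature, is_malicious) pairs. Add-one
    smoothing keeps rarely seen features away from 0.0 or 1.0.
    """
    totals = defaultdict(int)
    bad = defaultdict(int)
    for feature, is_malicious in history:
        totals[feature] += 1
        bad[feature] += int(is_malicious)
    return {f: (bad[f] + 1) / (totals[f] + 2) for f in totals}

def prioritize(model, new_events, top_fraction=0.1, default=0.5):
    """Return the highest-scoring slice of `new_events` for analyst review.

    Features never seen in training receive the neutral `default` score.
    """
    ranked = sorted(new_events, key=lambda e: model.get(e, default), reverse=True)
    keep = max(1, int(len(ranked) * top_fraction))
    return ranked[:keep]

# Hypothetical labeled history: (source network, was it malicious?).
history = [
    ("198.51.100.0/24", True),
    ("198.51.100.0/24", True),
    ("203.0.113.0/24", False),
    ("203.0.113.0/24", False),
    ("203.0.113.0/24", False),
]
model = train(history)

new_events = ["203.0.113.0/24", "198.51.100.0/24", "192.0.2.0/24"]
review_queue = prioritize(model, new_events, top_fraction=0.5)
print(review_queue)
```

As more labeled history is fed to `train`, the scores settle closer to the observed rates, a small-scale version of the point that more data yields better predictions. A production system would use far richer features and a real learning algorithm.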