Enterprises and their NetOps teams are challenged with sifting through large amounts of incoming data to identify technical, performance and security problems as they arise on the network. This is traditionally a manual, time-intensive process, meaning NetOps teams are prioritizing solutions that will help identify issues and fix them quickly – AIOps is one of those solutions.
AIOps uses artificial intelligence to find and understand patterns and identify anomalies within large, complex data sets. According to Gartner, “AIOps combines big data and machine learning to automate IT operations processes, including event correlation, anomaly detection and causality determination.”
While there is a lot that AIOps can do, recent research indicates enterprises are prioritizing use cases that help quickly identify potential network issues (such as anomaly detection/intelligent alerting and escalation), and fix them as fast as possible (such as automated remediation for security incidents and IT service problems).
To explore this topic further, let’s dive into some recent research from EMA that evaluates AIOps usage and perceptions, and look at how AIOps-driven approaches can benefit NetOps teams.
Research: Prioritizing Use Cases
When it comes to AIOps, EMA’s research shows companies are clearly prioritizing use cases that are directly focused on keeping the network operating securely and efficiently. For example, anomaly detection, which involves exposing unusual activity or operation outside of normal parameters, is being prioritized or implemented at 56% of enterprises, making it the top use case for AIOps. That makes sense, considering that anomalies may point to serious operational or security issues.
Furthermore, artificial intelligence (AI) can be trained to quickly distinguish anomalies that truly threaten network operations from those that don’t, helping teams concentrate their efforts where they’re needed most.
As an example, enterprises need to define policies that detect deviations from the usual monthly trend, such as an unusual spike in bandwidth consumption. Such a spike can be tracked and narrowed down to specific network services or applications, which may be known or unknown to the enterprise. Spikes of this kind usually occur during an unscheduled server or data backup, or when certain applications, such as large file transfers or streaming, consume heavy bandwidth.
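To make the idea concrete, here is a minimal sketch of such a policy, assuming per-application bandwidth history is already available as simple lists of daily readings. The function name, data layout and z-score threshold are illustrative assumptions, not part of any particular AIOps product; real tools typically use far richer models.

```python
from statistics import mean, stdev

def detect_bandwidth_anomalies(history, today, z_threshold=3.0):
    """Flag applications whose usage today deviates from their usual trend.

    history: dict mapping application name -> list of past daily readings (hypothetical layout)
    today:   dict mapping application name -> today's reading
    Returns a list of (application, reading, reason) tuples.
    """
    anomalies = []
    for app, usage in today.items():
        past = history.get(app)
        # No usable baseline means the application is unknown to the enterprise.
        if not past or len(past) < 2:
            anomalies.append((app, usage, "unknown application"))
            continue
        mu = mean(past)
        sigma = stdev(past)
        if sigma > 0:
            z_score = (usage - mu) / sigma
        else:
            # A perfectly flat baseline: any increase counts as a spike.
            z_score = float("inf") if usage > mu else 0.0
        if z_score > z_threshold:
            anomalies.append((app, usage, "spike"))
    return anomalies

history = {
    "backup": [100, 110, 90, 105, 95],
    "video": [50, 55, 45, 52, 48],
}
today = {"backup": 400, "video": 51, "p2p": 300}
flagged = detect_bandwidth_anomalies(history, today)
```

In this example, the unscheduled backup is flagged as a spike, normal video traffic passes, and the previously unseen application is reported as unknown.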
When it comes to security incidents, the goal is to eliminate the threat as quickly as possible. Much of what’s involved in the initial response to a security event can be easily automated, provided you get the rules right, and this automated security incident remediation is the second most prioritized use case among enterprises (55%), according to EMA.
Automating the initial security response not only speeds resolution, it also frees the team to focus more closely on those areas that need direct human intervention. A common scenario for automated security incident remediation is when an unknown application or host/IP is flagged for using up network resources, services or enterprise bandwidth. Hosts outside the enterprise can be flagged, blacklisted and quarantined using access lists during this process.
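The flag-and-quarantine scenario can be sketched roughly as follows. This is an illustration under stated assumptions: flow records are simplified to (source IP, bytes) pairs, the byte limit is arbitrary, and the generated access-list entries use a generic, Cisco-like syntax for illustration only. Nothing here pushes configuration to a real device.

```python
import ipaddress

def find_hosts_to_quarantine(flows, enterprise_networks, byte_limit):
    """Return external source IPs whose total traffic exceeds byte_limit.

    flows: iterable of (source_ip, byte_count) pairs (hypothetical record format)
    enterprise_networks: list of CIDR strings considered internal
    """
    networks = [ipaddress.ip_network(cidr) for cidr in enterprise_networks]
    totals = {}
    for source_ip, byte_count in flows:
        totals[source_ip] = totals.get(source_ip, 0) + byte_count

    flagged = []
    for source_ip, total in totals.items():
        address = ipaddress.ip_address(source_ip)
        is_internal = any(address in network for network in networks)
        if not is_internal and total > byte_limit:
            flagged.append(source_ip)
    return sorted(flagged)

def build_acl_entries(hosts):
    """Render one deny entry per flagged host (illustrative syntax)."""
    return [f"deny ip host {host} any" for host in hosts]

flows = [
    ("10.0.0.5", 5000),
    ("203.0.113.7", 800),
    ("203.0.113.7", 700),
    ("198.51.100.9", 100),
]
blocked_hosts = find_hosts_to_quarantine(flows, ["10.0.0.0/8"], byte_limit=1000)
acl_entries = build_acl_entries(blocked_hosts)
```

Keeping detection and rule generation in separate functions leaves room for a human approval step between the two, which matters while the rules are still being tuned.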
Handling a High Volume of Alerts
As discussed, NetOps and SecOps teams face a high volume of alerts on a daily basis, and the sheer volume of noise can hide serious operational or security issues. Because artificial intelligence excels at pattern recognition, intelligent alerting/escalation (53%) is the third most prioritized use case by enterprises.
Depending on the type and severity of a network security breach, service policies can be set up to alert on or escalate the issue. For simple network anomalies, teams can also configure basic alerting and blacklisting, which blocks unrecognized traffic patterns defined via service policies and retains them for future analysis.
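One simple way to picture severity-based service policies is as a lookup from event severity to an action. The sketch below is a hypothetical model: the severity levels, the ServicePolicy fields and the action names are assumptions made for illustration, not the schema of any specific tool.

```python
from dataclasses import dataclass

# Assumed ordering of severity levels, lowest to highest.
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}

@dataclass
class ServicePolicy:
    name: str
    min_severity: str  # lowest severity at which the policy applies
    action: str        # e.g. "alert" or "escalate"

def choose_action(event_severity, policies, default="log"):
    """Return the action of the strictest policy the event qualifies for."""
    level = SEVERITY_ORDER[event_severity]
    matching = [p for p in policies if SEVERITY_ORDER[p.min_severity] <= level]
    if not matching:
        # Nothing applies: just record the event for later analysis.
        return default
    strictest = max(matching, key=lambda p: SEVERITY_ORDER[p.min_severity])
    return strictest.action

policies = [
    ServicePolicy("notify-netops", "medium", "alert"),
    ServicePolicy("page-on-call", "critical", "escalate"),
]
low_action = choose_action("low", policies)
high_action = choose_action("high", policies)
critical_action = choose_action("critical", policies)
```

Here a low-severity event is only logged, a high-severity event raises an alert, and a critical event is escalated.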
As with security incident remediation, automating problem mitigation within IT services reduces mean time to resolution (MTTR) and helps ensure operational efficiency. That makes automated IT service problem remediation (52%) the fourth most prioritized AIOps use case for enterprises.
To address this, teams can customize robust incident management policies, based on service-level or application-level incidents, with proper alert mechanisms (which is becoming an important priority for enterprises). Policies for logging, tracking and managing different incidents also need to be properly planned to ensure the right remediation.
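To check whether remediation policies like these are actually shortening resolution times, MTTR can be computed from incident records. The sketch below assumes each incident is stored as an (opened, resolved) pair of timestamps; the function name and record format are hypothetical.

```python
from datetime import datetime

def mean_time_to_resolution(incidents):
    """Return the average resolution time in minutes.

    incidents: list of (opened, resolved) datetime pairs (hypothetical format)
    """
    if not incidents:
        raise ValueError("at least one incident is required")
    total_seconds = sum(
        (resolved - opened).total_seconds() for opened, resolved in incidents
    )
    return total_seconds / len(incidents) / 60

incidents = [
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 30)),  # 30 minutes
    (datetime(2024, 1, 1, 12, 0), datetime(2024, 1, 1, 13, 30)),  # 90 minutes
]
mttr_minutes = mean_time_to_resolution(incidents)
```

Comparing this figure before and after automation is introduced gives a rough measure of the benefit.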
In keeping with the above findings, most enterprises tend to start AIOps deployments and integration around network security infrastructure, such as firewalls or intrusion detection and protection solutions, to better detect anomalies, escalate alerts and remediate security issues. That said, application infrastructure, including data center switching, cloud networks and application delivery network solutions, is a strong secondary priority. A final focus area for AIOps solution deployment is Wi-Fi and WAN infrastructure.
AIOps is about Data
Given these AIOps priorities, and the fact that, as with anything AI/ML related, AIOps is all about data, it’s no surprise that enterprises find data management (48%) the top skill needed for network teams. In fact, earlier research found that poor data quality is a major technical challenge in successfully applying AIOps to network and security management. Beyond a data background, enterprises prioritize general AI and infrastructure knowledge (42%) as a second skill priority.
This indicates that some enterprises might be developing internal AIOps capabilities, or want to modify commercial solutions. Similarly, both algorithm development and API skills (39%) are high on the priority list, again showing organizations are building or fine-tuning the underlying algorithms, and working hard to more broadly integrate software and tools into the AIOps landscape.
The overarching insight is that enterprises looking to be successful with AIOps are looking to supplement their network or security teams with specific data, AI, algorithm and integration skillsets.
Enterprises want efficient answers to complex problems to speed resolution. AIOps allows organizations to employ AI/ML to supplement an IT team’s ability to quickly identify and mitigate threats to overall network performance or security, through use cases including anomaly detection, automated security incident remediation and more.
With new tools come new skills for NetOps teams to learn, such as data management, AI knowledge and algorithm development. Ultimately this can help these teams and companies streamline workflows, better interpret data and efficiently and securely manage the network.
ABOUT THE AUTHOR:
Jubil Mathew is a Technical Engineer at LiveAction