Artificial intelligence for IT operations (AIOps) is now officially an IT thing. This platform approach to new-gen technology promises to transform the efficiency and effectiveness of the modern IT operations team that’s often buried under floods of alerts, data, deadlines and pressure.
AIOps is a Gartner Research-defined platform category that combines big data and artificial intelligence functionality to enhance and partially replace a broad range of IT operations processes and tasks, including availability and performance monitoring, event correlation and analysis, and IT service management. Applications include text analytics, advanced analytics, facial and image recognition, machine learning and natural language generation.
In this eWEEK Data Points article, Bhanu Singh, Senior Vice-President of Product Management and Cloud Operations at OpsRamp, draws on industry information to suggest five steps any organization should take before adopting AIOps.
Data Point No. 1: Define the use case.
Start by identifying what AIOps can and needs to accomplish within your organization. Do you need to provide service availability through incident remediation? Should AIOps support your ITSM practice with alert escalation, suppression and de-duplication? Or, is it part of your DevOps initiative, providing continuous, actionable insights through data and metrics ingestion and inference modeling?
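To make the ITSM use case concrete, here is a minimal Python sketch of alert de-duplication. It is an illustration only, not any vendor's implementation: the Alert fields, the (host, check) grouping key and the five-minute window are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    # Hypothetical alert shape; real monitoring tools expose different fields.
    host: str
    check: str
    severity: str
    timestamp: float  # seconds since epoch


def deduplicate(alerts, window_seconds=300.0):
    """Keep the first alert per (host, check); drop repeats that arrive
    within window_seconds of the last alert kept for the same key."""
    last_kept = {}  # (host, check) -> timestamp of the last kept alert
    kept = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        key = (alert.host, alert.check)
        previous = last_kept.get(key)
        if previous is None or alert.timestamp - previous >= window_seconds:
            kept.append(alert)
            last_kept[key] = alert.timestamp
    return kept
```

A team would tune the grouping key and the window to its own environment; a production platform would typically also correlate related alerts across hosts rather than only collapsing exact repeats.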
Data Point No. 2: Set success benchmarks.
Typically, success metrics for AIOps will include mean time to resolution (MTTR), prediction and prevention of outages, increased employee productivity, and cost savings from automating repetitive manual tasks or eliminating multiple point tools. Tracked consistently, these benchmarks validate whether the use case is actually being accomplished.
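As one example of a benchmark, MTTR can be computed as the average time between an incident being opened and resolved. The sketch below assumes incidents are available as (opened, resolved) timestamp pairs; that record shape and the function name are illustrative, not taken from any particular ITSM tool.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Average of (resolved - opened) over a list of
    (opened, resolved) datetime pairs."""
    durations = [resolved - opened for opened, resolved in incidents]
    if not durations:
        raise ValueError("at least one resolved incident is required")
    return sum(durations, timedelta()) / len(durations)


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        (datetime(2020, 1, 1, 9, 0), datetime(2020, 1, 1, 9, 30)),
        (datetime(2020, 1, 2, 14, 0), datetime(2020, 1, 2, 15, 30)),
    ]
    print(mean_time_to_resolution(sample))
```

Measuring the same figure before and after an AIOps rollout, over comparable periods, gives a simple baseline for judging whether the tooling is helping.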
Data Point No. 3: Segment data that matters.
Enterprises with expansive customer bases, such as ecommerce companies, healthcare organizations or streaming content services, will want to ensure platform availability, low-latency data transmission and service quality by analyzing data that helps predict or avoid service outages.
Alternatively, some operations teams will be more interested in data that highlights application performance, uptime, dependencies and downstream effects on other systems.
Data Point No. 4: Make an adaptable data collection and analysis plan.
AIOps tools rely on data from the highest-priority endpoints among the potentially thousands of devices, components or customer touchpoints common in sprawling IT environments.
IT operations teams must proactively plan for how to handle the various formats and states of data (structured, unstructured or semi-structured), based on the algorithm and ingestion engine. The plan may need to evolve over time, as certain data lakes or resources prove more useful than others. In some cases, native instrumentation will be a better choice than a data-agnostic tool because it provides cleaner datasets.
These data sources will also impact analysis plans. Here, you can optimize your AIOps ingestion engine to produce actionable insights.
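One way to picture the handling of mixed formats is a normalization step that maps structured, semi-structured and unstructured inputs onto a single record shape before analysis. The following Python sketch assumes three input kinds (a dict, a JSON string and a free-text log line) and a made-up log layout of "host SEVERITY message"; real ingestion engines handle far more variety.

```python
import json
import re

# Assumed free-text layout: "<host> <SEVERITY> <message>".
LOG_PATTERN = re.compile(
    r"^(?P<host>\S+)\s+(?P<severity>[A-Z]+)\s+(?P<message>.+)$"
)


def normalize(raw):
    """Return a dict with host, severity and message keys,
    whatever form the incoming event takes."""
    if isinstance(raw, dict):  # already structured
        record = raw
    else:
        try:
            record = json.loads(raw)  # semi-structured JSON text
        except json.JSONDecodeError:
            match = LOG_PATTERN.match(raw.strip())  # unstructured log line
            record = match.groupdict() if match else {"message": raw.strip()}
    if not isinstance(record, dict):  # e.g. a bare JSON number or list
        record = {"message": str(record)}
    return {
        "host": record.get("host", "unknown"),
        "severity": str(record.get("severity", "info")).lower(),
        "message": record.get("message", ""),
    }
```

Falling back to defaults such as "unknown" keeps the pipeline running when a source is messy, at the cost of lower-quality records, which is one reason cleaner native instrumentation can be preferable.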
Data Point No. 5: Set up the automation.
Establishing automated workflows, runbooks and processes for fundamental activities like application performance monitoring, security breach alerts and resource provisioning is paramount for AIOps readiness. Once you’ve identified the data, it’s time to automate as much as possible to leverage the effectiveness of AIOps and replace the routine tasks normally associated with alert management with more sophisticated automated scripts.
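A simple way to think about runbook automation is a lookup from alert type to a scripted response, with anything unrecognized escalated to a person. The Python sketch below is hypothetical throughout: the alert types, the alert dictionary keys and the actions (which only return a description instead of touching real systems) are placeholders for whatever an organization's own runbooks define.

```python
def restart_service(alert):
    # Placeholder action: a real runbook would call an orchestration tool.
    return f"restart {alert['service']} on {alert['host']}"


def clean_temp_files(alert):
    # Placeholder action for a disk-space alert.
    return f"clean temporary files on {alert['host']}"


# Hypothetical mapping from alert type to automated runbook step.
RUNBOOKS = {
    "service_down": restart_service,
    "disk_full": clean_temp_files,
}


def handle(alert):
    """Run the matching runbook, or escalate when none is defined."""
    action = RUNBOOKS.get(alert["type"])
    if action is None:
        return f"escalate to on-call: {alert['type']} on {alert['host']}"
    return action(alert)
```

Keeping the escalation path explicit means automation covers the routine cases while unfamiliar alerts still reach the operations team.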