Staffing levels within IT operations (ITOps) departments are flat or declining, enterprise IT environments are more complex by the day and the transition to the cloud is accelerating. Meanwhile the volume of data generated by monitoring and alerting systems is skyrocketing, and operations teams are under pressure to respond faster to incidents.
Faced with these challenges, companies are increasingly turning to AIOps–the use of machine learning and artificial intelligence to analyze large volumes of IT operations data–to help automate and optimize IT operations. Yet before investing in a new technology, leaders want confidence that it will indeed bring value to end users, customers and the business at large.
Leaders looking to measure the benefits of AIOps and build key performance indicators (KPIs) for both IT and business audiences should focus on key factors such as uptime, incident response and remediation times, and predictive maintenance that prevents potential outages from affecting employees and customers.
Business KPIs connected to AIOps include employee productivity, customer satisfaction and Web site metrics such as conversion rate or lead generation. The bottom line: AIOps can help companies cut IT operations costs through automation and rapid analysis, and it can support revenue growth by enabling business processes to run smoothly and deliver excellent user experiences.
These common KPIs, provided for this eWEEK Data Points article by Ciaran Byrne, VP of Product Management at OpsRamp, can measure the impact of AIOps on business processes.
Data Point No. 1: Mean time to detect (MTTD)
This KPI refers to how long it takes for an issue to be identified. AIOps can help companies drive down MTTD through the use of machine learning to detect patterns, block out the noise and identify outages. Amid an avalanche of alerts, ITOps can understand the importance and scope of an issue, which leads to faster identification of an incident, reduced downtime and better performance of business processes.
Data Point No. 2: Mean time to acknowledge (MTTA)
Once an issue has been detected, IT teams need to acknowledge the issue and determine who will address it. AIOps can use machine learning to automate that decision-making process and quickly make sure that the right teams are working on the problem.
Data Point No. 3: Mean time to restore/resolve (MTTR)
When a key business process or application goes down, speedy restoration of service is essential. AIOps plays an important role here, using machine learning to determine whether the issue has been seen previously and, based on past experience, to recommend the most effective way to get the service back up and running.
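The three timing KPIs above can be computed from incident timestamps. The following is a minimal sketch, not a definitive implementation: the record fields (`started`, `detected`, `acknowledged`, `resolved`) and the sample incidents are hypothetical, and it assumes MTTR is measured from the start of the incident to resolution. Some teams measure it from detection instead.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records; field names and values are assumptions for illustration.
incidents = [
    {
        "started": datetime(2024, 1, 1, 9, 0),
        "detected": datetime(2024, 1, 1, 9, 10),
        "acknowledged": datetime(2024, 1, 1, 9, 15),
        "resolved": datetime(2024, 1, 1, 10, 0),
    },
    {
        "started": datetime(2024, 1, 2, 14, 0),
        "detected": datetime(2024, 1, 2, 14, 20),
        "acknowledged": datetime(2024, 1, 2, 14, 35),
        "resolved": datetime(2024, 1, 2, 16, 0),
    },
]


def mean_minutes(records, start_field, end_field):
    """Average elapsed minutes between two timestamps across all records."""
    return mean(
        (r[end_field] - r[start_field]).total_seconds() / 60 for r in records
    )


mttd = mean_minutes(incidents, "started", "detected")       # time to detect
mtta = mean_minutes(incidents, "detected", "acknowledged")  # time to acknowledge
mttr = mean_minutes(incidents, "started", "resolved")       # time to restore (assumed from start)

print(f"MTTD: {mttd} min, MTTA: {mtta} min, MTTR: {mttr} min")
```

Tracking these averages before and after an AIOps rollout gives a like-for-like comparison, provided the start and end points of each interval are defined the same way in both periods.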
Data Point No. 4: Service availability
Service availability is often expressed as a percentage of uptime over a period of time or as outage minutes per period of time. AIOps can help boost service availability through the application of predictive maintenance.
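The two ways of expressing availability convert into each other with simple arithmetic. As a rough sketch, assuming a 30-day reporting period and a hypothetical function name `availability_percent`:

```python
def availability_percent(outage_minutes, period_minutes):
    """Percentage of the period during which the service was up."""
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    if not 0 <= outage_minutes <= period_minutes:
        raise ValueError("outage_minutes must be between 0 and period_minutes")
    return 100.0 * (period_minutes - outage_minutes) / period_minutes


MINUTES_IN_30_DAYS = 30 * 24 * 60  # 43,200 minutes (assumed reporting period)

# 43.2 outage minutes in a 30-day period works out to roughly 99.9% uptime.
uptime = availability_percent(43.2, MINUTES_IN_30_DAYS)
print(f"{uptime:.3f}%")
```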
Data Point No. 5: Percentage of automated versus manual resolution
Increasingly, organizations are leveraging intelligent automation to resolve issues without manual intervention. Machine learning models can be trained to identify patterns, such as scripts previously executed to remedy a problem, and take the place of a human operator.
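This KPI is a simple share of resolved incidents by resolution method. A minimal sketch, assuming each resolved incident carries a hypothetical label of either "automated" or "manual":

```python
from collections import Counter


def resolution_shares(resolution_methods):
    """Return the percentage of incidents resolved by each method."""
    counts = Counter(resolution_methods)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {method: 100.0 * n / total for method, n in counts.items()}


# Hypothetical labels for four resolved incidents.
shares = resolution_shares(["automated", "manual", "automated", "automated"])
print(shares)
```

The same calculation applies to any two-way split of incidents, as long as every incident is labeled consistently.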
Data Point No. 6: User reported versus monitoring detected
IT operations should be able to detect and remediate a problem before the end user is even aware of it. For example, if application performance or Web site performance is slowing down by milliseconds, ITOps wants to get an alert and fix the issue before the slowdown worsens and affects users. AIOps enables the use of dynamic thresholds to ensure that alerts are generated automatically and routed to the correct team for investigation or auto-remediated when policies dictate.
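One common way to implement a dynamic threshold is to compare each new measurement against a baseline computed from recent history instead of against a fixed limit. The sketch below uses a rolling mean plus a multiple of the standard deviation. That is an assumption chosen for illustration, not necessarily how any particular AIOps product works, and the function name, window size, multiplier and latency samples are all hypothetical.

```python
from statistics import mean, stdev


def dynamic_threshold_alerts(samples, window=5, k=3.0):
    """Return indices of samples above mean + k * stdev of the preceding window."""
    if window < 2:
        raise ValueError("window must be at least 2 to compute a standard deviation")
    alerts = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]  # the `window` samples just before sample i
        threshold = mean(history) + k * stdev(history)
        if samples[i] > threshold:
            alerts.append(i)
    return alerts


# Hypothetical response times in milliseconds; the spike is at index 6.
latencies_ms = [100, 102, 98, 101, 99, 100, 140, 101]
alerts = dynamic_threshold_alerts(latencies_ms)
print(alerts)
```

Because the threshold follows recent behavior, a service that normally runs at 100 ms and one that normally runs at 500 ms can share the same rule without separate hand-tuned limits.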
Data Point No. 7: Time savings and associated cost savings
The use of AIOps, whether to perform automation or to identify and resolve issues more quickly, will result in savings in both operator time and business time to value. These savings have a direct impact on the bottom line.
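Operator time savings can be translated into cost with back-of-the-envelope arithmetic. The figures below (200 incidents per month, 30 minutes saved per incident, a $50 hourly rate) are hypothetical placeholders, not data from this article, and the function name is an assumption.

```python
def operator_cost_savings(incidents_per_month, minutes_saved_per_incident, hourly_rate):
    """Return (hours saved per month, cost saved per month) for operator time."""
    hours_saved = incidents_per_month * minutes_saved_per_incident / 60
    return hours_saved, hours_saved * hourly_rate


# Hypothetical inputs for illustration only.
hours, savings = operator_cost_savings(200, 30, 50.0)
print(f"{hours} hours saved, ${savings:,.2f} per month")
```

This covers operator time only. Business time to value, such as revenue protected by shorter outages, would need its own estimate.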
Data Point No. 8: In summary …
These KPIs can be correlated to business KPIs around user experience, application performance, customer satisfaction, improved e-commerce sales, employee productivity, and increased revenue. ITOps teams need the ability to quickly connect the dots between infrastructure and business metrics so that IT is prioritizing spend and effort on real business needs. Hopefully, as machine learning matures, AIOps tools can recommend ways to improve business outcomes or provide insights as to why digital programs succeed or miss the mark.