10 Reasons Companies' IT Incidents Take Time to Resolve
We examine why resolving companies' IT incidents is so challenging and time-consuming and offer guidelines to help improve performance and IT communications.
Systems do not have the ability to integrate or the organization has not committed the resources needed to implement integration. Solution: When the system monitoring and ticketing process involves managing disparate solutions, manual efforts must be made to proactively communicate information. Multiple log-ins and manual data entry create extra workflow steps that decrease efficiency. The goal should be to have your systems integrated so that when a problem occurs, the right information and resources needed to fix an issue are easily accessible and automatically communicated to the correct teams, customers and stakeholders. An alerting or communication platform that is integrated with your monitoring and IT service management platforms, as well as your on-call schedules, can help facilitate this and ensure information flows efficiently.
Different Communications Standards
International communication standards vary by country and can require specific protocols and routing. For example, messages have to clear China's "Great Firewall" and providers can have unique specifications per country. Solution: Ensure your international communications support global delivery with local call routing, support for local caller ID, dedicated long and short codes for improved international SMS delivery, automatic text-to-speech and prompts for multiple languages, localized Web user interface for individual or regional preferences, compliant and private local data centers, and mobile apps for those with limited cellular connectivity.
Communication Efficiency Issues
If the same efficiency issues consistently occur when communicating around incidents, you are not engaging in adequate process review, after-action reporting and improvement. Solution: After-action reporting is the first step in analyzing incidents with the goal of reducing "mean time to know." These reports should be accurate and contain relevant data on your communication tactics. You should be able to tell which paths are most effective, how long it takes to get a confirmation from on-call staff, how long it takes to set up a conference bridge, and whether your incident templates provide the necessary information.
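As a rough illustration of the kind of after-action data described here, the sketch below computes a confirmation rate and an average time to confirmation for each contact path. The record layout (`path`, `sent_at`, `confirmed_at`, in seconds) and the sample values are made up for the example; a real report would draw on whatever your alerting platform actually logs.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical alert log: times are seconds since the incident started.
# confirmed_at is None when the recipient never confirmed.
alerts = [
    {"path": "sms", "sent_at": 0, "confirmed_at": 90},
    {"path": "sms", "sent_at": 0, "confirmed_at": None},
    {"path": "push", "sent_at": 0, "confirmed_at": 30},
    {"path": "push", "sent_at": 10, "confirmed_at": 60},
]

def summarize(alert_log):
    """Return confirmation rate and mean seconds to confirm for each path."""
    by_path = defaultdict(list)
    for alert in alert_log:
        by_path[alert["path"]].append(alert)

    report = {}
    for path, rows in by_path.items():
        delays = [
            row["confirmed_at"] - row["sent_at"]
            for row in rows
            if row["confirmed_at"] is not None
        ]
        report[path] = {
            "confirmation_rate": len(delays) / len(rows),
            "mean_seconds_to_confirm": mean(delays) if delays else None,
        }
    return report

report = summarize(alerts)
for path, stats in report.items():
    print(path, stats)
```

The same grouping could be extended to other measures mentioned above, such as the time needed to set up a conference bridge.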
Sending Multi-path Messages
Manually sending messages via multiple paths is time-consuming and difficult to track. Solution: Organizations should use automated multi-modal alerting to notify the contact paths (voice, SMS, email, push notification and pager) that are most likely to reach recipients. For example, if a text is sent to someone where a cell signal isn't available, the message should be automatically resent as a push notification so it can be received over a wireless network on the recipient's desktop or mobile phone. According to Everbridge research, multi-path broadcasts have a 79 percent higher confirmation rate than a single path, suggesting that individuals are more likely to engage with messages that are delivered to multiple devices.
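The automatic resend described above can be sketched as a simple fallback loop. This is a minimal illustration, not any vendor's API: the function name `deliver_with_fallback` and the stand-in senders `send_sms` and `send_push` are hypothetical, and each sender is assumed to return `True` when delivery succeeds.

```python
from typing import Callable, List, Optional, Tuple

Sender = Callable[[str], bool]  # assumed to return True on successful delivery

def deliver_with_fallback(
    message: str, paths: List[Tuple[str, Sender]]
) -> Tuple[Optional[str], List[str]]:
    """Try each contact path in order and stop at the first that delivers.

    Returns the name of the path that delivered (or None) and the list of
    paths that were attempted, so the attempt can be tracked afterwards.
    """
    attempted = []
    for name, send in paths:
        attempted.append(name)
        if send(message):
            return name, attempted
    return None, attempted

# Stand-in senders: SMS fails (no cell signal), push succeeds over Wi-Fi.
def send_sms(message: str) -> bool:
    return False

def send_push(message: str) -> bool:
    return True

delivered_via, attempted = deliver_with_fallback(
    "Database cluster is degraded",
    [("sms", send_sms), ("push", send_push)],
)
print(delivered_via, attempted)
```

Recording the attempted paths alongside the result keeps the tracking that manual multi-path sending makes difficult.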
It can be difficult to collaborate and get a confirmation that a message was received or that someone has responded to the issue at hand. Solution: Your process needs to include the ability to confirm message receipt and acceptance of tasks. Confirmation of messages ensures responders are on the case, and can also halt escalation to stop further email inquiries, limiting alert fatigue. This is especially important with system-triggered workflows; for example, your monitoring platform can trigger incident messages along with tracking receipts, which are critical to confirming that messages are being received and acted upon.
Different Communications Solutions for Different Teams
Different teams may have different communication solutions. The incident management team may use a free mobile app while the network monitoring team may rely on traditional call trees and voicemails. When interdepartmental communication is required, multiple communication systems hamper the ability to collaborate. Solution: Companies should implement unified critical communication plans so that teams throughout the organization are communicating on one platform. In addition to improving the transfer of information, this will ensure that information is presented consistently.
Lack of Automation on Escalation Paths
Escalating issues to the right team may be difficult. Solution: Automation helps ensure that if the first on-call person or team does not respond, then the notification escalates to the next person or group based on the current on-call schedule.
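A minimal sketch of this escalation logic follows. It assumes the on-call schedule has already been resolved into an ordered list of responders, and that a hypothetical `notify_and_wait` callback sends the alert and returns `True` only if the responder confirms within the allowed time.

```python
from typing import Callable, Optional, Sequence

def escalate(
    on_call_chain: Sequence[str],
    notify_and_wait: Callable[[str], bool],
) -> Optional[str]:
    """Notify responders in schedule order; stop at the first who confirms.

    Returns the responder who took ownership, or None if nobody confirmed.
    """
    for responder in on_call_chain:
        if notify_and_wait(responder):
            return responder
    return None

# Simulated run: only "bob" confirms, so "carol" is never paged.
confirming = {"bob"}
paged = []

def fake_notify_and_wait(responder: str) -> bool:
    paged.append(responder)
    return responder in confirming

owner = escalate(["alice", "bob", "carol"], fake_notify_and_wait)
print(owner, paged)
```

Because the loop stops at the first confirmation, later responders in the chain are not disturbed once someone has taken ownership.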
Avoiding Alert Fatigue
Alert fatigue occurs because it's easier to communicate to as many people as possible, resulting in messages being sent to people who have no responsibility for an issue. If you can't tell what is a high priority, nothing is high priority. Solution: For various incident types, notifications can be triggered automatically from your monitoring and operations systems and connected to your ticketing and communication systems, where they can be automatically escalated to the right people. Communication templates should include the proper recipients and escalation paths, and these templates should be easily updated when staff members move within or out of the organization.
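One way to picture such templates is as a lookup from incident type to recipients and escalation path, so that an alert reaches only the people responsible for it. The incident types, group names and `service-desk` default below are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Template:
    recipients: List[str]
    escalation: List[str] = field(default_factory=list)

# Hypothetical templates; updating staff assignments means editing one entry.
TEMPLATES: Dict[str, Template] = {
    "database_outage": Template(["dba-on-call"], ["dba-lead"]),
    "network_latency": Template(["network-on-call"], ["network-lead"]),
}

# Unrecognized incident types go to a triage group instead of to everyone.
DEFAULT_TEMPLATE = Template(["service-desk"])

def recipients_for(incident_type: str) -> List[str]:
    """Return only the recipients responsible for this incident type."""
    return TEMPLATES.get(incident_type, DEFAULT_TEMPLATE).recipients

db_targets = recipients_for("database_outage")
unknown_targets = recipients_for("printer_jam")
print(db_targets, unknown_targets)
```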
Efficient Conference Calls
Receiving a conference bridge notification, finding a pen and paper to write down the number and PIN, and then dialing in can take too long when every second counts. Solution: A communications platform that offers one-click conference bridge functionality ensures that the right team members can join a call seamlessly and quickly.
Communicating to Groups Outside the IT Team
Often the staff members with the greatest understanding of an incident, its impact and timeline for resolution are the same ones who need to describe these incident details outside the IT team. Time taken to communicate about the incident to executives takes away from time spent resolving it, but it is still critical. Solution: Communication to groups outside the IT team should be done as efficiently as possible; the best way to achieve this is by using consistent systems and processes for incident communication to both IT and non-IT teams.