An explosion of systems and devices, complex architectures, pressures to deploy faster, and demand for optimal performance have placed greater and greater strain on infrastructure monitoring teams. For many, their current monitoring strategy and tools are just not enough.
Developing ever more sophisticated monitoring practices and capabilities is a journey. Enterprises should develop a capability maturity model to guide that journey, mapping out levels of capability and the steps to move from one level to the next, beginning with “basic monitoring” and progressing through full machine data intelligence. If you’re ready to get more value from your monitoring efforts and regain control of your operations, the first step is to move from basic to advanced monitoring.
A significant number of organizations today have “basic monitoring” in place, characterized by multiple teams using disparate monitoring tools for their own purposes and creating silos of metric data. It’s a patchwork environment that lacks standards and consistent processes; as a result, teams within the organization cannot share information in a clear and cohesive way. Critically for executive management, there is no way to get a comprehensive and consolidated view of the health and performance of the systems that underpin the business. At this stage, all your charts and graphs may look great, but you’re only one misstep away from a potential catastrophe.
At the advanced monitoring stage, an organization has established organization-wide monitoring by consolidating and rationalizing its monitoring and data collection capabilities across the enterprise. It can begin deriving additional value from monitoring data for use cases such as streaming analytics, fault/anomaly detection, root cause analysis, SLOs/SLAs and error budgeting.
Using industry information from Bob Moul, CEO of Circonus, here are five foundational components that are critical to moving from basic to advanced infrastructure monitoring, followed by the rewards of making that move. Circonus is a machine data intelligence and analytics platform for demanding enterprise use cases.
Data Point No. 1: Organizational Buy-In
It is imperative that leadership establishes a data-driven culture that embraces and values the benefits of unified monitoring. Leaders up to and including the CEO need to make it a clear mandate and priority. They also need to take the time to impress its strategic importance upon all team members, explaining why monitoring is so critical to business success and why decisions are being (or will be) made to change the way monitoring has been done up until now.
Data Point No. 2: A Comprehensive Inventory of Services and Infrastructure
At the heart of a robust monitoring program is an always up-to-date inventory. Document what services are running, where and how they run, what purpose they serve and what they connect to. Develop a plan and procedures so that new services and infrastructure automatically move into the inventory when they are provisioned and are monitored by default.
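As a rough illustration of an inventory where provisioning and registration are the same step, here is a minimal Python sketch. The `Service` and `Inventory` classes, their field names and the `provision` hook are all hypothetical and do not come from any particular product; a real inventory would typically be fed by provisioning tooling rather than by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """One inventory record: what runs, where, why, and what it connects to."""
    name: str
    location: str
    purpose: str
    depends_on: list = field(default_factory=list)
    monitored: bool = True  # assumption: monitoring is on unless explicitly disabled


class Inventory:
    """Hypothetical in-memory inventory keyed by service name."""

    def __init__(self):
        self._services = {}

    def provision(self, service):
        # Registering the service is part of provisioning it, so nothing
        # goes live without an inventory entry.
        self._services[service.name] = service
        return service

    def unmonitored(self):
        # Services that someone has explicitly opted out of monitoring.
        return sorted(s.name for s in self._services.values() if not s.monitored)

    def unknown_dependencies(self):
        # Connections that point at something the inventory does not know about.
        known = set(self._services)
        return sorted(
            {d for s in self._services.values() for d in s.depends_on if d not in known}
        )


inventory = Inventory()
inventory.provision(Service("checkout", "us-east", "payments", depends_on=["orders-db"]))
print(inventory.unmonitored())
print(inventory.unknown_dependencies())
```

The `unknown_dependencies` check is one way to surface gaps: any connection that points outside the inventory suggests something is running undocumented.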
Data Point No. 3: A Monitoring Plan Linked to Business Success
Understand what’s important to the business and what you should be monitoring to ensure you meet those goals. Iterate on your plan with business leaders over time to refine which metrics are most important.
Data Point No. 4: A Unified Monitoring Platform that is Metrics 2.0 Compliant
Metrics without context have no value. Monitoring solutions must be Metrics 2.0 compliant. Metrics 2.0 is a set of “conventions, standards and concepts around time series metrics metadata” with the goal of generating metrics in a format that is self-describing and standardized.
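To make “self-describing” concrete, the sketch below attaches context to a data point as key-value tags rather than encoding it in an opaque name. It is a minimal Python illustration only: the `Metric` class, the tag names `what`, `unit` and `host`, and the choice of required tags are assumptions for this example, not a statement of what the Metrics 2.0 conventions mandate.

```python
from dataclasses import dataclass, field


@dataclass
class Metric:
    """A single data point that carries its own context as tags."""
    value: float
    timestamp: int  # seconds since the Unix epoch
    tags: dict = field(default_factory=dict)


# Assumption for this sketch: every metric must say what it measures and in which unit.
REQUIRED_TAGS = ("what", "unit")


def missing_tags(metric, required=REQUIRED_TAGS):
    """Return the required tags a metric lacks, in the order they are required."""
    return [tag for tag in required if tag not in metric.tags]


def describe(metric):
    """Render a metric with all of its dimensions, sorted by tag name."""
    dims = ", ".join(f"{key}={val}" for key, val in sorted(metric.tags.items()))
    return f"{metric.value} [{dims}] @ {metric.timestamp}"


cpu = Metric(0.42, 1700000000, {"what": "cpu_usage", "unit": "percent", "host": "web01"})
print(missing_tags(cpu))
print(describe(cpu))
```

A metric that arrives without its required tags can be rejected or flagged at ingestion, which is one way a unified platform could enforce a shared standard across teams.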
Data Point No. 5: A Commitment to Learn and Iterate
Optimizing your monitoring solution means making a commitment to continuous improvement. There will always be room for improvement, and it’s more important to get started and iterate over time than to wait for immediate perfection.
Data Point No. 6: The Rewards of Advanced Monitoring
By elevating from basic to advanced infrastructure monitoring, you will have moved from being a reactive service provider to a strategic business partner able to help drive tangible business results. But you’ll get a host of other benefits as well, including:
- protection from being blindsided by preventable outages
- faster problem identification and resolution time
- the ability to answer any question at any time
- more confidence and speed in your decision making
For some organizations, basic monitoring processes and tools may be enough. But for others, they are just not sufficient. The power and value of monitoring grow exponentially the more you can harness all your metric data to confidently make the best decisions for your organization.
If you have a suggestion for an eWEEK Data Points article, email firstname.lastname@example.org.