In the rapidly growing Web application services environment, business transactions are well traveled. They are carried over diverse and interconnected infrastructures, through networks, application servers, firewalls and virtualized systems, across heterogeneous operating systems and distributed service-oriented architectures (SOAs). The result is a Web application environment that is highly complex, vulnerable to multiple points of failure and difficult to manage in production.
Your challenge is to deliver the critical Web-based services in order to achieve your business and customer goals and, simultaneously, manage the performance and availability of these services 24/7. You need to optimize the user experience, predict and resolve problems before customers feel the pain, and improve service levels.
Meeting this challenge requires a new approach to Application Performance Management (APM), where IT becomes a strategic service provider and an innovation partner of the business organization. It does so by delivering high-quality, business-oriented IT services that are managed from the user’s perspective. Navigating this transformation presents many challenges. You may ask yourself the following questions:
1. “With so many interconnected parts in my infrastructure, how do I determine the source of performance problems quickly and end finger pointing between my IT operations teams?”
2. “How can I determine if users are affected by incidents before they call the help desk?”
3. “How can I correlate user transactions to the applications with which they interact?”
4. “When there are multiple problems, what is most important to fix first?”
5. “What can I do to gain insight into the business impact of poorly performing applications?”
6. “How can I measure service-level agreements (SLAs) accurately to demonstrate the value that IT provides to the business, partners and customers?”
To deliver consistently superior services aligned with business goals, you must manage the performance and availability of your critical Web applications 24/7 so that you can:
1. Understand the user experience, measure SLAs to spot problems before customers are affected and SLAs are breached, and deliver better customer service
2. Map all business transactions to the end-to-end infrastructure to identify the source of problems quickly and report on the scope, severity and business impact of transaction performance
3. Conduct incident triage and root-cause diagnosis to simplify troubleshooting and reduce mean time to repair
This transformation is a journey of continuous improvement. While you should tailor your approach to meet your business needs, there are some common habits you can adopt to deliver the high-performing, online application services your business demands, realize more stable revenue streams and provide measurable business results.
Habit No. 1: Set and measure SLAs on business processes
Effective business processes are critical to achieving consistently superior services aligned with business goals. Accordingly, it’s paramount to identify the business processes that matter most to your business, set and measure SLAs on those processes, monitor them 24/7 to consistently evaluate transaction success, and report results on a regular basis.
In doing so, you can understand the business impact of application performance, improve your ability to present relevant performance information in business terms, and bolster the level of engagement between IT and business stakeholders. A consistent approach to measuring business process SLAs, in which key criteria and performance metrics can be understood from a variety of perspectives, also helps business and IT management develop a shared methodology and language.
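As a rough illustration of what measuring an SLA on a business process can look like, the sketch below computes the share of transactions that both succeeded and finished within a response-time target. The `Transaction` record, the process names and the 2,000 ms target are assumptions made for this example, not part of any particular APM product.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    process: str        # hypothetical business process name, e.g. "checkout"
    duration_ms: float  # response time as experienced by the user
    succeeded: bool     # whether the transaction completed successfully


def sla_compliance(transactions, process, max_duration_ms):
    """Return the fraction of a process's transactions that met the SLA.

    A transaction meets the SLA here only if it succeeded and finished
    within max_duration_ms. Returns None when no transactions were seen.
    """
    relevant = [t for t in transactions if t.process == process]
    if not relevant:
        return None
    met = sum(1 for t in relevant
              if t.succeeded and t.duration_ms <= max_duration_ms)
    return met / len(relevant)


# Illustrative data only: four checkout transactions and one search.
sample = [
    Transaction("checkout", 800, True),
    Transaction("checkout", 2500, True),   # too slow
    Transaction("checkout", 900, False),   # failed
    Transaction("checkout", 1200, True),
    Transaction("search", 300, True),
]
checkout_compliance = sla_compliance(sample, "checkout", max_duration_ms=2000)
print(checkout_compliance)  # 0.5
```

Reporting a single compliance figure per business process, rather than per server or component, is one way to keep the conversation with business stakeholders in their own terms.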
Habit No. 2: Monitor 100 percent of all user transactions, all of the time
To gain an accurate and complete understanding of application performance, it’s important to manage 100 percent of business transactions end to end, 24/7, from the browser to the back end, as they traverse the complex, multi-tiered infrastructure. By monitoring real user transactions in production environments, you obtain valuable insight into the user experience and transaction success/failure. This allows you to quickly identify, triage, prioritize and resolve problems before your customers and business are affected.
You can manage all transactions in real time and map them to specific infrastructure components to pinpoint the source of problems quickly, eliminate finger pointing between IT operations teams and significantly reduce Mean Time to Repair (MTTR). In addition, managing application performance from the user perspective makes it possible to assign priority levels to users and organize users into groups to measure performance based on business priority.
When evaluating an APM solution, look for vendors that deliver a complete, fully integrated solution that combines real-time user experience management with business transaction monitoring in complex production environments, and that delivers this with little or no impact on the IT environment. While synthetic transaction monitors offer complementary functionality, they are limited in that they do not reveal what real users are experiencing in production. As a result, they cannot tell you how your system is executing against SLAs or business goals, and cannot map the user experience to the infrastructure components.
Habit No. 3: Employ predictive and proactive monitoring
Most enterprise-class IT infrastructures are highly complex, heterogeneous and distributed, which presents unique challenges for monitoring business transactions. In these environments, even small incidents, such as depletion of threads and resources, memory leaks, changes or errors, can have a significant impact on overall application performance.
To manage this, baseline monitoring and heuristic-based trending technologies provide an added layer of predictive and proactive analysis that enables you to identify and alert on problems before they impact users. These technologies also track internal failures and the application’s interactions with back-end systems to catch in-flight problems before they result in serious performance and availability issues.
An effective APM solution monitors all of this and more. It can even automatically notify leading system management solutions to integrate application performance insight into the broader infrastructure management environment. This can result in faster problem resolution, improved troubleshooting and streamlined management.
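To make the idea of baseline monitoring concrete, here is a minimal sketch that flags a measurement when it strays too far from the recent norm. It assumes a simple mean-and-standard-deviation baseline and a threshold of three standard deviations; commercial tools may use quite different heuristics, and the function name and sample values are hypothetical.

```python
from statistics import mean, stdev


def deviates_from_baseline(history, current, threshold=3.0):
    """Return True if `current` is more than `threshold` standard
    deviations away from the mean of `history`.

    `history` holds recent measurements of one metric, such as response
    times in milliseconds. With fewer than two samples there is no
    baseline yet, so nothing is flagged.
    """
    if len(history) < 2:
        return False
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any change at all is a deviation.
        return current != baseline
    return abs(current - baseline) / spread > threshold


# Illustrative response times (ms) for one transaction type.
recent = [200, 210, 190, 205, 195]
print(deviates_from_baseline(recent, 260))  # True: well above the norm
print(deviates_from_baseline(recent, 208))  # False: within normal variation
```

Because the baseline is derived from the metric's own history, the same check can be applied to thread counts, memory usage or error rates without hand-tuning a fixed limit for each one.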
Habit No. 4: Prioritize incidents based on business impact
Once problems are identified, you need a way to prioritize them based on their importance to the business. By assigning business value to successful and unsuccessful transactions, you can prioritize incidents based on the importance of the user, the criticality of the transaction and the severity of the issue. This gives you the real data you need and a foundation for resolving the most business-critical issues first.
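One simple way to turn these three factors into a working order is to score each incident and sort by the score. The sketch below assumes a 1-to-3 scale for each factor and multiplies them together; the scale, the weighting and the incident names are illustrative assumptions, and a real organization would choose its own.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    name: str
    user_priority: int            # 1 (low) to 3 (high): importance of the affected users
    transaction_criticality: int  # 1 (low) to 3 (high): importance of the transaction
    severity: int                 # 1 (low) to 3 (high): how badly it is failing


def business_impact(incident):
    """Combine the three factors into one score (higher means fix sooner)."""
    return (incident.user_priority
            * incident.transaction_criticality
            * incident.severity)


def prioritize(incidents):
    """Return incidents ordered from highest to lowest business impact."""
    return sorted(incidents, key=business_impact, reverse=True)


# Illustrative incidents only.
open_incidents = [
    Incident("slow search", 1, 2, 2),            # score 4
    Incident("checkout failures", 3, 3, 3),      # score 27
    Incident("report export timeout", 2, 1, 3),  # score 6
]
ordered_names = [i.name for i in prioritize(open_incidents)]
print(ordered_names)
```

Multiplying the factors means an incident must rate highly on all three to reach the top of the list; adding them instead would let one extreme factor dominate.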
Habit No. 5: Conduct rapid triage and root-cause analysis
Performance issues can be challenging and time-consuming to identify and resolve in large, complex, distributed IT environments. Multiple infrastructure tiers, interconnected and distributed components, legacy back-end components, SOAs and virtualized environments further amplify this challenge. Why is this so problematic for IT organizations?
Because there are simply more places for a business transaction to fail, making it even more difficult to isolate the problem. Accordingly, when an issue arises, there is often finger pointing among the different IT operations teams because they typically use specialty tools that are uniquely designed to identify problems within their specific areas of expertise (such as networks, servers and databases).
However, by monitoring business transactions as they traverse the end-to-end infrastructure, you can gain the insight to quickly isolate the problem within the appropriate tier, identify the root cause and engage only the relevant IT operations team to solve the problem (thus reducing MTTR while optimizing critical IT resources).
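The tier-isolation step can be sketched as follows, assuming each monitored transaction carries a breakdown of time spent per tier and that a normal (baseline) time is known for each tier. The tier names, timings and function name are hypothetical, and real root-cause analysis involves far more than a single comparison.

```python
def slowest_tier(tier_timings_ms, baselines_ms):
    """Return the tier that exceeds its baseline by the largest margin.

    tier_timings_ms maps tier name to time spent in that tier for one
    transaction; baselines_ms maps tier name to its normal time. A tier
    with no known baseline is treated as having a baseline of zero.
    Returns None if no tier is over its baseline.
    """
    if not tier_timings_ms:
        return None
    overruns = {
        tier: elapsed - baselines_ms.get(tier, 0.0)
        for tier, elapsed in tier_timings_ms.items()
    }
    tier, overrun = max(overruns.items(), key=lambda item: item[1])
    return tier if overrun > 0 else None


# Illustrative breakdown of one slow transaction.
trace = {"web": 40, "app": 120, "database": 900}
normal = {"web": 50, "app": 100, "database": 150}
suspect = slowest_tier(trace, normal)
print(suspect)  # database
```

With the suspect tier identified from the transaction itself, only the team responsible for that tier needs to be engaged first.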
Habit No. 6: Report results and assess maturity for continuous improvement
Proactive APM is a journey of continuous improvement and consistent methodologies. Automated reports give you the insight you need into SLA compliance, performance trends and capacity planning. Historical reports and automatic baselines help identify endemic problems, so you can improve business process maturity and keep your applications performing at the highest levels. Finally, the ability to understand changes within the application environment adds needed insight into many issues.
Prabhjot Singh is Vice President of Marketing for CA, Inc.’s Application Performance Management business unit, CA Wily Technology. Prabhjot is an eight-year veteran of CA Wily Technology, and is responsible for all marketing functions. Prior to this role, Prabhjot held several sales and marketing management roles during his tenure at CA Wily Technology.
Before joining CA Wily Technology, Prabhjot held management positions at Citigroup’s Global Technology division, where he had responsibility for monitoring technologies for Citigroup’s application and network infrastructure. Prabhjot earned a B.S. in Computer Systems Engineering from Boston University. He can be reached at prabhjot.singh@ca.com.