Modern Web 2.0 applications are “composites” that pull in content and services not just from inside the data center but also from third-party service and content providers beyond the firewall. For example, a single composite Web 2.0 application may include shopping carts, search engines, user reviews or analytics supplied by specialized third-party service providers.
The average Web application today comprises content and services from eight separate third-party providers, each of which can impact the performance (speed and reliability) of the user experience. In fact, external components account for an overwhelming share of the time it takes a user to download a Website or application.
Modern Web applications, more feature-rich and complex than ever, have made effective capacity management and planning difficult; a more comprehensive approach is required. The stakes are also higher than ever, as user expectations for lightning-fast, reliable Web experiences reach new heights.
Consider that two seconds (down from four seconds just three years ago) is now the threshold customers will tolerate before growing frustrated, abandoning your Website and heading to a competitor’s. These demands apply not just during “normal” traffic periods but during peak traffic as well, when customers may realize your Website is likely inundated with visitors but simply don’t care. For example, a recent study found that 67 percent of consumers expect Websites to work well regardless of how many visitors they have at any given time, and 78 percent will quickly switch to a competitor’s Website if they encounter slowdowns, errors or transaction problems.
Traditional Capacity Management Approaches
Within this context, many traditional approaches to capacity management are now outdated. First and foremost, traditional approaches often manage capacity based on utilization rather than on the response times users experience as they move through critical workflows on a Website. For example, an organization may determine that a particular Web server in its data center is operating at 50 percent CPU utilization and can therefore handle more load. However, this approach leaves a blind spot where performance is concerned: utilization is not linearly related to system performance and does not reveal the point at which response times begin to degrade.
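To make the contrast concrete, here is a minimal sketch (in Python) of a response-time-oriented check: it times a request to a hypothetical critical-workflow URL and compares the result against an assumed four-second target, rather than reading a CPU gauge. The URL and the threshold are placeholders for illustration, not values taken from any particular environment.

    import time
    import urllib.request

    # Hypothetical critical-workflow endpoint and response-time target (seconds).
    # Both values are illustrative placeholders.
    WORKFLOW_URL = "https://www.example.com/checkout"
    RESPONSE_TIME_TARGET = 4.0

    def measure_response_time(url: str) -> float:
        """Time a single user-facing request to the critical workflow."""
        start = time.monotonic()
        with urllib.request.urlopen(url, timeout=30) as response:
            response.read()  # include the full download, as a user would experience it
        return time.monotonic() - start

    if __name__ == "__main__":
        elapsed = measure_response_time(WORKFLOW_URL)
        # Capacity decisions keyed to what users experience, not to CPU utilization.
        if elapsed > RESPONSE_TIME_TARGET:
            print(f"Workflow took {elapsed:.2f}s, above the {RESPONSE_TIME_TARGET}s target")
        else:
            print(f"Workflow took {elapsed:.2f}s, within the target")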
In addition, traditional approaches often test application performance from inside the data center, relying on synthetic traffic generated from one’s own servers in order to gauge the speed and availability of Websites and applications under various load sizes. The issue with this approach is that users don’t live in the data center. They live at the outer edges of the Internet and their experiences are subject to an extremely wide range of performance-impacting variables beyond the data center (including not just third-party service providers but also ISPs, carriers, content delivery networks and other elements).
A composite application is more than the sum of its components; it’s the whole that emerges when all the components described earlier, which often have interdependencies, work in concert. If one component slows down, the remaining components may not be invoked in a timely manner, and application performance degrades in a cascading fashion.
So testing only from inside the data center never captures how a user ultimately sees and experiences an application. Yet it is this user experience that ultimately determines the success of a Web initiative. Internal, data center-focused approaches therefore prevent capacity managers from identifying capacity needs beyond the firewall, an understanding that is essential to ensuring seamless execution and delivery of composite applications.
Limitations of Traditional Capacity Management Approaches
Another limitation of traditional approaches is that they often entail reacting to performance problems by deploying additional dedicated IT resources. Not only does this require an organization to maintain significant (and often costly) overhead of idle capacity for every important resource, but, given users’ rapidly diminishing tolerance for poor performance, many users will already have abandoned your application or moved on to a competitor’s Website by the time you react. Capacity managers need a new approach that enables them to proactively identify and fix capacity bottlenecks across all components, ideally before users become aware that a problem exists.
Finally, organizations may also make the mistake of treating past utilization trends as a reliable indicator of future capacity requirements. For example, an online travel Website may determine that its virtualized infrastructure supported a mission-critical reservations booking application throughout last year’s busy season at 60 percent CPU utilization. Not only does this fail to address the question of performance (“What was performance like for users at 60 percent utilization? Fast and reliable or slow and disappointing?”), but it also does not take into account any new features or functionality added within the past year (video and product tours, for example) that may lower the utilization threshold at which performance degrades.
The Cloud’s Impact on Application Performance
In an effort to increase capacity for composite Web applications, many organizations are deploying virtualized or cloud-based infrastructures. However, much like third-party services, virtualized and cloud infrastructures can introduce new layers of complexity and the potential for performance degradation. Organizations, therefore, need to understand how virtualized and cloud-based infrastructures impact application performance, especially during times of peak traffic.
For example, just as organizations often base capacity decisions on utilization levels rather than performance, many virtualization vendors fail to correlate utilization to performance. Virtualization technologies may provide CPU utilization figures, but these do not answer central questions such as, “How many virtual instances can I create, and how much load can I put on each virtual system, before performance for users becomes unacceptable?” The answers to these questions should guide capacity decisions. When user performance gets close to the “unacceptable” threshold, it may be time to add a new virtual machine or other capacity.
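As a crude illustration of letting that threshold, rather than a CPU figure, trigger the decision, the sketch below flags when measured user response times approach an assumed “unacceptable” limit; the limit, the safety margin and the sample values are all hypothetical.

    # Hypothetical values: an "unacceptable" response-time limit and a safety margin.
    UNACCEPTABLE_SECONDS = 4.0
    SAFETY_MARGIN = 0.8  # consider adding capacity at 80 percent of the limit

    def needs_more_capacity(recent_response_times: list[float]) -> bool:
        """Return True when user-facing response times approach the unacceptable limit."""
        worst = max(recent_response_times)
        return worst >= UNACCEPTABLE_SECONDS * SAFETY_MARGIN

    if __name__ == "__main__":
        # Made-up samples from a single virtual instance under its current load.
        samples = [2.4, 2.9, 3.3]
        if needs_more_capacity(samples):
            print("User response times are nearing the limit: consider adding a virtual instance.")
        else:
            print("Current capacity still delivers acceptable response times.")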
Similarly, organizations need to weigh performance, not just capacity, when evaluating a cloud service provider. Many organizations are attracted to the cloud by the promise of elasticity, which suggests that capacity will be there whenever and wherever it’s needed (making the task of capacity management much simpler). However, cloud customers need to demand answers to questions such as, “Will this extra capacity ramp up fast enough to be invisible and seamless to my users?” and “If a ‘neighbor’ in my cloud environment experiences a sudden spike in traffic, how can I be sure my users won’t experience a decline in performance?”
Today’s New Approach to Capacity Management
Today’s composite Web applications require new approaches to capacity management. First and foremost, any effort to optimize performance under various load sizes must be based on a realistic view of the user’s actual experience, which is the only reliable source for pinpointing which user segments may be vulnerable to performance degradation. In other words, performance testing from the user’s actual browser or device, also known as an “outside-in” approach, is the only way to truly understand the user experience under various load sizes and under the extremely wide range of performance-impacting variables beyond the data center.
One point worth emphasizing here is that you need to measure user performance across all the geographies where your key user segments are based. Some third-party services may perform very differently from one location to the next (New York versus Los Angeles, for example), and the cascading performance deterioration described earlier can unfold differently as a result. If you don’t test across key user geographies, you risk leaving important user segments behind.
Once you understand the user’s experience and know which segments may be vulnerable to a performance drop-off, you can trace back through all the elements standing between those users and your data center to identify problem areas where capacity may need to be added. The source may be internal (within your data center) or external; you may, for example, diagnose the source of a slowdown as a third-party or cloud service provider. Armed with this knowledge, you can highlight and verify the performance breakdown in order to enforce service-level agreements (SLAs) and elicit the necessary capacity additions, ideally before users even become aware of the performance issue.
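As a rough sketch of that trace-back step, the snippet below times each element of a hypothetical composite page separately (the origin servers, a third-party search service, an analytics tag) so that the slowest contributor stands out; the component URLs and the per-component threshold are illustrative assumptions, not a prescribed SLA.

    import time
    import urllib.request

    # Hypothetical components of a composite page; the URLs are placeholders.
    COMPONENTS = {
        "origin (data center)": "https://www.example.com/",
        "third-party search": "https://search.example.net/api",
        "analytics tag": "https://analytics.example.org/beacon.js",
    }
    COMPONENT_THRESHOLD_SECONDS = 1.0  # illustrative per-component threshold

    def time_component(url: str) -> float:
        """Time one component request; treat failures as a breached threshold."""
        start = time.monotonic()
        try:
            with urllib.request.urlopen(url, timeout=10) as response:
                response.read()
        except OSError:
            return float("inf")
        return time.monotonic() - start

    if __name__ == "__main__":
        timings = {name: time_component(url) for name, url in COMPONENTS.items()}
        # Slowest component first, so the likely capacity bottleneck is obvious.
        for name, seconds in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
            status = "over threshold" if seconds > COMPONENT_THRESHOLD_SECONDS else "ok"
            print(f"{name:22s} {seconds:8.2f}s  {status}")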
Instead of managing capacity based on utilization, new approaches to capacity management identify the “breaking points” of the individual elements that collectively impact the user experience under various load sizes. This represents a much more informed approach to capacity management, and enables optimal trade-offs between performance and infrastructure investment.
As an example, an organization may determine that a Web farm operating at 50 percent utilization maintains an acceptable response time of just below four seconds for the most critical user segments and geographies under heavy load, but that performance begins to drop off as utilization creeps past 50 percent. With this knowledge, the organization can design and partition systems around an optimal utilization level of 50 percent, striking the proper balance between maintaining too many idle resources and risking reputation-damaging problems from too little capacity.
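One way to locate that kind of breaking point is sketched below: ramp concurrent load in steps against a hypothetical page and record the median response time at each step, stopping once it crosses an assumed four-second target. The URL, step sizes and target are illustrative, and a real test would of course use far larger, geographically distributed load.

    import statistics
    import time
    import urllib.request
    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical target page, response-time target and load steps.
    TARGET_URL = "https://www.example.com/"
    RESPONSE_TIME_TARGET = 4.0
    CONCURRENCY_STEPS = [5, 10, 20, 40, 80]

    def timed_request(url: str) -> float:
        """Time one request; treat failures as infinitely slow."""
        start = time.monotonic()
        try:
            with urllib.request.urlopen(url, timeout=30) as response:
                response.read()
        except OSError:
            return float("inf")
        return time.monotonic() - start

    def run_step(concurrency: int) -> float:
        """Fire `concurrency` simultaneous requests and return the median response time."""
        with ThreadPoolExecutor(max_workers=concurrency) as pool:
            results = list(pool.map(timed_request, [TARGET_URL] * concurrency))
        return statistics.median(results)

    if __name__ == "__main__":
        for concurrency in CONCURRENCY_STEPS:
            median = run_step(concurrency)
            print(f"{concurrency:3d} concurrent users -> median {median:.2f}s")
            if median > RESPONSE_TIME_TARGET:
                print("Breaking point reached: response time now exceeds the target.")
                break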
Assuring Application Performance No Matter the Load Size
With so many variables to manage, both internal and external, assuring consistently superior performance under various load sizes for today’s composite Web applications can seem like a daunting capacity management challenge. Organizations can address this challenge by starting with a view into how their Web pages and applications perform under various load sizes for users around the world.
For example, an organization can combine load generated from the cloud with load generated from a global testing network comprising real user desktops and devices to determine how performance holds up for key user segments (for example, Internet Explorer users in North America or Firefox users in Europe).
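A minimal sketch of that kind of segmentation follows: it groups made-up measurement records by browser and geography and reports a 95th-percentile response time per segment against an assumed four-second target. The records, segments and target are invented purely for illustration.

    from collections import defaultdict
    from statistics import quantiles

    # Made-up measurements gathered from real user desktops and devices:
    # (browser, geography, response time in seconds).
    MEASUREMENTS = [
        ("Internet Explorer", "North America", 2.1),
        ("Internet Explorer", "North America", 3.8),
        ("Internet Explorer", "North America", 5.2),
        ("Firefox", "Europe", 1.9),
        ("Firefox", "Europe", 2.4),
        ("Firefox", "Europe", 2.2),
    ]
    RESPONSE_TIME_TARGET = 4.0  # illustrative target

    def p95(values: list) -> float:
        """95th-percentile response time; quantiles() needs at least two samples."""
        return quantiles(values, n=100)[94] if len(values) > 1 else values[0]

    if __name__ == "__main__":
        by_segment = defaultdict(list)
        for browser, geography, seconds in MEASUREMENTS:
            by_segment[(browser, geography)].append(seconds)

        for (browser, geography), samples in by_segment.items():
            segment_p95 = p95(samples)
            status = "at risk" if segment_p95 > RESPONSE_TIME_TARGET else "ok"
            print(f"{browser} / {geography}: p95 {segment_p95:.2f}s ({status})")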
Proactively identifying and resolving performance issues anywhere, whether inside or beyond the firewall, significantly reduces Websites’ downtime, improves response times and speeds problem resolution. Organizations can conduct load tests before application deployment, throughout deployment, and whenever peak traffic is expected or significant changes (such as the addition of new features) are made to an application.
Capacity managers can play a unique role in protecting their organization’s investments, brand, customer satisfaction and revenues by knowing the optimal point of utilization at each critical touchpoint, which supports the dual goals of end-to-end application performance and informed, judicious resource allocation.
Imad Mouline is CTO of Gomez. A veteran of software architecture, research and development, he is a recognized expert in Web application development, testing and performance management. His expertise spans Web 2.0, cloud computing, Web browsers, Web application architecture and infrastructure, and software as a service. Prior to Gomez, he was CTO at S1 Corp. He can be reached at imouline@gomez.com.