With virtualization, data center resources are shared in a large, adaptive communal pool of dynamic capacity. Because capacity is shared, there is a ripple effect in the data center cluster: when one application or virtual machine zigs, others must zag in response, since capacity is finite. As a result, capacity planning and management is a higher-stakes game in a virtual environment. IT shops need tools that intimately understand this dynamic new layer and can help ensure that adequate capacity is on hand when it’s needed.
So, what exactly is capacity management? Let’s start with some level-setting definitions. According to the Information Technology Infrastructure Library (ITIL), capacity management is the discipline that ensures IT infrastructure is used in the most efficient, predictable and cost-effective manner. In basic economic terms, it’s about balancing business demand with IT supply. All organizations practice capacity planning and management in some shape or form. It may not be as structured or programmatic as ITIL, but certainly the need is there and, fundamentally, we all do it to some degree.
To be clear, capacity management is not simply about ensuring enough IT capacity for the business. That part is easy; anyone can guarantee enough capacity by over-purchasing or over-provisioning. The key goals are efficiency and predictability.
It’s about finding the optimal balance of IT supply to meet the business demands at all times. It’s about reducing costs while minimizing waste and risk. As a result, effective capacity management ensures two things:
1. Efficiency (optimization of capacity): using every last bit of available capacity, without impacting the business.
2. Predictability (availability of capacity): ensuring capacity is always available and always on, whenever the business needs it.
Why Capacity Management Is Important
Whether your data center environment is physical, virtual or a hybrid, capacity management is an increasingly critical function in any IT organization today. Many companies are looking to implement a formalized capacity management model for three reasons:
1. Cost savings
It is difficult to get budget approvals, and waiting out long purchasing cycles for new hardware or infrastructure is tedious. In the past, once IT departments got a budget, they avoided these administrative headaches by pre-purchasing or over-purchasing hardware.
Often that hardware is not used until a month, a year or longer after it’s purchased, sitting idle at considerable expense. With the right capacity management tools and processes in place, you can justify each purchase and ensure that new hardware is deployed and utilized immediately.
2. Service availability
IT departments need to provide consistent, quality service to their business owners. This is difficult when capacity demands fluctuate unpredictably. Without proper capacity management, IT risks lower service availability and customer satisfaction. That is very costly and can threaten business viability, especially for mission-critical, externally facing applications.
3. Business planning
Like business owners, IT departments are required to have short-term and long-term plans. Creating these plans requires understanding historical capacity utilization and forecasting future capacity needs. Unless this is done systematically, you will lack the historical perspective and insight to accurately forecast future needs, especially in a dynamic virtual environment.
If capacity management is not done right, or not done at all, supply ends up out of balance with demand, resulting in wasted or insufficient resources. Wasted resources, whether purchased too soon or in excessive quantities, can be very costly. However, having insufficient resources is even worse, as this can impact how the business performs and is perceived.
Capacity Management in Physical vs. Virtualized Worlds
In the physical world, capacity management is relatively straightforward. Traditionally, capacity is project-driven based on the requirements of an individual line of business (LOB). In a one-application-per-server model, the business owner knows exactly what capacity is available. It’s clearly delineated and siloed; the server and all of its capacity are owned by one user or application.
Unfortunately, this resource silo leads to a fundamental dilemma: an explicit tradeoff between efficiency and predictability. In the physical world, efficiency is usually achieved when you plan for the short term (today): you provision IT capacity to match today’s highest peak and nothing more. The risk surfaces when capacity demand spikes unexpectedly beyond that peak.
Predictability is achieved when you plan for the longer time horizon (tomorrow): you mitigate risk by over-provisioning, carrying “spare” capacity into which you can grow, at the cost of unnecessary waste. The physical world too often requires optimizing for one goal or the other, predictability or efficiency. If an environment is fully efficient, it lacks the spare capacity needed to be entirely predictable. Adding capacity, the common response, may ensure predictability, but it results in inefficiency and waste.
Virtualization forces capacity planning, purchasing and provisioning decisions to be driven top-down in the context of an aggregate resource pool. Virtualization allows your capacity to be shared and adaptive, two fundamental benefits. Sharing allows capacity to behave as a communal pool of both used and spare resources. Adaptability allows capacity to expand and contract on an as-needed basis.
If an application or VM needs more resources, they can be borrowed from another VM that needs less. Further, this adaptability and flexibility allow compute, storage and network capacity to be added in granular increments. When a server is added to a resource pool, for example, its capacity becomes “communal property,” not owned by any particular application or even business unit. It becomes a slice of capacity available to whichever VM needs it most. With virtualization, capacity can be optimized for both efficiency and predictability by matching the peaks and valleys of individual applications across the environment.
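To make the peaks-and-valleys point concrete, here is a minimal Python sketch using invented hourly demand figures for three applications whose peaks fall at different times. In the siloed model, each server must be sized to its own application’s peak; in the pooled model, the cluster only needs the peak of the combined demand.

```python
# Illustrative only: invented hourly CPU demand (GHz) for three apps
# whose peaks occur at different times of day.
app_demand = {
    "web":   [20, 60, 90, 50, 30, 20],   # peaks mid-morning
    "batch": [80, 30, 10, 10, 20, 70],   # peaks overnight
    "bi":    [10, 20, 30, 80, 60, 20],   # peaks in the afternoon
}

# Physical/siloed model: each app owns a server sized to its own peak.
siloed_capacity = sum(max(series) for series in app_demand.values())

# Virtualized/pooled model: the cluster needs only the peak of the
# *combined* demand, because one app's peak lands in another's valley.
combined = [sum(hour) for hour in zip(*app_demand.values())]
pooled_capacity = max(combined)

print(f"Siloed capacity required: {siloed_capacity} GHz")  # 250 GHz
print(f"Pooled capacity required: {pooled_capacity} GHz")  # 140 GHz
```

In this invented case, the pool meets every application’s peak with roughly 44 percent less provisioned capacity; the actual savings depend entirely on how well the workloads’ peaks and valleys offset one another.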
New Requirements and Challenges in Virtual Environments
Although capacity management can be fundamentally improved with virtualization, managing capacity in a virtual environment brings new challenges, risks and opportunities. In a shared environment, the impact of any demand fluctuation is felt far and wide, by every business unit and application in the cluster; poorly managed capacity can cause a huge ripple effect.
Five examples of new requirements and challenges in a virtual environment include:
1. Virtualization introduces new considerations such as VM mobility and automated restart or failover, which have important capacity implications.
2. Capacity fragmentation and over-allocation, if not managed, can grow into significant waste across thousands of VMs, hosts and storage.
3. Resource bottlenecks need to be identified at a granular level, and can actually be made worse by adding more of a resource, such as CPU or storage, that you don’t need.
4. Over-allocated VMs represent wasted capacity, which can be identified and reclaimed (see the sketch following this list).
5. Poor VM placement can decrease utilization and cause resource contention.
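As an illustration of point 4, the sketch below flags reclamation candidates by comparing each VM’s allocated resources against its observed peak usage. The inventory rows, thresholds and names are all invented for the example; a real tool would pull these metrics directly from the virtualization layer.

```python
# Hypothetical inventory: (vm name, vCPUs, peak CPU %, memory GB,
# peak memory used GB). Real figures would come from platform metrics.
vms = [
    ("web-01",    8, 12.0, 32,  6.5),
    ("db-01",     4, 78.0, 16, 14.2),
    ("batch-02", 16,  9.0, 64,  8.0),
]

# Assumed policy thresholds, not a standard: a VM whose peak usage sits
# far below its allocation is a candidate to right-size.
CPU_THRESHOLD_PCT = 25.0
MEM_THRESHOLD = 0.25  # peak memory below 25% of allocation

for name, vcpus, peak_cpu, mem_gb, peak_mem in vms:
    if peak_cpu < CPU_THRESHOLD_PCT and peak_mem < MEM_THRESHOLD * mem_gb:
        print(f"{name}: right-size candidate "
              f"(peak CPU {peak_cpu}%, peak mem {peak_mem}/{mem_gb} GB)")
```

Across thousands of VMs, even modest per-VM reclamation of this kind can add up to entire hosts’ worth of capacity.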
Given this new world of virtualization and the increasing importance of effective capacity management, it is critical that IT organizations have a clear strategy in this area.
Alternatives for Managing Capacity in Dynamic Virtual Environments
Regardless of the practices and technologies used, the ultimate goal for capacity management is to balance IT supply with demand while maximizing efficiency and predictability. Fundamentally, it’s about developing capacity intelligence by understanding the following four things:
1. How much capacity you have (current/future, used/free)
2. How the capacity is being used (by whom and when)
3. How much capacity you will need (current/future)
4. When you will run out of capacity
Given the challenges and considerations in a fluid virtual environment, this capacity intelligence needs to be closely tied to the virtualization layer and delivered in as close to real time as possible.
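As a rough illustration, the following sketch works through all four questions for an invented 100 TB storage pool: the first three come straight from the data, and the fourth comes from fitting a straight-line trend to daily usage and projecting when it crosses total capacity. A real tool would account for seasonality, bursts and VM-level granularity rather than relying on a simple linear fit.

```python
# Invented daily usage (TB) for a 100 TB pool over two weeks.
TOTAL_TB = 100.0
usage = [61.0, 61.4, 62.1, 62.5, 63.2, 63.8, 64.1,
         64.9, 65.3, 66.0, 66.4, 67.1, 67.8, 68.2]

n = len(usage)
days = range(n)

# Ordinary least-squares slope and intercept for the usage trend.
mean_x = sum(days) / n
mean_y = sum(usage) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(days, usage))
         / sum((x - mean_x) ** 2 for x in days))
intercept = mean_y - slope * mean_x

# Questions 1 and 2: how much capacity there is and how much is used.
print(f"Used: {usage[-1]:.1f} TB, free: {TOTAL_TB - usage[-1]:.1f} TB")

# Question 3: how much will be needed (growth rate of demand).
print(f"Growth: {slope:.2f} TB/day")

# Question 4: when the trend line crosses total capacity.
days_left = (TOTAL_TB - intercept) / slope - (n - 1)
print(f"Projected exhaustion in ~{days_left:.0f} days")
```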
Capacity management approaches
There are many ways to practice capacity management but, generally speaking, they fall into three approaches: rules of thumb, homegrown solutions and purpose-built tools.
Approach No. 1: Rule of thumb
Rule of thumb involves guesstimates based on past experience. For example, if four VMs have generally run on one core in the past, the same ratio is assumed going forward. Obviously there are serious drawbacks to this approach in a dynamic environment, including inaccuracy and the difficulty of building a systematic process around it.
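To show how little machinery is behind such an estimate, here is the “four VMs per core” rule as a sizing calculation, with every input assumed rather than measured. The brittleness is the point: nothing in the model notices when the consolidation ratio stops holding.

```python
import math

# Assumed rule-of-thumb inputs; none of these are measured values.
VMS_PER_CORE = 4       # yesterday's consolidation ratio, taken on faith
CORES_PER_HOST = 16
planned_vms = 500

hosts_needed = math.ceil(planned_vms / (VMS_PER_CORE * CORES_PER_HOST))
print(f"Hosts needed: {hosts_needed}")  # 8

# If the new workloads are twice as heavy as the old ones, the real
# answer is closer to 16 hosts, and nothing here will catch that.
```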
Approach No. 2: Homegrown solutions
Homegrown solutions include scripts and spreadsheets. This is a more systematic approach than rules of thumb and, in the case of scripts, it may work in larger enterprises with sophisticated IT skills. However, this approach can quickly become expensive and time-consuming to maintain, and it may also be inaccurate, especially with a rapidly changing infrastructure. In a virtual environment, there are many intricacies in how VMs interact with the layers of infrastructure, so it is hard to get this right without a great deal of expertise.
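The sketch below shows the typical shape of such a script: read a nightly CSV export of per-host utilization and flag hosts with low headroom. The CSV layout, column names and thresholds are hypothetical; keeping them accurate as the environment changes is precisely the maintenance burden described above.

```python
import csv
import io

# Hypothetical nightly export; in a real shop the columns, units and
# collection job are all yours to define and keep in sync.
EXPORT = """host,cpu_used_ghz,cpu_total_ghz,mem_used_gb,mem_total_gb
esx-01,38.2,48.0,410,512
esx-02,12.7,48.0,120,512
esx-03,45.9,48.0,490,512
"""

HEADROOM_PCT = 20.0  # assumed policy: warn when under 20% free

for row in csv.DictReader(io.StringIO(EXPORT)):
    cpu_free = 100 * (1 - float(row["cpu_used_ghz"]) / float(row["cpu_total_ghz"]))
    mem_free = 100 * (1 - float(row["mem_used_gb"]) / float(row["mem_total_gb"]))
    if min(cpu_free, mem_free) < HEADROOM_PCT:
        print(f"{row['host']}: low headroom "
              f"(CPU {cpu_free:.0f}% free, memory {mem_free:.0f}% free)")
```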
Approach No. 3: Purpose-built tools
Purpose-built tools are the preferred approach for a virtual environment because they take the guesswork (and much of the labor) out of collecting and maintaining capacity information in a constantly changing environment. Perhaps most importantly, tools that are closely integrated with and aware of the virtualization layer can provide highly reliable, real-time intelligence.
With the right tool and process in place, IT administrators will have automated, real-time capacity intelligence to make day-to-day and strategic capacity management decisions in a virtual environment.
Rob Smoot is a Group Product Marketing Manager at VMware. Prior to VMware, Rob held various positions in product management, strategic planning and sales operations at Veritas Software, and was a management consultant at Andersen LLP. Rob graduated from Brigham Young University and received an MBA from the Wharton School at the University of Pennsylvania. He can be reached at rsmoot@vmware.com.