IT organizations struggle to operate at optimal resource utilization. When IT resources are underutilized, the organization cannot artificially create demand to use them fully. When resources are overextended, it cannot quickly create new capacity. An overextended enterprise could potentially leverage a publicly available cloud infrastructure, but quality of service (QoS), trust and control issues remain.
IT can anticipate demand to a degree, but it cannot generate demand when resources are idle. When you attach a windmill to your home, you can expect an average amount of wind over the year. But when there is no wind, you cannot generate your own. On the other hand, when you are generating more power than you need, you can sell the excess back to the grid. Ideally, power would be dynamically allocated among all the resources that need it.
Some would argue that simply taking inventory of what you already have and reusing it, such as bringing your own reusable cloth bag to the supermarket checkout, achieves some savings. I personally have not spent as much time as some thinking about how to optimize my grocery bags, but I have certainly thought a great deal about how to maximize IT infrastructure. The following are three steps to maximizing an IT infrastructure.
Steps to maximizing an IT infrastructure
Step No. 1: The first step to efficiency is to build an understanding of the entire inventory at your disposal. Once you are fully aware of the resources available to you, then you can create a strategy for using those resources to their fullest.
Step No. 2: The next step is to understand how those resources are dependent upon each other. From that knowledge, you can plan the most effective way to manage those relationships without adversely affecting the application owners’ experiences or service levels. Understanding and documenting what service level thresholds are expected in your environment is key to maximizing the usability and ultimate reclamation of your resources.
Step No. 3: Creating a detailed matrix of the individual capacity of each component of your infrastructure will help establish a baseline. For example, isolating CPU, memory, storage and network capacity as initial metrics will help you collect accurate figures for actual usage of these elements over a given period (such as 40 hours). Within that period, maintenance windows and non-peak hours might be less relevant to your usage analysis.
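As a rough illustration of Step No. 3, the capacity matrix and its usage baseline could be sketched as follows. This is only a minimal sketch: the `Capacity` class, the `baseline_usage` function, the VM name, the units and every sample value are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from statistics import mean

# The four initial metrics named in Step No. 3.
METRICS = ("cpu", "memory", "storage", "network")


@dataclass
class Capacity:
    """Allocated capacity for one VM (units are illustrative assumptions)."""
    cpu: float      # GHz
    memory: float   # GB
    storage: float  # GB
    network: float  # Mbit/s


def baseline_usage(samples):
    """Average each metric over the collection period.

    samples is a list of dicts, one per sampling interval.
    """
    if not samples:
        raise ValueError("no samples collected")
    return {m: mean(s[m] for s in samples) for m in METRICS}


# Hypothetical capacity matrix with a single VM.
capacity_matrix = {
    "web-01": Capacity(cpu=4.0, memory=8.0, storage=100.0, network=1000.0),
}

# Two made-up usage samples for that VM.
samples = [
    {"cpu": 1.0, "memory": 4.0, "storage": 40.0, "network": 100.0},
    {"cpu": 3.0, "memory": 6.0, "storage": 40.0, "network": 300.0},
]
baseline = baseline_usage(samples)
```

Keeping allocated capacity and measured usage in separate structures makes it straightforward to compare the two later.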
Collecting usage metrics
Collecting usage metrics may not be an easy task, however. System management agents that were designed for older-generation physical servers, when installed on your virtual machines, may artificially skew your results by overburdening the hypervisors with extra CPU and network load.
Collecting metrics via agentless methods may also prove challenging if you are working with several hundred VMs. Once you have the data, a matrix comparing actual usage with the capacity you have allocated will give you a good idea of how efficiently each of your VMs is running.
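One way to picture the usage-versus-capacity matrix is as a percentage of allocation per VM and per metric. The sketch below assumes usage and allocation have already been collected into plain dictionaries. The function name `utilization_matrix`, the VM names and all figures are hypothetical.

```python
def utilization_matrix(allocated, used):
    """Return {vm: {metric: percent of allocated capacity in use}}."""
    matrix = {}
    for vm, alloc in allocated.items():
        usage = used.get(vm, {})
        matrix[vm] = {
            metric: round(100.0 * usage.get(metric, 0.0) / cap, 1)
            for metric, cap in alloc.items()
            if cap > 0  # skip metrics with no allocation to avoid dividing by zero
        }
    return matrix


# Made-up allocation and average-usage figures for two VMs.
allocated = {
    "web-01": {"cpu": 4.0, "memory": 8.0},
    "db-01": {"cpu": 8.0, "memory": 32.0},
}
used = {
    "web-01": {"cpu": 3.0, "memory": 6.0},
    "db-01": {"cpu": 1.6, "memory": 8.0},
}
matrix = utilization_matrix(allocated, used)
```

With these sample numbers, `web-01` runs at 75 percent of its CPU and memory allocation, while `db-01` uses only 20 percent of its CPU and 25 percent of its memory.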
The outliers will be evident immediately. VMs that track well below their allocation during peak hours may be ideal candidates for reduction, and their capacity can be recycled. VMs that are over-utilized may benefit from a reallocation of resources. Alternatively, if the over-utilization points to a fault, a root cause analysis may expose a configuration change from the original template as the source of the error.
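Judging a VM by its peak-hour behavior means filtering samples by time before comparing them with the allocation. The following is a minimal sketch under stated assumptions: the 09:00 to 16:59 peak window, the `peak_utilization` function and the sample readings are all hypothetical, and a real environment would define peak hours from its own traffic patterns.

```python
from datetime import datetime

# Assumption: peak hours run from 09:00 through 16:59 local time.
PEAK_HOURS = range(9, 17)


def peak_utilization(samples, allocated):
    """Highest usage seen during peak hours, as a percent of allocation.

    samples is a list of (timestamp, usage) tuples.
    Returns None if no sample falls within peak hours.
    """
    peak = [usage for ts, usage in samples if ts.hour in PEAK_HOURS]
    if not peak:
        return None
    return 100.0 * max(peak) / allocated


# Made-up memory readings (GB) for a VM allocated 8 GB.
samples = [
    (datetime(2024, 1, 8, 3, 0), 7.5),   # overnight job, outside peak hours
    (datetime(2024, 1, 8, 10, 0), 2.0),
    (datetime(2024, 1, 8, 14, 0), 3.0),
]
pct = peak_utilization(samples, allocated=8.0)
```

Here the overnight spike is ignored, and the VM peaks at 37.5 percent of its allocation during business hours, which would make it a candidate for review.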
Once you understand the utilization range of each VM relative to its capacity, you can easily craft a threshold window. In an ideal world, a VM should be neither over-allocated nor under-allocated or, even worse, left entirely idle while consuming precious resources. For Web servers, a VM should typically average in the 65 to 80 percent utilization range. If your VM is averaging 20 percent, then maybe it’s time to rethink your allocation.
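A threshold window like the one described above can be expressed as a simple classification. This sketch uses the 65 to 80 percent Web server range from the text as default bounds. The `classify` function and its labels are hypothetical, and the bounds would need to be adjusted for other workloads.

```python
def classify(avg_pct, low=65.0, high=80.0):
    """Place a VM's average utilization relative to a threshold window.

    The defaults reflect the 65 to 80 percent range suggested for Web servers.
    """
    if avg_pct == 0:
        return "idle"
    if avg_pct < low:
        # Usage sits below the window: the VM has more than it needs.
        return "over-allocated"
    if avg_pct > high:
        # Usage sits above the window: the VM has less than it needs.
        return "under-allocated"
    return "within window"


for pct in (0.0, 20.0, 72.0, 95.0):
    print(f"{pct:5.1f}% -> {classify(pct)}")
```

The window boundaries are passed as parameters so that a different range can be supplied for each class of server.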
Negotiating customer expectations
The real question is what customers expect from their thresholds. This is something that you will ultimately have to negotiate. Most application owners would prefer to be over-allocated to account for the unanticipated, which translates to ongoing waste of memory and storage. Achieving a balance between usage and capacity will ultimately enable the lowest-cost VMs for your customers and application owners, and extend the usable life of your existing servers, storage area networks (SANs) and network infrastructure.
Finally, through this process, several optimal VMs will be identified within a threshold negotiated with the application owners. These VMs can be converted to templates for future provisioning.
Clearly, this manual process would benefit greatly from automation that takes into account dynamic changes in your environment, as well as input from your customers. Until products offering that automation are available, you can reduce, reuse and recycle some of your best-optimized VMs through this process.
John Suit is Principal Founder and CTO of Fortisphere. John founded Fortisphere in 2006, and is responsible for developing the core technology behind the Fortisphere product suite. Prior to founding Fortisphere, John was the founder and CTO of SilentRunner, a successful company that was ultimately sold to Computer Associates. John has held several leadership positions at both vice president and CTO levels, and he has invented and launched countless new products in the security space.
John continues to advise the Department of Defense and Directorate of Central Intelligence in the areas of virtualization security and management, as well as information operations. He can be reached at john.suit@fortisphere.com.