Everyone’s talking about how the 35 to 40 percent energy savings promised by so-called green data center initiatives can help MIS operations dramatically reduce both their operating expenses and their environmental impact. But despite the buzz, little is being said about the most efficient and cost-effective way to achieve these savings, or about how to measure them. Meanwhile, the ambitious claims made by many hardware and software vendors make it hard to understand how much a particular solution actually adds to your operation’s overall energy efficiency.
Instead of focusing on the exotic technologies used by newly constructed, ultra-efficient data centers such as those used by Google, this article will deal primarily with the best ways to improve the efficiency of the many middle-aged and older facilities that still constitute the majority in use by enterprises, universities and government agencies. Since most real-world facilities have finite budgets and cannot afford downtime, we’ll pay special attention to upgrades that offer fast payback periods and pose a minimum of disruption to normal operations.
Metrics, models and methodologies
One popular efficiency ratio that we’ll use to understand the effectiveness of our upgrades is the Power Usage Effectiveness (PUE) factor, defined as the ratio of the total power consumed by the data center facility to the power consumed by its IT equipment.
PUE = Total Facility Power/IT Equipment Power
Although PUE is a widely accepted way to describe how data centers use their energy, care must be taken when interpreting the results it produces. Keep in mind that PUE is most useful for tracking the effect of changes you make on the infrastructure side and less useful for tracking the improvements that result from reducing the energy consumption of your data center’s IT equipment. Because IT equipment power is the denominator of the ratio, and infrastructure overhead does not necessarily shrink in proportion to it, cutting your equipment power consumption can actually result in a higher PUE.
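To make that last point concrete, here is a minimal sketch of the PUE calculation. The facility figures (500 kW of IT load, 1,000 kW of overhead) and the assumption that overhead stays constant when IT load drops are illustrative only.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


# Hypothetical facility: 500 kW of IT load plus 1,000 kW of overhead
# (cooling, power distribution losses, lighting).
it_kw = 500.0
overhead_kw = 1000.0
before = pue(it_kw + overhead_kw, it_kw)  # 1500 / 500

# Cut the IT load by 20 percent; assume the overhead does not change.
it_kw_after = it_kw * 0.8
after = pue(it_kw_after + overhead_kw, it_kw_after)  # 1400 / 400

print(f"PUE before: {before:.2f}, after: {after:.2f}")
# prints "PUE before: 3.00, after: 3.50"
```

Total consumption fell from 1,500 kW to 1,400 kW, yet the PUE got worse. In practice some overhead would shrink along with the IT load, so the effect is usually smaller than in this simplified case.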
The data center we’ll use in this example is a traditional “MIS center” that supports the computing and networking needs of a mid-sized, brick-and-mortar enterprise, perhaps an insurance company, manufacturer or biotech operation. Its 7,500-square-foot floor plan was carved out of an underused floor of the company’s headquarters back in the late 1980s, a time when equipment was expensive and energy was cheap. The room still bears the white walls, white floors and bright lighting that were the standard decor of the era. The mainframes and mini-computers it once housed have given way to a patchwork of servers and disk farms that have been added over the years as demand dictated and space permitted.
Our model data center’s power consumption has steadily grown with the addition of new equipment and now stands at around 1.5MW. Although the average PUE for all data centers is somewhere around 2.7, most of the “mature” facilities we’re concerned with have PUEs that range from 3.0 to 5.0, so it’s reasonable to assume that ours runs in the neighborhood of 4.0.
Taking inventory
To measure progress, you need a base line, so the first step is to take a rough energy inventory of all your major equipment. This “clipboard approach” involves using a clamp-on ammeter to measure the current on each unit’s power inputs to quickly establish the facility’s base-line consumption. In a typical 1-5MW enterprise data center, as few as 10 to 20 measurements will account for 80 to 90 percent of your energy budget, although not at any great degree of granularity. While this rough inventory will probably be somewhat inaccurate (+/- 20 percent), it will still provide a very good starting point that identifies the major power loads and the biggest opportunities for energy savings.
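A clamp-on ammeter reports current, not power, so each reading has to be converted before it goes on the clipboard. The sketch below uses the standard relationships for real power (P = V × I × PF for single-phase, P = √3 × V × I × PF for balanced three-phase). The function name, the default power factor of 0.9 and the sample reading are assumptions for illustration; use the voltage and power factor that apply to your own equipment.

```python
import math


def load_kw(amps: float, volts: float, power_factor: float = 0.9,
            three_phase: bool = True) -> float:
    """Estimate real power in kW from a clamp-on current reading.

    For a balanced three-phase feed, `volts` is the line-to-line voltage
    and `amps` is the current measured on one phase conductor.
    The default power factor of 0.9 is an assumed placeholder.
    """
    phase_factor = math.sqrt(3) if three_phase else 1.0
    return phase_factor * volts * amps * power_factor / 1000.0


# Hypothetical reading: 120 A per phase on a 480 V three-phase chiller feed.
chiller_kw = load_kw(amps=120, volts=480)
print(f"Chiller: roughly {chiller_kw:.0f} kW")
```

Given the +/- 20 percent accuracy expected from this kind of inventory, an assumed power factor is usually good enough for a first pass.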
Once you’ve recorded the power consumption of all your major subsystems, create a matrix which groups the subsystems into three functional categories:
1. IT (servers and storage)
2. Other (cooling and power distribution loss)
3. Lighting
The base-line audit of our “typical” data center reveals that cooling accounts for 50 percent of our power bill, with another 14 percent eaten by power distribution losses and 2 percent used for lighting. Our servers use a bit over 23 percent of the power, with the storage arrays accounting for another 10 percent. This means that only 500kW, or 33 percent of the 1.5MW the facility consumes, is actually used for data processing.
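The grouping step can be sketched in a few lines of code. The shares below are the ones quoted in the audit above (with the server share rounded down to 23 percent, so the totals come to 99 percent); the data layout and function name are assumptions for illustration.

```python
from collections import defaultdict

TOTAL_KW = 1500.0  # total facility draw of the model data center

# (subsystem, category, share of total power) from the base-line audit.
# The server share is rounded down from "a bit over 23 percent".
AUDIT = [
    ("servers", "IT", 0.23),
    ("storage arrays", "IT", 0.10),
    ("cooling", "Other", 0.50),
    ("power distribution loss", "Other", 0.14),
    ("lighting", "Lighting", 0.02),
]


def category_matrix(audit, total_kw):
    """Sum subsystem loads into {category: (kW, share of total)}."""
    kw_by_category = defaultdict(float)
    for _name, category, share in audit:
        kw_by_category[category] += share * total_kw
    return {cat: (kw, kw / total_kw) for cat, kw in kw_by_category.items()}


matrix = category_matrix(AUDIT, TOTAL_KW)
for category, (kw, share) in matrix.items():
    print(f"{category:<9} {kw:7.0f} kW  {share:5.0%}")
```

With these inputs the IT category comes to about 495 kW, which matches the roughly 500kW figure above once the rounding of the server share is taken into account.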
This is very typical of what you’d expect to see in a mature facility. Whatever your particular results are, they will give you a good idea of where the bulk of the power is going, and will help you to identify the most rewarding energy conservation strategies to pursue at this time.
Use the base-line measurements you took to create a matrix that normalizes the potential energy savings of each strategy with respect to the data center’s overall power consumption. For example, an improvement that cuts your cooling system’s energy consumption by 20 percent should be multiplied by 50 percent (the fraction of the overall power budget that the cooling system consumes) to give you a normalized savings of 10 percent. Using this matrix makes it easy to identify the biggest energy-saving opportunities for your particular facility.
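As a rough sketch, the normalization is a single multiplication per strategy, followed by a sort. Only the cooling row below comes from the example above; the lighting and storage rows are hypothetical entries added to show how the ranking works.

```python
def normalized_savings(subsystem_savings: float, subsystem_share: float) -> float:
    """Savings expressed as a fraction of the whole facility's power budget."""
    return subsystem_savings * subsystem_share


# (strategy, fraction saved within the subsystem, subsystem share of total power)
# Only the cooling row is from the article; the other rows are hypothetical.
strategies = [
    ("cooling improvement", 0.20, 0.50),
    ("lighting retrofit", 0.50, 0.02),
    ("storage spin-down", 0.15, 0.10),
]

ranked = sorted(
    ((name, normalized_savings(saved, share)) for name, saved, share in strategies),
    key=lambda row: row[1],
    reverse=True,
)
for name, overall in ranked:
    print(f"{name:<20} {overall:.1%} of total facility power")
```

Note how a 50 percent cut in lighting power ranks below a 15 percent cut in storage power, simply because lighting is such a small slice of the total.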
Selecting first tier of upgrades
The matrix you’ve built can now be used to select the first tier of upgrades: the ones that require little capital, cause minimal disruption, deliver high returns and often pay for themselves in three months or less. In most non-state-of-the-art operations, the most productive first-tier techniques usually include the following four:
1. Creating hot and cold aisles by rearranging your equipment racks with the cool front panels facing each other in one aisle and the hot vented ends facing each other in the next
2. Raising the temperature of the water in the chillers (typically from 55 to 58 degrees Fahrenheit). This must be done incrementally to ensure your equipment can tolerate the change without problems
3. Raising your data center air temperature (typically by three to five degrees). Again, this should be done incrementally, a degree at a time
4. Identifying and turning off unused equipment
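To check whether a candidate upgrade really does pay for itself within a few months, a simple payback estimate is usually enough. Every number in the sketch below (the cost, the power saved and the electricity price) is a hypothetical placeholder, as is the function name; substitute your own measurements and utility rate.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8,760 hours / 12)


def payback_months(capital_cost: float, kw_saved: float,
                   price_per_kwh: float) -> float:
    """Simple payback period: up-front cost divided by monthly energy savings."""
    monthly_savings = kw_saved * HOURS_PER_MONTH * price_per_kwh
    if monthly_savings <= 0:
        raise ValueError("The upgrade must save energy to have a payback period")
    return capital_cost / monthly_savings


# Hypothetical example: $15,000 of labor to create hot and cold aisles,
# saving a continuous 75 kW of cooling load at $0.10 per kWh.
months = payback_months(capital_cost=15_000, kw_saved=75, price_per_kwh=0.10)
print(f"Estimated payback: {months:.1f} months")
```

This ignores financing costs, demand charges and seasonal variation, so treat the result as a screening figure rather than a budget number.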
Upgrade, analyze, repeat
Every time you make one of these changes, break out your ammeter and document the savings it produces. The updated matrix will also show how the data center’s energy consumption patterns and composition have shifted as a result of the upgrades. This information will help you select the next strategy to implement: the one that will give you the most savings for your investment. The hypothetical measurements used in our test case show that implementing all these first-tier steps has dropped our data center’s PUE from around 4 to 3, resulting in about 33 percent less energy use.
The matrix technique is also a valuable tool for evaluating vendors’ claims about the energy savings their products will give you. For example, a virtualization software vendor may claim that its product provides energy savings of 80 percent, but if it applies to only one third of your servers (which consume one third of your power), the actual savings would be about nine percent (0.80 × 1/3 × 1/3).
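The same arithmetic can be wrapped in a small helper for checking any vendor claim against your own matrix. This is only a sketch; the function and parameter names are assumptions, and the inputs are the figures from the virtualization example above.

```python
def effective_savings(claimed_savings: float, coverage: float,
                      share_of_total: float) -> float:
    """Facility-wide savings implied by a vendor's claim.

    claimed_savings: fraction the vendor says the product saves (0.80 = 80%)
    coverage:        fraction of the subsystem the product actually applies to
    share_of_total:  that subsystem's share of total facility power
    """
    return claimed_savings * coverage * share_of_total


# Virtualization example: an 80 percent claim that applies to one third of
# the servers, which in turn consume one third of the facility's power.
actual = effective_savings(0.80, 1 / 3, 1 / 3)
print(f"Facility-wide savings: {actual:.1%}")
# prints "Facility-wide savings: 8.9%"
```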
Medium effort initiatives
This second set of improvements to your facility requires more effort, more disruption and some modest capital investment, but the improvements still typically pay for themselves in less than one year. They include the following four:
1. Server virtualization: Make more efficient use of fewer machines
2. Storage consolidation: Much like server virtualization, this technique uses a single storage array to support multiple customers or business functions
3. Storage optimization: Install software that allows low-priority storage units (Tier 3 and backup units) to spin down when unused
4. Storage lifecycle monitoring: Monitor your storage systems for signs of end-of-life (EOL). As they age, disks start to suffer from excessive seek operations, increased spindle drag and other degradations that increase power consumption and reduce reliability. Most experts suggest that a three-year replacement cycle for rotating media will cut your storage array’s energy bill by 10 percent.
Equipment replacement
Finally, it’s time to consider equipment replacement. Swapping several old power hogs for a single, new virtualized server or consolidated storage unit should be a part of your normal capital management cycle to spread expenditures and minimize disruptions. Just remember: while upgrading to more energy-efficient units and applying virtualization techniques will cut your overall energy consumption, it will not reduce your PUE and may even raise it.
Each time you perform a second-tier improvement, remember to re-audit your system. By this time, you should start to see some significant shifts, both in how much energy your data center is using and in where it’s being used. These new patterns will help you identify the next most cost-effective target of opportunity.
If your facility is similar to most of the corporate data centers operating today, it’s likely that using even a significant fraction of these measures will allow you to cut your base-line energy consumption by 40 percent.
Taking stock and deciding
With the easy tasks out of the way, it’s time to take stock and decide if you need to implement any high-effort measures, the sorts of things large Internet companies such as Google spend millions on R&D to design in from the start. The running energy benchmarks you’ve kept during the first two phases will be a big help in figuring out which, if any, of the more costly upgrades (such as containerization, flywheel uninterruptible power supplies or high-voltage power distribution) are worth considering.
Cutting your data center’s energy consumption by 35 to 40 percent is practical and cost-effective, and it will make you a hero in the eyes of your management. Just remember the following four points:
1. Establishing a base-line breakdown of your data center’s energy consumption and repeated benchmarking of any improvements you make are critical to achieving best results.
2. PUE is a very good tool for understanding your data center’s energy profile, but it can be misleading unless you understand what it actually measures.
3. Don’t believe vendors’ marketing hype; analyze their claimed benefits within the context of your own data center’s needs and “personality.”
4. High-effort strategies are very effective, but they’re usually most useful during scheduled forklift retrofits or new construction. But even without these, most data centers can achieve remarkable energy savings.
Joe Polastre is co-founder and Chief Technology Officer at Sentilla. Joe is responsible for defining and implementing the company’s global technology and product strategy. Winner of the 2009 Silicon Valley/San Jose Business Journal 40 Under 40 award and named one of BusinessWeek’s Best Young Tech Entrepreneurs, Joe often speaks about energy management and the role of physical computing, where information from the physical world is used to make energy efficiency decisions. Before joining Sentilla, Joe held software development and product manager positions with IBM, Microsoft and Intel. Joe is active in numerous organizations including The Green Grid, US Green Building Council, ACM and IEEE. Joe holds M.S. and Ph.D. degrees in Computer Science from University of California, Berkeley, and a B.S. in Computer Science from Cornell University. He can be reached at joe@sentilla.com.