Data center commissioning is an insurance policy that helps to ensure the high reliability of a data center. A proper commissioning exercise reviews and tests the data center’s physical infrastructure design to make sure it can support the projected IT load.
In this article, we’ll highlight the ten most common errors that occur when organizations attempt to commission their data centers.
Error No. 1: Failure to engage the commissioning agent prior to data center construction
The commissioning agent needs to be engaged early in the process, weeks or months before the data center is constructed. Early involvement of the commissioning agent allows for proper planning, helps in the coordination of vendor startups, and lays out a comprehensive framework for testing.
Error No. 2: Failure to align with current technology
Even an independent commissioning agent can incorporate outmoded testing procedures. Testing procedures need to take into account the age of the equipment being commissioned, yet outmoded procedures are still regularly employed.
When commissioning a Delta Conversion Online uninterruptible power supply (UPS), for example, the commissioning agent may employ testing procedures that were originally developed for a Double Conversion Online UPS topology. This confuses the testing and command center teams, since certain procedures will not make sense. The outdated procedures may also fail to test critical functionality that is specific to the internal design of the UPS topology.
Error No. 3: Failure to identify clear roles
All team members should have clearly defined roles in the commissioning process. Team members from both the commissioning agent’s side and the owner’s side can form the command center group. The principal responsibilities of the command center team include process safety, communications, documentation and emergency response.
IT and facilities personnel are most often charged with performing the actual data center equipment commissioning tests, often working in conjunction with equipment vendor representatives. These groups must focus on safety and on executing procedures in the proper sequence.
Error No. 4: Failure to validate the commissioning script
The commissioning script is the road map that leads the commissioning team through the process. The script consists of line-by-line procedures that are communicated by the command center and executed by the testing team. All members of the team work from the same script. The commissioning agent is the author of the script and assembles it based upon weeks of interaction with various equipment vendors and with IT and facilities staff.
During actual commissioning, the commissioning agent acts as a conductor of an orchestra. The team members are all like musicians with special skills. The script can be compared to the sheets of music that all the musicians follow, line by line. The script needs to be rehearsed by all the players involved. Any deviation from the script places overall system performance at risk.
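One way to picture script validation is as a set of mechanical checks run before rehearsal is declared complete. The sketch below is illustrative only: the `ScriptStep` fields, the role names and the `validate_script` helper are assumptions made for this example, not part of any commissioning standard or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScriptStep:
    number: int              # line number in the script (hypothetical field)
    executor: str            # role that carries out the step, e.g. "facilities"
    action: str              # instruction read out by the command center
    rehearsed: bool = False  # set once the team has walked through the step


def validate_script(steps, known_roles):
    """Return a list of problems found; an empty list means these checks passed."""
    problems = []
    for expected, step in enumerate(steps, start=1):
        if step.number != expected:
            problems.append(f"step {step.number}: expected line number {expected}")
        if step.executor not in known_roles:
            problems.append(f"step {step.number}: unknown role '{step.executor}'")
        if not step.action.strip():
            problems.append(f"step {step.number}: empty action")
        if not step.rehearsed:
            problems.append(f"step {step.number}: not rehearsed")
    return problems


if __name__ == "__main__":
    script = [
        ScriptStep(1, "facilities", "Open utility breaker", rehearsed=True),
        ScriptStep(3, "vendor", "Confirm UPS is on battery"),
    ]
    for problem in validate_script(script, known_roles={"facilities", "it"}):
        print(problem)
```

Checks like these catch only gaps in numbering, unassigned roles and unrehearsed lines. They do not replace the judgment of the commissioning agent about whether each step is correct for the equipment under test.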
Error No. 5: Failure to survive project budget cuts
Periodic reviews of the design/build project often result in outside groups unfamiliar with the project process making recommendations on how to cut costs. Commissioning is often perceived as an easy target for cuts, particularly if the original construction schedule did not include time for commissioning tests.
Sometimes, suggestions will be made to curtail the commissioning agent’s contract and to compress the testing schedule or scope. Acquiescing to cuts in the commissioning budget opens the door to increased human error and downtime once the new data center is in operation. If commissioning is targeted, the long-term negative consequences for overall performance will far outweigh the short-term savings.
Error No. 6: Failure to simulate real-world heat loads
The heat generated by higher density servers now has a major impact on physical infrastructure components which, in turn, support the uptime of the servers. Often, the UPS is not tested as part of an integrated system. Thus, only a partial evaluation can be made as to how the data center will function once it is up and running.
Fortunately, tools exist that accurately simulate the heat generated by real, rack-based server loads. This artificial load consists of resistive heating units installed within the racks. These units mimic each rack’s particular design load. With this artificial load installed and operational, the commissioning agent can test UPS capacity, emergency power, cooling capacity and facility management (along with a host of other subsystems) in an integrated fashion.
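As a rough planning aid, the artificial load can be totaled rack by rack and compared against the rated UPS and cooling capacities before the integrated test begins. The following is a minimal sketch under stated assumptions: the function name, the rack labels and the kilowatt figures are hypothetical, and the heat output of a resistive load is taken to equal its electrical draw.

```python
def integrated_load_check(rack_design_kw, ups_capacity_kw, cooling_capacity_kw):
    """Compare the summed per-rack artificial load against UPS and cooling ratings.

    rack_design_kw maps a rack label to its design load in kW. The heat
    produced by the resistive units is assumed equal to their electrical draw.
    """
    total_kw = sum(rack_design_kw.values())
    return {
        "total_load_kw": total_kw,
        "ups_headroom_kw": ups_capacity_kw - total_kw,
        "cooling_headroom_kw": cooling_capacity_kw - total_kw,
        "within_capacity": total_kw <= ups_capacity_kw
        and total_kw <= cooling_capacity_kw,
    }


if __name__ == "__main__":
    # Hypothetical rack design loads, in kW.
    racks = {"A1": 8.0, "A2": 12.0, "B1": 5.0}
    print(integrated_load_check(racks, ups_capacity_kw=30.0, cooling_capacity_kw=22.0))
```

A negative headroom value flags a subsystem that would be overloaded at full design load. The check says nothing about airflow or hot spots, which only the physical test can reveal.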
Error No. 7: Failure to identify weak links in the system
Potential pitfalls exist which must be flushed out during the commissioning process. These weak links can exist in several layers of the physical infrastructure.
The UPS integrated commissioning test, for example, will place critical stresses on the UPS batteries. Each test reduces the amount of battery charge available for future tests. After several tests that require the UPS to switch to battery, the overall amount of available battery runtime is severely reduced.
Each segment of the integrated commissioning test needs to take available battery runtime into account. A best practice is to allow for sufficient battery recharge after a major power drain test.
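The effect of successive battery tests can be sketched as a simple runtime budget. The example below is illustrative only: the function name, the test names and all numbers are hypothetical, and recharge is modeled as a constant number of runtime minutes restored per hour, which is a simplification of real battery behavior.

```python
def plan_battery_tests(full_runtime_min, reserve_min, recharge_min_per_hour, tests):
    """Walk a test sequence and track the remaining battery runtime.

    tests is a list of (name, drain_min, recharge_hours_after) tuples.
    Recharge is assumed linear and capped at the full runtime.
    """
    available = full_runtime_min
    plan = []
    for name, drain_min, recharge_hours in tests:
        after = max(available - drain_min, 0.0)
        plan.append({
            "test": name,
            "runtime_before_min": available,
            "runtime_after_min": after,
            # False when the test would eat into the required reserve.
            "keeps_reserve": available - drain_min >= reserve_min,
        })
        available = min(full_runtime_min, after + recharge_hours * recharge_min_per_hour)
    return plan


if __name__ == "__main__":
    sequence = [
        ("utility loss", 10, 4),            # 4 hours of recharge scheduled afterwards
        ("generator fail-to-start", 8, 0),
    ]
    for row in plan_battery_tests(20, 5, 2, sequence):
        print(row)
```

Setting the recharge hours after the first test to zero in this sequence leaves the second test without its reserve, which is the situation the recharge best practice is meant to prevent.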
Error No. 8: Failure to publish emergency operational procedures
The commissioning team members may not necessarily be the same individuals who will be responsible for operating the equipment in the new data center.
Clearly viewable and accessible emergency operational procedures should be affixed to each piece of physical infrastructure equipment. This practice should also apply to key non-data center support rooms and to each Emergency Power Off (EPO) station. Examples of key non-data center rooms include the generator room, the UPS room (if separate from the data center), and the chiller and pump room. It is also a best practice to have a laminated set of “as built” drawings on the walls of each room to illustrate to all interested parties how the data center was originally configured.
Error No. 9: Failure to consider the impact of human fatigue on test results
Commissioning can be completed in a day or over a three-day weekend, or it can take several weeks. Integrated commissioning is one of the most demanding steps in the data center design/build process. The employees involved work long hours and are under constant high levels of stress. Many of them are sleep-deprived and perform the commissioning on weekends after having worked several weeks of extended hours. This scenario creates conditions that can lead to catastrophic human errors. The leadership of the commissioning team needs to take the fatigue level of the staff into account.
Error No. 10: Failure to update commissioning documentation
Once the new data center is commissioned, the operations staff is likely to change over time. If the commissioning information is kept up to date, then the data center knowledge base remains with the company and not with the individuals. The commissioning documentation can serve as the principal source for training new employees. In addition, it can serve as a baseline to determine when management should consider upgrading or moving the data center.
Dennis Bouley is a Strategic Research Analyst at APC. In this role, Dennis is responsible for the generation of white papers, case studies and decision support tools. Prior to this role, Dennis held various positions within APC including senior writer, senior business analyst, and marketing manager, APC France. Prior to joining APC in 1998, Dennis was employed at IBM as a client representative for over 10 years. Dennis holds two Bachelor’s degrees, one in Journalism and one in French from the University of Rhode Island. He also holds a Certificat Annuel from the Sorbonne, Paris, France. He is currently a member of The Green Grid Technical Committee. He can be reached at [email protected].