Proper quality assurance (QA) of a data center is not a trivial matter. We need to ensure that our data centers are fit for the purposes to which we and our users put them. The first step in assuring quality in your organization’s data center is to make sure you understand what “fit for the purpose” means for your data center. You can start at a high level, identifying the extent to which functionality, interoperability, usability, performance, security and other attributes are important.
Then, for each of those attributes, you need to understand the specific risks that exist. Are you worried about leakage of personally identifiable information (PII) to unauthorized parties? That’s a specific security risk. Maybe you’re also worried about slow system response to user input? That’s a specific performance risk. We find that it’s fairly typical for a real-world system to have 100 to 200 specific quality-related risks.
Once you understand what risks exist for the quality of your systems, you’ll need to figure out how to manage those risks appropriately. You can and should try to introduce proactive measures into your application development and acquisition processes that reduce risk upfront, but you’ll find that you do need to organize tests that address these risks. We refer to this as covering the quality risks, analogous to the way insurance can cover risks such as fires, collisions and medical conditions.
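One way to picture what covering the quality risks means is a simple traceability check: list the risks, list the tests, and see which risks no test addresses. The sketch below is only an illustration; the names `QualityRisk`, `PlannedTest` and `uncovered_risks`, and the risk and test identifiers, are hypothetical and not taken from any particular tool.

```python
from dataclasses import dataclass, field


@dataclass
class QualityRisk:
    risk_id: str      # hypothetical identifier, e.g. "R1"
    attribute: str    # quality attribute, e.g. "security" or "performance"
    description: str


@dataclass
class PlannedTest:
    test_id: str
    covers: set = field(default_factory=set)  # risk_ids this test addresses


def uncovered_risks(risks, tests):
    """Return the ids of risks that no planned test covers, in input order."""
    covered = set()
    for test in tests:
        covered |= test.covers
    return [risk.risk_id for risk in risks if risk.risk_id not in covered]


# Illustrative data only: two risks drawn from the examples in the text.
risks = [
    QualityRisk("R1", "security", "PII leaks to unauthorized parties"),
    QualityRisk("R2", "performance", "Slow system response to user input"),
]
tests = [PlannedTest("T1", {"R1"})]

gaps = uncovered_risks(risks, tests)
print(gaps)
```

With 100 to 200 risks in play, a check like this makes gaps visible early, before test execution starts.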
One of the trickiest parts of covering your quality risks with tests is the selection or creation of proper test data. In cases where the application doesn’t exist yet, you’ll need to figure out a way to produce realistically sized, production-like data sets. Tools are available for this, but we often find it’s better to create our own, since test data tends to be very application-specific.
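As a rough idea of what a home-grown data generator can look like, here is a minimal sketch that produces repeatable, production-sized record sets. The record layout (`customer_id`, `name`, `email`, `balance_cents`) and the function name `make_customers` are assumptions made for illustration; a real generator would mirror your application’s own schema and value distributions.

```python
import random
import string


def make_customers(count, seed=0):
    """Generate `count` synthetic customer records, repeatable for a given seed."""
    rng = random.Random(seed)  # private generator so results are reproducible
    rows = []
    for i in range(count):
        handle = "".join(rng.choices(string.ascii_lowercase, k=8))
        rows.append({
            "customer_id": i + 1,                        # hypothetical field
            "name": handle.title(),                      # hypothetical field
            "email": f"{handle}@example.test",           # reserved test domain
            "balance_cents": rng.randint(0, 1_000_000),  # hypothetical field
        })
    return rows


customers = make_customers(1000, seed=42)
print(len(customers), customers[0]["email"])
```

Seeding the generator means a failing test can be rerun against exactly the same data.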
In cases where the application already exists, you might have access to production data in your data centers. The challenge here is that production data is often sensitive, full of confidential or private information. In these cases, you need to anonymize the data before using it for testing. This is another job that often requires tools.
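One common approach, sketched here under assumptions, is to replace sensitive fields with keyed hashes so that the same input always maps to the same token and relationships between records survive. Strictly speaking this is pseudonymization rather than full anonymization, and whether it is sufficient depends on your data and your regulatory obligations. The field names, the key and the helper names `pseudonymize` and `anonymize_record` are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(value, secret_key):
    """Map a value to a stable 16-hex-character token using HMAC-SHA256."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def anonymize_record(record, sensitive_fields, secret_key):
    """Return a copy of `record` with the sensitive fields replaced by tokens."""
    return {
        key: pseudonymize(str(value), secret_key) if key in sensitive_fields else value
        for key, value in record.items()
    }


# Illustrative record and key only; keep a real key out of the test environment.
key = b"example-key-not-for-production"
original = {"name": "Jane Doe", "email": "jane@example.test", "plan": "gold"}
masked = anonymize_record(original, {"name", "email"}, key)
print(masked)
```

Because the mapping is deterministic for a given key, joins across tables still work after masking, which keeps the data production-like.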
You’ll also need a test environment, including hardware, co-resident software, networks and so forth. In some cases, you can use smaller replicas of the production environment. However, for reliability and performance testing, such scaled-down models tend to produce misleading results. If a full-scale replica of the production environment is not possible, you might be able to use an outside testing lab. Some labs can replicate common environments such as those for e-commerce sites and other browser-based applications.
Building and Executing Test Cases
With the environment set up, you need to build and then execute the test cases. While you can involve users in these tasks, building and running tests are skilled activities. You should consider using professional testers, either as contractors or as full-time employees. Be sure to plan ahead, as it usually takes a couple of months to get a good test team and test process set up.
You should also plan ahead for the bugs you are going to find. It is very rare to run a set of tests and not find bugs. With systems you’ve built in-house, you need to be ready to have the programmers fix the bugs. With acquired software, check your contract (ideally before signing it!) to see what your options are.
Finally, make sure you have a process in place for managing these risks on an ongoing basis. The ever-accelerating march of computer technology means that data centers change faster and faster. For example, think about the security risks created by USB drives and syncable mobile phones over the last decade. As another example, consider the occasional issues created by automatically downloaded software updates.
If you want to avoid getting back into a situation where your data center is suffering unexpected problems during routine use, you’ll need to integrate quality risk management into your overall application development and acquisition life cycles.
If this all seems like a lot of work, you’re right. Proper QA of a data center is certainly not trivial. However, consider the costs associated with incidents such as data center crashes, incorrect operations, security breaches and data corruption. In our experience, every dollar spent on testing saves anywhere from $2 to $32 in avoided failure costs, and that doesn’t account for less-quantifiable losses such as opportunity costs and reputational damage. Just ask Amazon, Google and RIM about that!
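To put the $2-to-$32 range in concrete terms, the arithmetic can be sketched as below. The $100,000 testing budget is a made-up figure for illustration, and the function name `avoided_failure_costs` is hypothetical; only the two ratios come from the text.

```python
def avoided_failure_costs(testing_spend, low_ratio=2.0, high_ratio=32.0):
    """Return the (low, high) range of avoided failure costs for a testing spend."""
    return testing_spend * low_ratio, testing_spend * high_ratio


# Hypothetical budget of $100,000.
low, high = avoided_failure_costs(100_000)
print(f"Avoided failure costs: ${low:,.0f} to ${high:,.0f}")
```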
Rex Black is President of RBCS. Rex is also the immediate past president of the International Software Testing Qualifications Board and the American Software Testing Qualifications Board. Rex has published six books, which have sold over 50,000 copies, including Japanese, Chinese, Indian, Hebrew and Russian editions. Rex has written over thirty articles, presented hundreds of papers, workshops and seminars, and given over fifty speeches at conferences and events around the world. Rex may be reached at rex_black@rbcs-us.com.