The tremendous amounts of manual time and energy required to configure, run and monitor automated software tests can come as a startling revelation to companies that have invested substantial engineering efforts into automated test frameworks. These frameworks are specifically designed to reduce the human cost of continually running large regression suites, but the process is not always truly automatic. What is at the root of this disconnect?
When smart managers start asking the right questions, they can identify the problem. Even stipulating that the automated test framework is a good thing that will always have plenty of work to churn through, you can shed a great deal of light by challenging the status quo along three key axes.
First ask: why exactly does your automated test framework take so long to run and what can you do to make it faster? Second, why is it so resource-intensive and what can you do to make it more efficient? Third, and most importantly, why are people still managing it and how can you make it truly automatic?
We have had the opportunity recently to interview some of the world’s largest development organizations as they answered those questions. We were astonished at how similar the problems were across wildly different shops. Test suites take time to configure and launch, there is a perpetual arm-wrestling match with IT for resources, and staff is still spending hours going through log files, painstakingly trying to tease out the real defects from the spurious failures and bad tests.
Effective automation should obviously be about reducing manual effort. Yet the most successful test systems are backed by large infrastructure teams who are tasked solely with the care and feeding of the “automatic” system. How did this happen?
Different Companies, Same Problem
Again, there is a striking similarity in how the problem evolves across companies. Early in the life cycle, developers write an application-specific test harness to validate the nascent product. Indeed, modern best practices (especially the test-driven development, or TDD, practice in Agile software development) place such a premium on being able to continually and automatically exercise components as they are written that, in some cases, the test harness matures faster than the product.
The situation is complicated when release engineering promotes the harness (originally created only to assist the developer’s edit/compile/debug cycle) to a key piece of infrastructure in the production integration and release process.
Now, suddenly, small details that could be taken for granted in the developer’s workflow (“of course your database is at localhost:3306” and “of course gcc 3.3.1 is installed at /usr/tools/bin”) require constant attention. There is no incentive for the developers to address the issues that are only faced in production. Conversely, there is no facility for the release engineering team to expose their process to developers.
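One practical first step is to drag those hidden assumptions out of the harness and into explicit, overridable configuration. The sketch below is purely illustrative: the environment variable names and defaults are hypothetical, not taken from any particular harness, but they show how a developer’s workstation defaults can coexist with a release engineering environment that sets its own values.

```python
import os

# Hypothetical settings: each value can be overridden by release engineering,
# while the defaults preserve the developer's familiar workstation setup.
DB_HOST = os.environ.get("TEST_DB_HOST", "localhost")
DB_PORT = int(os.environ.get("TEST_DB_PORT", "3306"))
COMPILER = os.environ.get("TEST_CC", "/usr/tools/bin/gcc")

def describe_environment():
    """Report the environment the harness will actually use, so a production
    run is never silently assuming the developer's machine."""
    print(f"database: {DB_HOST}:{DB_PORT}")
    print(f"compiler: {COMPILER}")

if __name__ == "__main__":
    describe_environment()
```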
Three challenges crystallize as the answers to the questions above. First, it’s hard to get the environment, including both harness and product, set up on an appropriate box. Second, it’s hard to invoke the test. Third, it’s hard to mine the results for concrete next actions.
This triumvirate (resource configuration, test execution and results monitoring) is the “last mile” of automation. When it is neglected, the entire automated testing effort pays a steep penalty. Fortunately, for each of these systemic problems, there are simple (though not necessarily easy) solutions we can apply to great effect.
The resource selection problem is simply this: “Where do I run the test?” At most shops, the answer quickly devolves into a sysadmin puzzle: “How can I easily log into 20 systems in one click? Where do I have SSH (Secure Shell) keys configured? How will I keep my tests from stomping on my neighbor’s? And how do I fire up virtual machines from an ESX server for my config?” and so on.
The correct answer: Stop asking these questions! Get your test harness to do it for you. There are three critical pieces you need to make that a reality, and one big trap to avoid.
Making Your Test Harness Work For You
First, tests need to support distributed, remote execution. Second, the target host and the build need to be logically divorced and merged together only at test invocation. Third, as much as possible, standardize on one host environment for high-volume, frequent “smoke test” runs.
The pitfall: If the test process is hard-coded to a host (“Version 3.4 can only be tested on Windows 2003 hosts with VS 2008 + our add-in”), inefficiency and complexity start to creep in. Standardizing on a host configuration for all active branches can open extraordinary optimization opportunities. Need to run a new system test? Grab the next available node from the large homogeneous grid, load the product and go.
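To make that concrete, here is a minimal sketch of invocation-time resource selection. The node pool, build identifier and deployment steps are hypothetical placeholders; the point is only that the harness, not the test author, decides where the test runs and pairs the host with a build at the last moment.

```python
from dataclasses import dataclass

@dataclass
class Node:
    hostname: str
    busy: bool = False

# Hypothetical pool of identically configured hosts in the grid.
POOL = [Node("grid-01"), Node("grid-02"), Node("grid-03")]

def acquire_node(pool):
    """Hand back the next idle node so host selection stays out of the tests."""
    for node in pool:
        if not node.busy:
            node.busy = True
            return node
    raise RuntimeError("no idle nodes; queue the run or grow the grid")

def run_system_test(build_id, pool=POOL):
    """Merge a build with a host only at invocation time (placeholder steps)."""
    node = acquire_node(pool)
    try:
        print(f"deploying build {build_id} to {node.hostname}")
        print(f"running system tests on {node.hostname}")
    finally:
        node.busy = False  # release the node for the next run

if __name__ == "__main__":
    run_system_test("3.4.1287")
```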
The concept of carefully subsetting the test suite to run a critical selection of smoke tests first is not new, but it is essential to efficient automatic testing. The natural objection from developers is that the project-wide smoke test may not cover the code they most recently modified.
Enter the second solution pattern: Test harnesses need to be highly parameterized, but their invocation needs to be easily aliased. This is a common pattern you already use all the time: type “alias” at a grizzled Unix command-line hacker’s prompt and you’ll see shortcut entries encapsulating commonly used options to programs such as “ls”; on Windows, the Start Menu is nothing but a tree of shortcuts to programs and documents.
Effective automated test systems pull the same stunt: a million possible options to pick the smoke tests, the server tests, the user interface tests and so forth, all easily abstracted behind a single user-specific click on the harness interface. Corollary gotcha: Do not let your harness sources live with your product sources, because this encourages an artificial integration that makes building such an interface much more difficult.
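As an illustration of the aliasing idea, the wrapper below maps short, memorable names onto full sets of harness options, much as a shell alias expands into a long command line. The preset names and options are hypothetical stand-ins for whatever your own harness accepts.

```python
import argparse

# Hypothetical presets: each short name expands to a complete set of
# harness options, just as a shell alias expands to a long command.
PRESETS = {
    "smoke": ["--suite", "smoke", "--timeout", "300"],
    "server": ["--suite", "server", "--parallel", "8"],
    "ui": ["--suite", "ui", "--browser", "headless"],
}

def main():
    parser = argparse.ArgumentParser(description="aliased harness launcher")
    parser.add_argument("preset", choices=sorted(PRESETS))
    args = parser.parse_args()
    # In a real system this would launch the harness; here we just show
    # the fully expanded command the alias stands for.
    print("would invoke harness with:", " ".join(PRESETS[args.preset]))

if __name__ == "__main__":
    main()
```

The same presets can just as easily sit behind buttons on the harness interface, so the single user-specific click described above is simply another alias.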
Intelligent Error Collection and Reporting
Finally, the most widely cited culprit for consuming valuable person-hours is dealing with the large volumes of data produced by test systems. Automated harnesses are usually great at posting gigabytes of HTML tables with green and red icons to Web pages and inboxes, but they often fall far short of making that data actionable.
Actionable reporting means that for every data point presented to the user, you ask yourself the question: “If I show them this, what will they want to see next?”
Too often, the e-mail message says, “Test run #231 failed” but leaves it to the user to identify the offending test. Too often, the Web page says, “Test 144….FAILED” without any indication of how long it has been failing. Has it been failing since last March, or has it been green since 2006 and only turned red this morning with the new employee’s first check-in? Designing the test harness to make error collection (and, just as important, error reporting) as intelligent as possible can have a dramatic impact on ease of use and productivity.
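A small sketch of the difference, with hypothetical data structures and field names: instead of reporting a bare FAILED, the report walks the test’s history and says how long it has been failing and which change the streak started with.

```python
from dataclasses import dataclass

@dataclass
class HistoryEntry:
    run_id: int
    changelist: str
    passed: bool

def first_failing_run(history):
    """Given chronological history (oldest first), return the entry where the
    current failing streak began, or None if the latest run passed."""
    if not history or history[-1].passed:
        return None
    streak_start = history[-1]
    for entry in reversed(history[:-1]):
        if entry.passed:
            break
        streak_start = entry
    return streak_start

def report(test_name, history):
    """Turn a bare FAILED into something actionable."""
    origin = first_failing_run(history)
    if origin is None:
        return f"{test_name}: PASSED"
    return (f"{test_name}: FAILED since run #{origin.run_id} "
            f"(first seen with change {origin.changelist})")

if __name__ == "__main__":
    history = [
        HistoryEntry(229, "change-1001", True),
        HistoryEntry(230, "change-1005", False),
        HistoryEntry(231, "change-1009", False),
    ]
    print(report("Test 144", history))
```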
So, while the natural evolution of a test harness will lead it to become tedious to configure, slow to run and cryptic to interpret, there are straightforward strategies to remedy all three problems. And as shops that took the time to refactor the critical parts of resource selection, harness invocation and results reporting have shown, an easy, fast, efficient test harness may still not top your list of favorite leisure activities. However, it’s a great deal more pleasant than filling in Form 1040.
Usman Muzaffar is Vice President of Product Management at Electric Cloud. He was part of the team that founded the company. Prior to Electric Cloud, Usman worked as a software engineer at Scriptics and Interwoven (acquired by Autonomy), designing and developing content management, syndication, and distribution systems. He holds a Bachelor’s degree in Molecular Biology from Northwestern University. He can be reached at usman@electric-cloud.com.