Setting the Programmer-Error Record Straight

Research results cited in a previous column were flawed by faulty data reduction.

Norman Augustine, a board member and former chairman of Lockheed Martin, is widely known for the wit and wisdom published as "Augustine's Laws." One of his best-known maxims is that productivity, regardless of the field of effort, is concentrated in a remarkably consistent way.

Even in such diverse measures as the arrests made by police officers, the touchdowns scored by football players or the patents obtained by industrial companies, Augustine found that the top fifth of any group does about half the work, and the bottom half does about a fifth of the work. I suggest that this often makes it misleading to compare the average performance of one group with that of another. If your goal is to build a top-flight team, it's at least as important to know how bad the group's worst-case performers are likely to be.

Software development, for example, is striking in the gap between best and worst. Barry Boehm, in his landmark 1981 book "Software Engineering Economics," said it concisely: "All other factors being equal, a 90th-percentile team of analysts and programmers will be about four times as productive as a 15th-percentile team." To look at it another way, I'd say that hiring from anywhere but the top of the barrel looks like false economy: Even if it costs three times as much per man-hour to hire the best, it's likely to be a bargain.
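The arithmetic behind that bargain can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is hypothetical, the fourfold productivity figure is Boehm's 90th-versus-15th-percentile ratio, and the threefold cost premium is the what-if from the paragraph above.

```python
def relative_cost_per_unit(cost_multiplier: float, productivity_multiplier: float) -> float:
    """Cost of one unit of output relative to a baseline team (baseline = 1.0)."""
    if productivity_multiplier <= 0:
        raise ValueError("productivity_multiplier must be positive")
    return cost_multiplier / productivity_multiplier


# Assumption: the top team costs 3x per hour and produces 4x per hour.
top_team = relative_cost_per_unit(cost_multiplier=3.0, productivity_multiplier=4.0)
print(f"Top team cost per unit of output: {top_team:.2f} of baseline")
```

Under those assumptions, each unit of output from the top team costs three-quarters of what it costs from the weaker team, before counting any difference in defects.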

Software productivity is hard to measure or even define. It takes more work to use fewer lines of code; that optimization also paves the way for subtle errors. As Augustine warned, "The last 10 percent of performance generates one-third of the cost and two-thirds of the problems." But users may insist that they need that speed.

Unfortunately, I fell into both of the traps I've described, comparing aggregate measures and failing to allow for different goals, when I shared some research results here two weeks ago from the MIT Center for eBusiness. I've since learned that the results I cited were flawed by faulty data reduction; I feel obligated to report the updated numbers.

The initial report made a crucial error in treating nonresponsive answers as zero values, which, of course, is not the right thing to do. Upon recognizing this error, the research team either filled in the blanks with actual numbers or removed the affected projects from the statistics.
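A small sketch shows how much damage that kind of error can do. The figures below are invented for illustration and are not the survey's data; `None` stands in for a question left unanswered.

```python
from statistics import median

# Hypothetical defect rates for seven projects; None marks a missing answer.
# These values are invented for illustration, not taken from the survey.
reported = [10, 20, None, 30, None, 40, 50]

# The flawed approach: count every missing answer as zero defects.
as_zeros = [0 if r is None else r for r in reported]

# One of the two repairs described above: drop projects with no answer.
answered = [r for r in reported if r is not None]

print(median(as_zeros))   # 20
print(median(answered))   # 30
```

Counting blanks as zeros pulls the reported defect rate down, which is the direction of the error in the earlier numbers.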

The final results from a worldwide survey of 104 projects were published in IEEE Software for November/December 2003. During the first 12 months after implementation, defects were reported at an overall rate of 15 per 100,000 lines of code. That is, I'm sorry to note, five times the rate I reported to you after reading the earlier results released last June. (Note that the report uses median values, rather than means, to minimize the impact of isolated extreme values, a useful precaution when working with relatively small samples.)
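To see why the median is the safer choice for a small sample, consider five made-up defect rates, one of them wildly out of line. The numbers are hypothetical and chosen only to make the effect obvious.

```python
from statistics import mean, median

# Hypothetical defect rates per 100,000 lines; the last project is an outlier.
rates = [5, 10, 15, 20, 500]

print(f"mean:   {mean(rates):.1f}")    # mean:   110.0
print(f"median: {median(rates):.1f}")  # median: 15.0
```

One bad project drags the mean far above anything typical of the group, while the median stays where most of the projects are.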

Regionally, the differences are huge. Japan's sample of 27 projects yielded a median of only two defects per 100,000 lines; that is, half of the projects had more and half had fewer. The 22 projects from the "Europe and other" category had their 50-50 split at just under 23 defects per 100,000 lines; India came in slightly higher, at just over 26, in a sample of 24 projects.

Among the sample of 31 U.S. projects, the midpoint was 40 defects per 100,000 lines of code. Ouch! U.S. output of 270 lines of code per programmer-month was higher than India's figure of 209 but well below Japan's 469 or the 436 reported for the rest of the world.
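For readers who want the regional figures side by side, here is a minimal sketch that tabulates them as quoted in this column and restates the defect rates per thousand lines of code. The dictionary layout and function name are my own, the "just under 23" and "just over 26" medians are rounded, and I am assuming that the "rest of the world" productivity figure belongs to the "Europe and other" sample.

```python
# Median defects per 100,000 lines and lines of code per programmer-month,
# as quoted in the column. The 23 and 26 are rounded from "just under" and
# "just over"; mapping 436 to "Europe and other" is an assumption.
regions = {
    "Japan":            {"projects": 27, "defects_per_100k": 2,  "loc_per_month": 469},
    "Europe and other": {"projects": 22, "defects_per_100k": 23, "loc_per_month": 436},
    "India":            {"projects": 24, "defects_per_100k": 26, "loc_per_month": 209},
    "U.S.":             {"projects": 31, "defects_per_100k": 40, "loc_per_month": 270},
}


def defects_per_kloc(defects_per_100k: float) -> float:
    """Restate a rate per 100,000 lines as a rate per 1,000 lines."""
    return defects_per_100k / 100


# Sanity check: the four samples should add up to the 104 projects surveyed.
total_projects = sum(r["projects"] for r in regions.values())
print(f"Total projects: {total_projects}")

print(f"{'Region':<17} {'N':>3} {'Def/KLOC':>9} {'LOC/month':>10}")
for name, r in regions.items():
    rate = defects_per_kloc(r["defects_per_100k"])
    print(f"{name:<17} {r['projects']:>3} {rate:>9.2f} {r['loc_per_month']:>10}")
```

The table only restates the published medians in one place; it does not adjust for the differences in project type or reliability requirements among the samples.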

But the authors of the report are quick to warn—as I should have emphasized more strongly in my previous column—that "the performance differences observed between regions are likely due in part to the differing project types, underlying hardware platforms, coding styles, customer types and reliability requirements. The numbers are therefore descriptive only of the data in this sample and not as the basis for projecting the performance of future projects in each region." Duly noted.

Measuring the right things, comparing the numbers in the right way and knowing what decision you intend to make before you start collecting data are good practices. I wince to be reminded in such a public way that I don't always follow those practices myself.

Technology Editor Peter Coffee can be reached at
