Setting the Programmer-Error Record Straight

 
 
By Peter Coffee  |  Posted 2004-05-17

Research results cited in a previous column were flawed by faulty data reduction.

Norman Augustine, a board member and former chairman of Lockheed Martin, is widely known for the wit and wisdom published as "Augustine's Laws." One of his best-known maxims is that productivity, regardless of the field of effort, is concentrated in a remarkably consistent way.

Even in such diverse measures as the arrests made by police officers, the touchdowns scored by football players or the patents obtained by industrial companies, Augustine found that the top fifth of any group does about half the work, and the bottom half does about a fifth of the work. I suggest that this often makes it misleading to compare the average performance of one group with that of another. If your goal is to build a top-flight team, it's at least as important to know how bad the group's worst-case performers are likely to be.

Software development, for example, is striking in the gap between best and worst. Barry Boehm, in his landmark 1981 book "Software Engineering Economics," said it concisely: "All other factors being equal, a 90th-percentile team of analysts and programmers will be about four times as productive as a 15th-percentile team." To look at it another way, I'd say that hiring from anywhere but the top of the barrel looks like false economy: Even if it costs three times as much per man-hour to hire the best, it's likely to be a bargain.
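The arithmetic behind that bargain can be sketched in a few lines of Python. Only the two ratios (four times the productivity, three times the hourly cost) come from the text above; the function name is a hypothetical helper, and the sketch assumes that cost and output scale linearly with hours worked.

```python
def relative_cost_per_unit(cost_ratio: float, productivity_ratio: float) -> float:
    """Cost per unit of output for the pricier team, relative to the baseline team.

    Assumes (for illustration only) that cost and output both scale
    linearly with hours worked.
    """
    if productivity_ratio <= 0:
        raise ValueError("productivity_ratio must be positive")
    return cost_ratio / productivity_ratio


# Ratios taken from the column: 3x the hourly cost, 4x the productivity.
top_team = relative_cost_per_unit(cost_ratio=3.0, productivity_ratio=4.0)
print(f"Top team pays {top_team:.2f}x the baseline cost per unit of output")
```

Under those assumptions the top team delivers each unit of work for three-quarters of what the weaker team would cost, which is the sense in which paying triple can still be the cheaper choice.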

Software productivity is hard to measure or even define. It takes more work to use fewer lines of code; that optimization also paves the way for subtle errors. As Augustine warned, "The last 10 percent of performance generates one-third of the cost and two-thirds of the problems." But users may insist that they need that speed.

Unfortunately, I fell into both of the traps I've described (comparing aggregate measures and failing to allow for different goals) when I shared some research results here two weeks ago from the MIT Center for eBusiness. I've since learned the results I cited were flawed by faulty data reduction; I feel obligated to report the updated numbers.

The initial report made a crucial error in treating nonresponsive answers as zero values, which, of course, is not the right thing to do. Upon recognizing this error, the research team either filled in the blanks with actual numbers or removed the affected projects from the statistics.
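To see why counting a blank answer as zero matters, consider a small Python sketch. The defect rates below are invented for illustration and are not the survey's data; `None` stands in for a nonresponse, and dropping the blanks is only one of the two remedies the researchers used.

```python
from statistics import median

# Hypothetical defect rates (defects per 100,000 lines); None marks a nonresponse.
responses = [12.0, None, 30.0, None, 18.0, None, 25.0]

# The flawed reduction: every blank answer is counted as a reported zero.
as_zero = [0.0 if r is None else r for r in responses]

# One possible correction: leave the nonresponsive projects out entirely.
dropped = [r for r in responses if r is not None]

median_as_zero = median(as_zero)
median_dropped = median(dropped)

print(f"Blanks treated as zero: median = {median_as_zero}")
print(f"Blanks removed:         median = {median_dropped}")
```

With these made-up numbers, the zero-filled median is 12.0 while the median of the real answers is 21.5. Padding the sample with phantom zeros drags the reported defect rate down, the same direction as the error in the early results.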

The final results from a worldwide survey of 104 projects were published in IEEE Software for November/December 2003. During the first 12 months after implementation, defects were reported at an overall rate of 15 per 100,000 lines of code. That is, I'm sorry to note, five times the rate I reported to you after reading the earlier results released last June. (Note that the report uses median values, rather than means, to minimize the impact of isolated extreme values, a useful precaution when working with relatively small samples.)
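The value of that precaution is easy to demonstrate. In the Python sketch below, the project figures are hypothetical and chosen only to show the effect: a single runaway project is added to an otherwise well-behaved sample.

```python
from statistics import mean, median

# Hypothetical defect rates for five projects (defects per 100,000 lines).
rates = [10.0, 12.0, 15.0, 18.0, 20.0]

# The same sample with one isolated extreme project added.
with_outlier = rates + [400.0]

mean_before, median_before = mean(rates), median(rates)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(f"Without outlier: mean = {mean_before:.1f}, median = {median_before:.1f}")
print(f"With outlier:    mean = {mean_after:.1f}, median = {median_after:.1f}")
```

One extreme project pushes the mean from 15.0 to roughly 79.2, while the median moves only from 15.0 to 16.5. In a sample of a few dozen projects, the median is the steadier summary.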

Regionally, the differences are huge. Japan's sample of 27 projects yielded a median of only two defects per 100,000 lines; that is, half had more and half had fewer. The 22 projects from the "Europe and other" category had their 50-50 split at just under 23 defects per 100,000 lines; India came in slightly higher, at just over 26, in a sample of 24 projects.

Among the sample of 31 U.S. projects, the midpoint was 40 defects per 100,000 lines of code. Ouch! U.S. output of 270 lines of code per programmer-month was higher than India's figure of 209 but well below Japan's 469 or the 436 reported for the rest of the world.

But the authors of the report are quick to warn—as I should have emphasized more strongly in my previous column—that "the performance differences observed between regions are likely due in part to the differing project types, underlying hardware platforms, coding styles, customer types and reliability requirements. The numbers are therefore descriptive only of the data in this sample and not as the basis for projecting the performance of future projects in each region." Duly noted.

Measuring the right things, comparing the numbers in the right way and knowing what decision you intend to make before you start collecting data are good practices. I wince to be reminded in such a public way that I don't always follow those practices myself.

Technology Editor Peter Coffee can be reached at peter_coffee@ziffdavis.com.

Peter Coffee is Director of Platform Research at salesforce.com, where he serves as a liaison with the developer community to define the opportunity and clarify developers' technical requirements on the company's evolving Apex Platform. Peter previously spent 18 years with eWEEK (formerly PC Week), the national news magazine of enterprise technology practice, where he reviewed software development tools and methods and wrote regular columns on emerging technologies and professional community issues. Before he began writing full-time in 1989, Peter spent eleven years in technical and management positions at Exxon and The Aerospace Corporation, including management of the latter company's first desktop computing planning team and applied research in applications of artificial intelligence techniques. He holds an engineering degree from MIT and an MBA from Pepperdine University, and he has held teaching appointments in computer science, business analytics and information systems management at Pepperdine, UCLA, and Chapman College.
 
 
 
 
 
 
 
