How to Improve Software Quality: Lessons from Toyota's Debacle

Software testing should be part of a much larger process for software quality assurance. When complex systems are built from many subsystems, it's vital to understand, measure and manage the software quality assurance process for all systems from end to end. Here, Knowledge Center contributor Rex Black discusses software quality and testing lessons that IT professionals should learn from the debacle that recently befell Toyota.


How the mighty have fallen. Starting in the 1980s, Japanese companies became legendary for quality, and none more so than Toyota. But earlier this year, Toyota led the news due to quality problems. The situation was so severe that Toyota CEO Akio Toyoda personally appeared at a Congressional hearing, during which he said, "We know that the problem is not software because we tested it."

But is this a realistic way to think about software quality assurance? In fact, increasing indications (including reliable information from confidential sources in Japan) are that some of the problems were software-related.

So let's look at the quality and testing lessons IT professionals can draw from Toyota's debacle. Who knows, this article might help you to avoid a sweaty session in front of angry members of Congress. Let's start with that quote from Toyoda because it's so categorical, and so wrong:

"We know that the problem is not software because we tested it."

Size can deceive. Consider bridges. The Sydney Harbour Bridge, the Golden Gate Bridge and the Tsing Ma Bridge are enormous structures. However, they are built of well-understood engineering materials such as concrete, steel, stone and asphalt, all of which have well-defined engineering, physical and chemical properties.

Being physical objects, these bridges obey the laws of physics and chemistry, as do the materials that interact with them (air, water, rubber, pollution, salt and so forth). Further, we've been building bridges for thousands of years. We know how bridges behave and how they fail. Ironically enough, given some of the lessons in this article, our ability to use computers to design and simulate bridges has increased their reliability even further.

Size notwithstanding, a bridge is a simpler thing to test than a Toyota Prius. The complex system of systems that controls the Prius has far too many states, lines of code, data flows, use cases, sequences of events and transient failures to recover from for anyone to test them all.

Consider this example: Engineers at Sun Microsystems told an associate of mine that the number of possible internal states in a single Solaris server is 10,000 times greater than the number of molecules in the universe. How long do you have to test?
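
To get a feel for the arithmetic behind that question, here is a minimal sketch in Python. It does not model Solaris or a Prius. It simply assumes a hypothetical system whose state fits in a given number of bits and a hypothetical test rig that can check one billion states per second. The function names and the test rate are illustrative assumptions, not figures from Sun or Toyota.

```python
# Illustrative only: how quickly a state space outgrows any test budget.
# The bit counts and the test rate below are assumptions chosen for the example.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def state_count(bits: int) -> int:
    """Number of distinct states representable with `bits` bits of state."""
    return 2 ** bits


def years_to_test(bits: int, tests_per_second: int) -> float:
    """Years needed to visit every state once at a fixed, assumed test rate."""
    return state_count(bits) / (tests_per_second * SECONDS_PER_YEAR)


if __name__ == "__main__":
    # Hypothetical rig: one billion state checks per second.
    rate = 1_000_000_000
    for bits in (32, 64, 128, 300):
        print(
            f"{bits:>4} bits of state: "
            f"{state_count(bits):.3e} states, "
            f"{years_to_test(bits, rate):.3e} years to cover"
        )
```

Under these assumptions, a mere 64 bits of state (a single machine word) would take roughly 585 years to cover exhaustively, and each additional bit doubles the bill. Real systems carry vastly more state than that, which is why "we tested it" can never mean "we tested all of it."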