Page Two

By eweek  |  Posted 2002-10-02

A New Pipeline for Customer Feedback

Let's acknowledge a sad truth about software: any code of significant scope and power will have bugs in it. Even a relatively simple software product today has millions of lines of code that provide many places for bugs to hide. That's why our customers still encounter bugs despite the rigorous and extensive stress testing and beta testing we do. With Windows 2000 and Windows XP, we dramatically improved the stability and reliability of our platform, and we eliminated many flaws, but we did not find all the bugs in these or other products. Nor did we find all the software conflicts that can cause applications to freeze up or otherwise fail to perform as expected.
The process of finding and fixing software problems has been hindered by a lack of reliable data on the precise nature of the problems customers encounter in the real world. Freeze-ups and crashes can be incredibly irritating, but rarely do customers contact technical support about them; instead, they close the program. Even when customers do call support and we resolve a problem, we often do not glean enough detail to trace its cause or prevent it from recurring.
To give us better feedback, a small team in our Office group built a system that helps us gather real-world data about the causes of customers' problems, in particular about crashes. This system is now built into Office, Windows, and most of our other major products, including our forthcoming Windows .NET Servers. It enables customers to send us an error report, if they choose, whenever anything goes wrong.
There are risks in offering this option to have software "phone home" like E.T. One risk is that error reporting could compound a customer's irritation over the error itself. We therefore worked hard to make reporting simple and quick. We developed a special format, called a "minidump," to minimize the size of the report so that it can be transferred in a few seconds with a single mouse click.
Also, customers may wonder what we do with their reports and whether their privacy is protected. We use advanced security technologies to help protect these error reports, which are gathered on a cluster of dedicated Microsoft servers and are used for no other purpose than to find and fix bugs. Engineers look at stack details, some system information, a list of loaded modules, the type of exception, and global and local variables.
We've been amazed by the patterns revealed in the error reports that customers are sending us. The reports identify bugs not only in our own software, but in Windows-based applications from independent hardware and software vendors as well. One really exciting thing we learned is how, among all the software bugs involved in reports, a relatively small proportion causes most of the errors. About 20 percent of the bugs cause 80 percent of all errors, and (this is stunning to me) one percent of bugs cause half of all errors. With this immensely valuable feedback from our customers, we're now able to prioritize debugging work on our products to achieve the biggest improvement in customers' experience.
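The prioritization idea described above can be illustrated with a minimal sketch. It is not Microsoft's actual system: the function name, the notion of a "bug signature" string per report, and the sample data are all assumptions made for illustration. Given one signature per incoming error report, it finds the smallest set of most-frequent bugs whose reports account for a target share of all errors.

```python
from collections import Counter


def bugs_covering_share(report_signatures, target_share):
    """Pick the most frequently reported bugs until their reports
    make up at least `target_share` (0.0 to 1.0) of all reports.

    `report_signatures` is a hypothetical list with one bug identifier
    per error report received.
    Returns (selected_signatures, share_actually_covered).
    """
    counts = Counter(report_signatures)
    total = sum(counts.values())
    if total == 0:
        return [], 0.0

    selected = []
    covered = 0
    # most_common() yields (signature, count) pairs, highest count first.
    for signature, n in counts.most_common():
        if covered >= target_share * total:
            break
        selected.append(signature)
        covered += n
    return selected, covered / total


# Made-up sample: 100 reports spread unevenly over five bugs.
reports = (
    ["bug_a"] * 50 + ["bug_b"] * 30 + ["bug_c"] * 10
    + ["bug_d"] * 5 + ["bug_e"] * 5
)

half, half_share = bugs_covering_share(reports, 0.5)
most, most_share = bugs_covering_share(reports, 0.8)
print(half, half_share)
print(most, most_share)
```

In this invented sample, one bug out of five explains half the reports and two explain 80 percent, which mirrors the kind of skew described in the text. Fixing bugs in the order returned gives the largest reduction in reported errors per fix.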
And as the work proceeds based on this new source of systematic data, the improvement will be dramatic. Already, in Windows XP Service Pack 1, error reporting enabled us to address 29 percent of errors involving the operating system and applications running on it, including a large number of third-party applications. Error reporting helped us to eliminate more than half of all Office XP errors with Office XP Service Pack 2. Work continues to find and fix remaining bugs in these and other existing products, but error reporting is now also helping us to resolve more problems before new products are released.
Visual Studio .NET, released last February, was one of our first products to benefit from the use of error-reporting data throughout its beta testing. Error reporting enabled us to log and fix 74 percent of all crashes reported in the first beta version. Many other problems were caught and eliminated in subsequent testing rounds.
And we're not keeping this great tool to ourselves. We're working with independent hardware and software vendors to help them use our error-reporting data to improve their products, too. Some 450 companies have accessed our database of error reports related to their drivers, utilities and applications. Marked decreases in some types of errors have followed. Those involving third-party firewall software, for example, have dropped 67 percent since the first of the year. Also, we've created software that enables corporations to redirect error reports to their own servers, so that administrators can find and resolve the problems that are having the most impact on their systems.

