By Peter Coffee  |  Posted 2004-02-09

How do people depart from Bayesian methods? They consistently give excessive attention to data that confirm their original beliefs, but at the same time, they're too conservative in using that data to update estimates of likelihood. In one example, subjects were presented with data showing an 80 percent chance that a given test result was positive, then given a second independent datum indicating a 67 percent chance of a positive outcome: They consistently underestimated the updated "positive" probability of 89 percent that they should have derived from those combined inputs.

Since many people find this counterintuitive, let me work out the numbers. When we start, with no information, we have to treat the two possible states as equally likely. The first positive test result changes that estimate, though: we now think there's an 80 percent chance of one state, call it T, and only a 20 percent chance of the other, call it F.
A further indication, if it's presumed to be independent, provides a further increase in that estimated likelihood. We don't average the probabilities. It works, instead, like this.
We want to determine the chance that the reality is T, given that the second test result is positive, conditioned on our updated belief after the first test was positive. The overall chance of a positive outcome from the second test is 0.67 * 0.8 (the chance that we're seeing an accurate result on the second test, in the 80 percent-likely case that the first test was right) plus 0.33 * 0.2 (the chance of an inaccurate positive result on the second test, in the 20 percent-likely case that the first test was wrong). The conditional chance of T, given a positive result from the second test, is therefore the chance that the positive result is correct, divided by the overall chance of getting that positive result. The final probability is therefore (0.67 * 0.8) / ((0.67 * 0.8) + (0.33 * 0.2)) = 0.89.
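For readers who would rather check the arithmetic than take it on faith, the two-step update above can be sketched in a few lines of Python. This is only an illustration: the function name `update` and its parameters are my own invention, and it assumes, as the worked numbers do, that each test is equally likely to be right whether the true state is T or F.

```python
def update(prior_t: float, accuracy: float) -> float:
    """Return P(T | positive result) from a prior P(T) and a test's accuracy.

    Assumption (hypothetical helper, not from any library): the test reports
    "positive" with probability `accuracy` when the state is T, and with
    probability 1 - accuracy when the state is F.
    """
    true_positive = accuracy * prior_t
    false_positive = (1.0 - accuracy) * (1.0 - prior_t)
    return true_positive / (true_positive + false_positive)


# With no information, treat T and F as equally likely.
after_first = update(0.5, 0.8)           # first positive result
after_both = update(after_first, 0.67)   # second, independent positive result

print(round(after_first, 2))  # 0.8
print(round(after_both, 2))   # 0.89
```

Note that the second call takes the output of the first as its prior, which is all that "conditioned on our updated belief" means in practice.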
Note that the sequence of the tests doesn't matter: the same tests in opposite order produce the same result. Bayesian analysis yields even more striking results when we're talking, for example, about a 90 percent-accurate test for something that only 0.1 percent of the population actually has: in such a case, the chance that a positive test is correct is only (0.9 * 0.001) / ((0.9 * 0.001) + (0.1 * 0.999)), or 0.9 percent, since the overwhelming majority of positive results are false positives despite the accuracy of the test. Again, it's a counterintuitive result.
For enterprise IT builders, this body of knowledge has important implications. If people are going to anchor on their initial beliefs, it's the job of business intelligence tools to gather all possible information to shape those beliefs before they harden. If people are going to underestimate the importance of new data, it's crucial to give them analytic tools that help them use Bayes' insights. Merely being exposed to the results of Bayesian calculations, studies have shown, shifts subsequent decision making in a more objective direction.
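The rare-condition figure and the claim that test order doesn't matter can both be checked with a short Python sketch. The function `posterior` and its parameter names are hypothetical, chosen for illustration; the sketch assumes that "90 percent-accurate" means the test is right 90 percent of the time for people with and without the condition alike, which is what the formula in the text implies.

```python
def posterior(prior: float, hit_rate: float, false_alarm_rate: float) -> float:
    """P(condition | positive result) by Bayes' rule (illustrative helper).

    hit_rate: chance of a positive result when the condition is present.
    false_alarm_rate: chance of a positive result when it is absent.
    """
    true_positives = hit_rate * prior
    false_positives = false_alarm_rate * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# A 90 percent-accurate test for something 0.1 percent of people have.
rare = posterior(prior=0.001, hit_rate=0.9, false_alarm_rate=0.1)
print(f"{rare:.1%}")  # 0.9%

# Order check: apply the 80 percent and 67 percent tests both ways round,
# starting from an even 50/50 prior.
a = posterior(posterior(0.5, 0.8, 0.2), 0.67, 0.33)
b = posterior(posterior(0.5, 0.67, 0.33), 0.8, 0.2)
print(abs(a - b) < 1e-12)  # True
```

The comparison uses a small tolerance rather than strict equality because the two orderings can differ in the last bits of floating-point rounding, even though they are algebraically identical.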

Meanwhile, my prior expectation is that a Bayesian e-mail filter will fall short of that ideal—but I remain appropriately open to having my mind changed by evidence.

Technology Editor Peter Coffee's e-mail address is peter_coffee@ziffdavis.com.

Editor's note: This story has been modified since its original posting to add content for clarity.

Peter Coffee is Director of Platform Research at salesforce.com, where he serves as a liaison with the developer community to define the opportunity and clarify developers' technical requirements on the company's evolving Apex Platform. Peter previously spent 18 years with eWEEK (formerly PC Week), the national news magazine of enterprise technology practice, where he reviewed software development tools and methods and wrote regular columns on emerging technologies and professional community issues. Before he began writing full-time in 1989, Peter spent eleven years in technical and management positions at Exxon and The Aerospace Corporation, including management of the latter company's first desktop computing planning team and applied research in applications of artificial intelligence techniques. He holds an engineering degree from MIT and an MBA from Pepperdine University, and has held teaching appointments in computer science, business analytics and information systems management at Pepperdine, UCLA, and Chapman College.
