The cover of the Jan. 15 issue of the prestigious science journal Nature is striking. Viewed from above, a tennis player swings her racket toward a contour map of concentric ovals, representing her estimate of the location of the ball, as it streaks toward her. The caption archly inquires, “Anyone for bayesian integration?”
If I were trolling for a high-IQ dinner partner, I'd definitely try reading that magazine, with the cover plainly visible, at the courtside cafe of the local tennis club. I'm concerned, though, that the adjective “Bayesian” has been getting an awful lot of sloppy play these days in connection with e-mail filtering. I fear that, like “artificial intelligence” and “expert system” before it, a useful term of art is in danger of being muddled by hype to the point that both its meaning and its value to critical decision-support applications, as well as to mundane junk-mail cleanup, are lost.
First, the word is properly rendered with a capital “B” out of respect for the work of 18th-century mathematician the Rev. Thomas Bayes. I suppose it's a backhanded compliment to the gentleman that his name has become a part of the language. Bayes even has a fan club, the International Society for Bayesian Analysis, celebrating its 12th birthday this year.
And Bayesian analysis yields more than 50,000 hits on Google, so Bayes is probably not spinning in his grave at any lack of attention to his name. He'd take exception, though, I'm sure, to its vague application, as if it were merely a synonym for “statistical” or “probability-based.” The essence of Bayesian analysis, he'd be certain to say to anyone who would listen, is forming an updated estimate based on a combination of prior belief and objective observation, instead of starting from scratch without regard for prior experience. Mathematically, Bayes' theorem gives us an objective way of taking what we expect before we get a new piece of information and changing that expectation based on what we've just learned.
Let's translate this to specifics of product evaluation. If a company wants to call an e-mail filter Bayesian, it should be able to answer two questions. First, how good are the initial assessments of the chance that something is unwanted e-mail, before obtaining feedback from the user? I'd want to know how a particular filtering technology expresses those likelihoods, based on what criteria, and how it updates those baseline estimates as mass e-mailers adopt new techniques.
Second, I'd want to know how well the tool incorporates an individual user's feedback. How much of a nuisance is it for the user to provide that input, and how well does the filtering tool use it? That is not the same, I hasten to point out, as asking how faithfully the tool does what I tell it to do, because people themselves aren't as consistently analytical (that is, as Bayesian) as they might like to think.
How do people depart from Bayesian methods? They consistently give excessive attention to data that confirm their original beliefs, but at the same time, they're too conservative in using those data to update estimates of likelihood. In one example, subjects were presented with data showing an 80 percent chance that a given test result was positive, then given a second independent datum indicating a 67 percent chance of a positive outcome: They consistently underestimated the updated “positive” probability of 89 percent that they should have derived from those combined inputs.
Since many people find this counterintuitive, let me work out the numbers. When we start, with no information, we have to treat the two possible states as equally likely. The first positive test result changes that estimate, though: we now think there's an 80 percent chance of one state, call it T, and only a 20 percent chance of the other, call it F.
A further indication, if it's presumed to be independent, provides a further increase in that estimated likelihood. We don't average the probabilities. It works, instead, like this.
We want to determine the chance that the reality is T, given that the second test result is positive, conditioned on our updated belief after the first test was positive.
The overall chance of a positive outcome from the second test is 0.67 * 0.8 (the chance that we're seeing an accurate result on the second test, in the 80 percent-likely case that the first test was right) plus 0.33 * 0.2 (the chance of an inaccurate positive result on the second test, in the 20 percent-likely case that the first test was wrong).
The conditional chance of T, given a positive result from the second test, is therefore the chance that the positive result is correct, divided by the overall chance of getting that positive result. The final probability is therefore (0.67 * 0.8) / ((0.67 * 0.8) + (0.33 * 0.2)) = 0.89.
Note that the sequence of the tests doesn't matter: the same tests in opposite order produce the same result.
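For readers who would rather check the arithmetic than take it on faith, the two-test example can be sketched in a few lines of Python. This is only an illustration, not anyone's production filter: the function name `update` and its parameters are made up for this sketch, and it assumes that each test's quoted figure (80 percent, 67 percent) is the chance of a positive result when the state really is T, with the complement being the chance of a positive result when the state is F.

```python
def update(prior_t, p_pos_given_t, p_pos_given_f):
    """Return the updated chance of state T after one positive result.

    Hypothetical helper applying Bayes' theorem to two states, T and F.
    """
    # Chance of being in T and seeing a positive result.
    numerator = p_pos_given_t * prior_t
    # Overall chance of a positive result, whichever state is real.
    evidence = numerator + p_pos_given_f * (1.0 - prior_t)
    return numerator / evidence


# With no information, treat T and F as equally likely.
prior = 0.5

# Assumption: the 80 percent test is positive 80% of the time under T
# and 20% of the time under F; likewise 67% / 33% for the second test.
after_first = update(prior, 0.80, 0.20)
after_both = update(after_first, 0.67, 0.33)

# The same two tests applied in the opposite order.
reverse = update(update(prior, 0.67, 0.33), 0.80, 0.20)

print(round(after_first, 2), round(after_both, 2), round(reverse, 2))
# prints: 0.8 0.89 0.89
```

The posterior from the first test simply becomes the prior for the second, which is why the order of the tests makes no difference to the final figure.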
Bayesian analysis yields even more striking results when we're talking, for example, about a 90 percent-accurate test for something that only 0.1 percent of the population actually has: in such a case, the chance that a positive test is correct is only (0.9 * 0.001) / ((0.9 * 0.001) + (0.1 * 0.999)), or 0.9 percent, since the overwhelming majority of positive results are false positives despite the accuracy of the test.
Again, it's a counterintuitive result.
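The rare-condition case is easy to verify the same way. The Python sketch below is a rough check under a stated assumption: that “90 percent accurate” means the test is positive for 90 percent of those who have the condition and for 10 percent of those who do not, which is how the formula above treats it. The function and parameter names are illustrative only.

```python
def chance_positive_is_correct(prevalence, hit_rate, false_alarm_rate):
    """Chance that a positive result is a true positive (hypothetical helper)."""
    # Share of the whole population that has the condition and tests positive.
    true_positives = hit_rate * prevalence
    # Share that does not have the condition but tests positive anyway.
    false_positives = false_alarm_rate * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Assumption: 90% of real cases test positive, 10% of non-cases test positive.
result = chance_positive_is_correct(
    prevalence=0.001, hit_rate=0.9, false_alarm_rate=0.1
)

print(f"{result:.1%}")
# prints: 0.9%
```

Put in head counts: of 100,000 people, 100 have the condition and 90 of them test positive, while 9,990 of the other 99,900 also test positive, so only 90 of the 10,080 positive results are genuine.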
For enterprise IT builders, this body of knowledge has important implications. If people are going to anchor on their initial beliefs, it's the job of business intelligence tools to gather all possible information to shape those beliefs before they harden. If people are going to underestimate the importance of new data, it's crucial to give them analytic tools that help them use Bayes' insights. Merely being exposed to the results of Bayesian calculations, studies have shown, shifts subsequent decision making in a more objective direction.
Meanwhile, my prior expectation is that a Bayesian e-mail filter will fall short of that ideal—but I remain appropriately open to having my mind changed by evidence.
Technology Editor Peter Coffee's e-mail address is peter_coffee@ziffdavis.com.
Editor's note: This story has been modified since its original posting to add content for clarity.