Millions of NCAA college basketball fans have experienced it—the agony of watching a carefully chosen bracket disintegrate. But college educators Jay Coleman and Allen Lynch have found a way to minimize that type of frustration by combining basketball knowledge with modern technology.
Using business intelligence software from SAS, the duo have created a "Score Card" model that predicts NCAA Basketball Tournament game winners as well as a "Dance Card" to forecast the at-large bids for teams selected for the tournament. Their Dance Card selected the correct at-large bids about 94 percent of the time during the past several years, and their Score Card model correctly predicted the winners the vast majority of the time as well.
While the news may excite sports fans, it also underscores how forecasting solutions can be applied in the real world. Any business that has quantitative data can benefit from the technology, said Coleman, associate dean at the University of North Florida in Jacksonville.
"Yeah its basketball, but we are really talking about predicting the decision of a committee," he said.
The duos basketball-themed models are just one of the ways predictive analytical software can be used. For example, banks can use it to forecast a customers credit risk, said Anne Milley, SAS director of technology product marketing.
"Everybody wants to make decisions with confidence," she said.
Especially when it comes to March Madness. Coleman and Lynch, who is an economics professor at Mercer University in Georgia, are using SAS analytical software in their quest for the perfect bracket. In 1999, they set out to find a way to accurately predict the at-large bids for the NCAA tournament. Eventually, they came up with a list of several deciding factors: a team's RPI (Ratings Percentage Index) ranking, its conference's RPI ranking, its number of wins against teams ranked in the top 25, its record within its conference, and its record against teams ranked 26-50 and 51-100 in RPI, respectively.
The Dance Card, developed from 1994 through 1999 data, has never missed more than three spots in any season. During the past seven years, it has correctly predicted 224 of the 239 available at-large tournament slots.
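The arithmetic behind those figures can be checked in a few lines. The sketch below uses only the two counts quoted above; the variable names are illustrative and are not drawn from the researchers' own work.

```python
# Counts quoted in the article for the past seven tournament seasons
correct_bids = 224      # at-large slots the Dance Card predicted correctly
available_bids = 239    # at-large slots available over the same period
seasons = 7

missed_bids = available_bids - correct_bids
accuracy_pct = round(100 * correct_bids / available_bids, 1)
avg_missed_per_season = round(missed_bids / seasons, 2)

print(f"Accuracy: {accuracy_pct}%")
print(f"Missed slots: {missed_bids} (about {avg_missed_per_season} per season)")
```

The result is consistent with the "about 94 percent" figure and with an average of roughly two misses per season.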
For the Score Card, the variables are the team's record in the last 10 games, the strength of the conference based on the non-conference RPI, whether a team won its regular season conference championship and a team's RPI value. The values for each of the four pieces of information are entered into the Score Card formula, which results in a Score Card value for that team. The team with the higher Score Card value would be the team predicted to win the game.
Using this procedure on data from 2001 to 2004, the years of data on which the formula was built, the current version of the Score Card would have correctly predicted the winner in 74 percent of tournament games. For games in 2000, 2005 and 2006, the Score Card would have correctly predicted the winner 73 percent of the time.
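The article does not publish the Score Card formula, so the following is only a minimal sketch of how such a comparison might work. It assumes a simple weighted sum and that the higher value wins; the weights, the team figures and every name in the code are hypothetical placeholders, not Coleman and Lynch's actual coefficients or data.

```python
from dataclasses import dataclass


@dataclass
class TeamStats:
    name: str
    last10_wins: int        # wins in the team's last 10 games
    conf_strength: float    # conference strength from non-conference RPI
    won_conf_title: bool    # won the regular season conference championship
    rpi: float              # the team's RPI value


# Hypothetical weights for illustration only; the real formula is not given.
WEIGHTS = {"last10": 0.1, "conf": 1.0, "title": 0.5, "rpi": 10.0}


def score_card_value(team: TeamStats) -> float:
    """Combine the four inputs into one value (assumed linear form)."""
    return (
        WEIGHTS["last10"] * team.last10_wins
        + WEIGHTS["conf"] * team.conf_strength
        + WEIGHTS["title"] * (1.0 if team.won_conf_title else 0.0)
        + WEIGHTS["rpi"] * team.rpi
    )


def predict_winner(a: TeamStats, b: TeamStats) -> TeamStats:
    """Pick the team with the higher value; ties go to the first team."""
    return a if score_card_value(a) >= score_card_value(b) else b


# Made-up teams, purely to show the comparison
team_a = TeamStats("Team A", 8, 0.55, True, 0.62)
team_b = TeamStats("Team B", 6, 0.50, False, 0.58)
print(predict_winner(team_a, team_b).name)
```

In practice the weights would be estimated from past tournament results rather than chosen by hand, which is the role the statistical software plays.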
Arriving at the right variables took some trial and error, Coleman said. In fact, certain things he and Lynch thought would matter did not improve their success rate, he said. In addition, some variables had no positive impact on one card but increased the accuracy of the other.
"Record in the last 10 games doesn't really add anything to the predictive power of the model," Coleman said of the Dance Card.
Correctly predicting behavior, whether it's of customers or NCAA officials, requires more than just software, though, Coleman said. Analysts also need to have knowledge of the problem they are trying to solve to determine what variables to use, he explained.