Many once-touted online services are now in the dustbin of history. But one, data mining via the Internet, makes a lot of sense to its practitioners at WhiteCross Systems of London, Chicago and San Francisco.
WhiteCross's founders came out of Teradata, the producer of massively parallel machines. A Teradata machine was adopted by Wal*Mart for data mining and was one of the secrets behind its phenomenal growth. Wal*Mart store managers could query the Teradata machine in Bentonville, Ark., to find out what was selling at their own stores and at other Wal*Mart stores in the region. If they were operating in a region where a big snowstorm was forecast, they might find they were out of snow shovels and initiate a restocking.
“We had extensive experience in cleaning and loading widely divergent data sets,” noted John Thompson, vice president of worldwide marketing. “It was a hidden competency of the company” and so routine that no one thought of it as a business service. “But there are not a lot of people who can do it well,” he added.
Instead of dealing with retail data, however, WhiteCross is ready to upload your Web site activity data. Its data warehousing and data mining systems take data from Web server logs and other server logs, scrub it, load it and analyze it. “We take out the disallowed traffic, from Web crawlers, internal employee activity, and robot pingers,” Thompson noted.
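The scrubbing step Thompson describes can be pictured with a small Python sketch. It is only an illustration, not WhiteCross's method: the record fields, the user-agent markers and the internal address range are all assumptions made for the example.

```python
from ipaddress import ip_address, ip_network

# Hypothetical rules; a real service would get these from the site operator.
BOT_MARKERS = ("bot", "crawler", "spider", "pinger")
INTERNAL_NETWORKS = [ip_network("10.0.0.0/8")]


def is_real_user(hit):
    """Return True if a log record looks like genuine visitor traffic."""
    agent = hit["user_agent"].lower()
    if any(marker in agent for marker in BOT_MARKERS):
        return False  # crawler or robot pinger
    addr = ip_address(hit["ip"])
    # Traffic from the company's own network counts as employee activity.
    return not any(addr in net for net in INTERNAL_NETWORKS)


def scrub(hits):
    """Keep only the records that pass the real-user check."""
    return [hit for hit in hits if is_real_user(hit)]


sample_hits = [
    {"ip": "203.0.113.7", "user_agent": "Mozilla/4.0", "path": "/"},
    {"ip": "203.0.113.9", "user_agent": "ExampleBot/1.0 crawler", "path": "/"},
    {"ip": "10.1.2.3", "user_agent": "Mozilla/4.0", "path": "/intranet"},
]
clean_hits = scrub(sample_hits)
```

In this sample only the first record survives: the second is dropped for its user agent and the third for its internal address.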
By examining the traffic that reflects real user activity, WhiteCross can look for patterns and trends beneath the surface. It can then link traffic to marketing campaigns, checking how many visitors came from linked sites or addresses where the company has click-through ads.
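As a rough illustration of linking traffic to campaigns, the Python sketch below counts visits by the host named in each record's referrer. The `referrer` field and the campaign host names are assumptions for the example, not details of the WhiteCross service.

```python
from collections import Counter
from urllib.parse import urlparse


def visits_by_referrer(hits, campaign_hosts):
    """Count visits whose referrer host is one where ads or links were placed."""
    counts = Counter()
    for hit in hits:
        # hostname is None when the referrer is missing or empty.
        host = urlparse(hit.get("referrer", "")).hostname
        if host in campaign_hosts:
            counts[host] += 1
    return counts


# Hypothetical campaign placements and scrubbed traffic.
campaign_hosts = {"ads.example.com", "partner.example.org"}
hits = [
    {"referrer": "http://ads.example.com/banner1"},
    {"referrer": "http://ads.example.com/banner2"},
    {"referrer": "http://partner.example.org/links"},
    {"referrer": ""},
    {"referrer": "http://other.example.net/"},
]
campaign_counts = visits_by_referrer(hits, campaign_hosts)
```

Here two visits are credited to `ads.example.com` and one to `partner.example.org`; the direct visit and the unrelated referrer are ignored.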
The company was founded in 1992 in an attempt to sell data warehousing systems on the massively parallel machines that it built. One of the machines it uses for customer data analysis has 556 processors, Thompson noted. Data mining lends itself to parallel processing because a large database can be divided up, with each processor addressing one section of it. When a query is submitted, many processors work on it simultaneously.
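The divide-and-combine idea can be sketched in a few lines of Python. This is a toy model under stated assumptions: a thread pool stands in for the separate processors, the function names are invented for the example, and nothing here reflects how the WhiteCross hardware is actually programmed.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_parts):
    """Split the data into n_parts sections, one per worker."""
    return [rows[i::n_parts] for i in range(n_parts)]


def count_matching(rows, predicate):
    """The work done by one worker: scan only its own section."""
    return sum(1 for row in rows if predicate(row))


def parallel_count(rows, predicate, n_workers=4):
    """Run the same query on every section at once, then combine the results."""
    parts = partition(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = pool.map(count_matching, parts, [predicate] * len(parts))
        return sum(partials)


rows = list(range(1000))
even_count = parallel_count(rows, lambda r: r % 2 == 0)
```

Each worker answers the query for its own slice, and the partial counts are added together at the end. Counting queries combine this simply; other queries may need a more involved merge step.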
WhiteCross needed a lot of data mining expertise just to sell the machines, and a lot of follow-up services to help customers implement them. It has evolved into selling its Web Analytics service instead of the hardware. The leading broadband service company in the UK, NTL, uses WhiteCross to analyze its telephone and dial-up customer data.
The Web Analytics service helps NTL to “differentiate packages to suit each customer's calling patterns,” said Fraser Hopewell, head of telephony at NTL. Such deals typically run $35,000 a month or more, said Thompson. WhiteCross can answer such questions as which customers use the phone the most during business hours, or which customers call Australia more frequently than their neighbors.
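A question such as which customers use the phone most during business hours reduces to a filter and a ranked total. The Python sketch below shows one way to express it. The call-record fields and the definition of business hours (weekdays, 9 a.m. to 5 p.m.) are assumptions for the example, not NTL's or WhiteCross's.

```python
from collections import defaultdict
from datetime import datetime

# Assumed definition of business hours: Monday to Friday, 09:00 to 17:00.
BUSINESS_START, BUSINESS_END = 9, 17


def business_hours_minutes(calls):
    """Rank customers by total call minutes started during business hours."""
    totals = defaultdict(float)
    for call in calls:
        start = call["start"]
        if start.weekday() < 5 and BUSINESS_START <= start.hour < BUSINESS_END:
            totals[call["customer"]] += call["minutes"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical call records; 6 March 2000 was a Monday, 4 March a Saturday.
sample_calls = [
    {"customer": "A", "start": datetime(2000, 3, 6, 10, 0), "minutes": 30},
    {"customer": "A", "start": datetime(2000, 3, 6, 20, 0), "minutes": 60},
    {"customer": "B", "start": datetime(2000, 3, 6, 14, 0), "minutes": 45},
    {"customer": "B", "start": datetime(2000, 3, 6, 9, 30), "minutes": 10},
    {"customer": "A", "start": datetime(2000, 3, 4, 11, 0), "minutes": 100},
]
ranking = business_hours_minutes(sample_calls)
```

Customer B ranks first with 55 minutes and customer A second with 30, because A's evening and weekend calls fall outside the assumed window.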
Another customer is Sprint, which accumulates a terabyte of customer data every six days, said Thompson.
WhiteCross offers a low-end Web Analytics service for $25,000 a month that captures and scrubs data and performs a preliminary analysis of it. Further analysis can be done by WhiteCross's Data Exploration Server, which returns detailed reports, or the customer may purchase the software and run the data analysis in-house, Thompson said.