The Processor Scavengers

Distributed computing startups feed CPU-hungry applications.

A hive of bees is always smarter and stronger than any single drone.

That's the collectivist concept behind distributed computing, a model in which computing problems are parceled out over a network to individual computers for processing. Proponents of distributed computing posit that for certain kinds of tasks, a million relatively low-powered computers nibbling away at tiny pieces of the problem amount to a "virtual supercomputer," far more efficient than an actual supercomputer.
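To make the "parceling out" idea concrete, here is a minimal sketch in Python. It is not any vendor's software: the function names are hypothetical, the sum-of-squares job is a placeholder for real work, and local threads stand in for the networked PCs a real system would use.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_work_units(items, unit_size):
    """Cut one big job into small, independent pieces."""
    return [items[i:i + unit_size] for i in range(0, len(items), unit_size)]


def process_work_unit(unit):
    """Placeholder computation that a single PC would run on its piece."""
    return sum(x * x for x in unit)


def run_distributed(items, unit_size=1000, workers=8):
    """Hand each work unit to a worker and combine the partial results."""
    units = split_into_work_units(items, unit_size)
    # Assumption: threads stand in for remote machines. A real system would
    # ship each unit over the network and collect the results as they return.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(process_work_unit, units))
    return sum(partial_results)


if __name__ == "__main__":
    print(run_distributed(list(range(1_000_000))))
```

The approach only pays off when the pieces are independent of one another, which is why it suits "certain kinds of tasks" rather than all of them.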

"A supercomputer of 10 years ago is in an Intel Pentium 4 chip today," says David Wilson, vice president of marketing and business development of United Devices, a distributed computing startup in Austin, Texas.

Distributed computing, which can be seen as part of the larger peer-to-peer computing universe, seems like a promising idea. It has won enthusiastic backers like Intel, which sees Internet-connected grids of PCs as a great use for its ever-faster microprocessors. In April, Intel launched a philanthropic program that lets Internet users donate their computing resources to institutions conducting research on cures for cancer and other serious diseases. Since then, more than 1 million people have downloaded Intel's distributed computing agent, and their PCs at any given moment provide as much data processing capacity as today's 10 fastest supercomputers put together, according to Intel.

But such large-scale successes have been confined to philanthropic programs to aid not-for-profit research groups. Indeed, the original proof of concept for the nascent distributed computing industry is the SETI@home project at the University of California at Berkeley, which uses screen savers on millions of volunteers' computers to scan radio telescope data for signs of alien life.

Turning distributed computing into a business, and selling it to customers who have historically run their processing-intensive programs on multimillion-dollar mainframes, is a different matter. At least one startup, Popular Power, was forced to shut its doors earlier this year when it ran out of money.

Nevertheless, several distributed computing ventures, including DataSynapse, Entropia and United Devices, have started to find a real market — read: paying customers — among companies and research institutions in specialized fields that require copious amounts of computing power, like life sciences, financial services, and gas and oil exploration.

Andrew Chien, Entropia's co-founder and chief technology officer, thinks distributed computing was generally oversold as a technology for mainstream computing.

"We continue to believe that distributed computing has very, very broad applications, but we've decided our primary focus is around companies that have computational needs that are clustered around biochemical modeling," he says. "That community has a huge demand for new computational power, and that makes them eager customers for distributed computing."

It's All About the Apps

One of the obstacles faced by the distributed computing firms is that — like operating system vendors — they need applications that run on their platforms to make them useful.

To that end, Entropia, which is focusing on pharmaceutical customers, has teamed with Turbo-Genomics, which specializes in high-performance bioinformatics software. Entropia has lined up pilot customers, including drug companies Bristol-Myers Squibb and Novartis.

Avaki, a startup in Cambridge, Mass., also is hoping to sell its distributed computing system to life sciences companies. Last week, Avaki rolled out the second version of its software, which enhances security and failure recovery features, and said it has received $16 million in funding from Polaris Venture Partners, General Catalyst and Sofinnova.

DataSynapse is aiming at different vertical industry segments, but it's also focused on customer-specific applications. The New York company first courted the financial services industry, and its customers in this area include First Union bank. Now DataSynapse is expanding into the energy market with its newest product, LiveCluster, which is specially designed for data-intensive processing tasks. It is well suited to oil and gas companies that need to process terabytes of data in looking for underground energy deposits, says Peter Lee, CEO of DataSynapse.

"The world of distributed computing is going to be very domain-specific," Lee says.

Meanwhile, United Devices recently signed a deal with Accelrys, which sells software to biotechnology and pharmaceutical companies, to combine Accelrys' applications with United Devices' MetaProcessor distributed computing system. By the end of the month, United Devices plans to release an updated version of its system, geared toward making it easier to manage. MetaProcessor 2.1 will have agent software that can run invisibly in Windows as a service and can be automatically deployed to desktops.

United Devices scored publicity as the company that provided the infrastructure for Intel's first computing program, which has helped University of Oxford scientists screen 3.5 billion molecular models in researching cancer-fighting drugs.

Now it wants to evolve from a company with an interesting technology to one that is profitable. In August it raised $18.2 million in second-round funding, from AOL Time Warner Ventures, GE Equity and Intel Capital, among others.

Part of United Devices' sales pitch is cost savings, a message it hopes resonates with organizations cutting back on capital IT spending. The United Devices system, which starts at $250 per desktop, is an inexpensive way to increase processor capacity by letting companies take advantage of the hardware they already own, Wilson says.

Entropia's story is similar. According to its research, an average desktop PC sits idle 90 percent of the time, wasting thousands of hours of processing time. But Chien says that Entropia's value isn't merely in providing high-performance computing cheaply. A drug company Entropia was working with had a simulation that would have taken 57 days to run. With a few thousand nodes connected via Entropia's software agents, it could complete the same simulation in a few hours.

"It's a difference between not being able to do it, and doing it," Chien says. "You would never be able to go out and buy enough Sun [Microsystems] boxes to do that."
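The arithmetic behind those figures can be checked with a back-of-the-envelope sketch in Python. The 57 days and the 90 percent idle rate come from Entropia's account above; the node count of 3,000 and the efficiency values are illustrative assumptions, since the article says only "a few thousand nodes," and the function names are hypothetical.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365


def idle_hours_per_year(idle_fraction):
    """Hours per year one PC sits unused, given the fraction of time it is idle."""
    return idle_fraction * HOURS_PER_DAY * DAYS_PER_YEAR


def estimated_hours(serial_days, nodes, efficiency=1.0):
    """Rough run time when a job is spread across many nodes.

    efficiency is an assumed fraction (0 to 1) of each node's time that goes
    to useful work; the rest is lost to scheduling, data transfer and PCs
    that are busy or switched off.
    """
    serial_hours = serial_days * HOURS_PER_DAY
    return serial_hours / (nodes * efficiency)


if __name__ == "__main__":
    # 90 percent idle works out to roughly 7,900 hours per PC per year.
    print(round(idle_hours_per_year(0.90)))
    # 57 days is 1,368 hours; split perfectly across an assumed 3,000 nodes,
    # that is under half an hour.
    print(round(estimated_hours(57, 3000), 3))
    # At an assumed 15 percent efficiency, the same job takes about three hours.
    print(round(estimated_hours(57, 3000, efficiency=0.15), 2))
```

Under these assumptions, the gap between the ideal figure and the "few hours" Chien cites suggests the nodes spend only a fraction of their time on the job, which is expected when the work runs on desktops that are also in everyday use.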