Data Underdogs

By eweek  |  Posted 2001-08-06

While open source databases have scored some recent successes, they won't pose a serious challenge to mature commercial databases until they have added some essential reliability and management features.

Vendors offering open source databases say that they are working hard to close the gap — and they predict that their databases will someday become as prevalent as the Linux operating system (OS) and the Apache Web Server.

"Open source databases today are where Linux was three years ago and where Apache was five years ago," says Michael Evans, database product manager of Red Hat, a Linux distributor that offers a package consisting of the open source PostgreSQL relational database and the Linux OS.

Apache is the dominant Web server on the Internet, running 63 percent of sites in June 2001, according to Netcraft, a U.K. Web server tracker. Linux, meanwhile, is the fastest-growing server OS in terms of market share, according to research firm IDC's estimates, though it still lags Windows by a wide margin. But mirroring the success of Apache and Linux will be no small feat for the three most popular open source databases (InterBase, MySQL and PostgreSQL), which combined represent less than 3 percent of the market, according to even the most optimistic estimates of the suppliers themselves.

Commercial database vendors say that their open source cousins may never catch up. Open source systems "are a great way for our future customers to learn about relational databases," says Bob Shimp, Oracle's senior director of database marketing. But as to whether open source databases will achieve the ubiquity of Apache or Linux, he says: "Databases are dramatically more complicated than any Web server or operating system technology."

Open source vendors counter that commercial databases' heft and complexity are part of the problem.

"Oracle and Microsoft continue to battle over enterprise features. They're adding all that complexity . . . It's a slippery slope," says Britt Johnston, chief technology officer of NuSphere, a distributor of the MySQL open source database system. Most of the users that NuSphere seeks to reach do not need such overly complex features, he says.

Much of the appeal of open source databases is that vendors of InterBase, MySQL and PostgreSQL do not charge as much as proprietary database vendors. Evans says that Red Hat's $2,295 price tag for PostgreSQL would look very attractive to many Oracle customers, who must pay $15,000 per processor for Oracle9i Standard Edition.

But cost aside, the traditional databases' supposedly complex and unnecessary features are among their major selling points. Oracle says that it provides scalable clustering with its Oracle9i database, so that when another computer is added, much of the new node's processing power is available for the work at hand. Open source databases can't manage clusters that way yet, Oracle says.

Furthermore, IBM's DB2, Microsoft SQL Server and Oracle9i are able to perform online backups without interrupting regular operations and can replicate data to other databases across a network, advanced features that today's open source databases lack.

Work in Progress

Open source database vendors concede that they have some work to do on these fronts. The online backup feature of NuSphere's $299 MySQL Advantage "is not quite as transparent as it needs to be," Johnston says. What he means is that a database user sees a pause in the system when it performs a backup operation, a shortcoming that will be eliminated "in the next month or so," he says. Another feature that NuSphere is adding to MySQL is replication, in which data on one system is reproduced throughout a set of distributed systems.

MySQL currently lacks the capability to execute subqueries: the ability to work with a subset of data from a primary set in a query, such as a defined group of people within an "employees" category. On the other hand, MySQL developers now have a much-needed transaction management system: NuSphere last month made its Gemini transaction manager for MySQL available as open source code on mySQL.org, a site that the company recently launched. Complicating matters, though, is NuSphere's blood feud with MySQL AB, a Swedish company that runs a competing open source development site for MySQL code at www.mysql.com.
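The kind of query described above, one that narrows an "employees" set to a defined group chosen by an inner query, can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module rather than MySQL, and the table names, column names and sample rows are all invented for the example.

```python
import sqlite3

# Hypothetical schema and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT, region TEXT)")
conn.execute("CREATE TABLE employees (name TEXT, dept_id INTEGER)")
conn.executemany(
    "INSERT INTO departments VALUES (?, ?, ?)",
    [(1, "sales", "west"), (2, "support", "east"), (3, "engineering", "west")],
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ana", 1), ("Ben", 2), ("Cho", 3), ("Dev", 2)],
)

# The inner SELECT picks the departments of interest; the outer SELECT
# then returns only the employees who belong to that defined group.
rows = conn.execute(
    """
    SELECT name FROM employees
    WHERE dept_id IN (SELECT id FROM departments WHERE region = 'west')
    ORDER BY name
    """
).fetchall()

west_employees = [name for (name,) in rows]
print(west_employees)
```

Without this capability, a developer would typically have to run the inner query first and feed its results into a second query from application code.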

Meanwhile, the open source database PostgreSQL, available from Red Hat and Great Bridge, has a longer track record than MySQL. Originally developed by University of California at Berkeley database researcher Michael Stonebraker, PostgreSQL has a mature transaction management system. It also has a sophisticated data-locking mechanism, called multiversioning, which gives someone read-only access to data even though it may be in use.
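The general idea behind multiversioning can be sketched with a toy in-memory store: writers append new versions instead of overwriting, so a reader holding an older snapshot still sees consistent data while a write is under way. This is a simplified illustration only, not PostgreSQL's actual implementation; the class and method names are hypothetical.

```python
class VersionedStore:
    """Toy multiversion store: every write keeps the earlier versions."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_timestamp, value)
        self._clock = 0      # logical clock, advanced on every write

    def write(self, key, value):
        # Append a new version rather than overwriting the old one.
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self):
        # A reader remembers the clock value at the moment it starts.
        return self._clock

    def read(self, key, snapshot_ts):
        # Return the newest version committed at or before the snapshot.
        for ts, value in reversed(self._versions.get(key, [])):
            if ts <= snapshot_ts:
                return value
        return None


store = VersionedStore()
store.write("balance", 100)
snap = store.snapshot()          # a reader starts here
store.write("balance", 250)      # a later write does not disturb that reader

old_view = store.read("balance", snap)
new_view = store.read("balance", store.snapshot())
print(old_view, new_view)
```

The reader holding `snap` never has to wait on the writer, which is the property the article attributes to multiversioning. A real system would also need to discard versions that no reader can still see.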

But PostgreSQL also has limitations. For instance, one gap in Great Bridge's PostgreSQL 7.1 is its associated language for developing Structured Query Language applications, PL/pgSQL. The language resembles Oracle's PL/SQL (Procedural Language/SQL), except that PL/pgSQL offers the use of functions only, not procedures. A function call always returns some result, while a procedure may execute certain operations without returning a result. In some cases, database developers prefer using procedures and "stored procedures," so Great Bridge is planning to add the capability in a future version of its PostgreSQL, says Mark Cotton, Great Bridge's vice president of consulting services.
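The distinction between a function and a procedure can be illustrated outside any database language. The sketch below uses Python rather than PL/pgSQL or PL/SQL, and every name in it is hypothetical: one routine computes and returns a result, while the other only performs an operation and returns nothing.

```python
audit_log = []  # stands in for a table that a procedure might update


def net_price_cents(list_price_cents, discount_percent):
    """Function: always hands a result back to the caller."""
    return list_price_cents * (100 - discount_percent) // 100


def record_sale(customer, amount_cents):
    """Procedure: carries out an operation and returns no result."""
    audit_log.append((customer, amount_cents))


price = net_price_cents(229500, 10)   # the caller uses the returned value
result = record_sale("acme", price)   # nothing useful comes back
print(price, result, audit_log)
```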

PostgreSQL, like MySQL, also doesn't provide any native replication capability. Great Bridge developers are adding sophisticated replication to PostgreSQL, and it will be available by mid-2002 or sooner, Cotton says.

Besides not having certain advanced features, open source databases are sometimes faulted for their lack of supporting software, particularly the development tools that are closely integrated with the database engines of DB2, Microsoft SQL Server and Oracle9i. Then there is the commercial vendors' traditional put-down of the open source community: that its members are essentially amateurs. "Oracle has thousands of the best database engineers within a five-minute walk of each other," Shimp says, keeping Oracle ahead of "any loose, collaborative group of tinkerers."

But some Web businesses are finding that they can function perfectly well with open source databases. The Wireless Developer Network, a site for wireless communications professionals, started out using Microsoft SQL Server, and GeoCommunity, which is oriented toward geographic information specialists, was launched using Oracle. Both found that they could change to PostgreSQL without impairing their operations.

Says Red Hat's Evans: "Oracle and DB2 are overkill for a lot of database applications and workgroups."

Such examples may not be enough to impel the wide embrace of open source databases. Both Apache and Linux had the support of marquee users, but they were only accepted on a broader basis when Hewlett-Packard, IBM and other established vendors started supporting them and offering them in their product lines.

IBM captured the industry's attention with its boast that it would spend $1 billion to move Linux to its hardware lines. Installing a low-cost OS on its mainframe and other big servers has been a boon to hardware sales, IBM says. But open source observers note that Big Blue doesn't have the same business imperative to back open source databases when its own DB2 is locked in a struggle with Oracle to become the market leader.

Nevertheless, open source database vendors believe that they don't need a big-name supporter to succeed, nor do they need to match commercial systems feature-for-feature to stay in the game, they say.

"We focus on the features that the vast majority of developers need," NuSphere's Johnston says. "We offer lighter-weight processing and a smaller footprint, which provides 80 percent of what most people need, not the esoteric features too complicated to understand."

For now, however, it's uncertain whether that sales pitch will win many converts away from more established database players.
