License to Share
PostgreSQL vs. MySQL: How to Select the Right Open-Source Database
Traditionally, the comparison between PostgreSQL and MySQL gets heated from both sides of the religious camps. They are both technically strong, open-source databases with a large install base. However, in reality, rarely do the strengths of each database overlap to an extent where it is a toss-up of which solution to use.
Once people understand the strengths of each product, the choice is typically fairly clear. In order to properly evaluate the strengths of PostgreSQL and MySQL, let's look at the history and pedigree of each. Let's also look at their feature functionality and performance, as well as the requirements for a successful enterprise database deployment.
A historical perspective
The PostgreSQL community is the oldest, largest and fastest-growing community of its kind. PostgreSQL started in 1985 at the University of California, Berkley as an evolution of the Ingres project. After several years in academia, the PostgreSQL project was released into the open-source world. The community began to flourish to the point where, today, there are more than 1,000 contributors and over 30,000 members.
The key advantage to the large community is the interaction between users and developers, which allows users to become directly involved in the design of new features. This diverse community is the model that many other open-source communities strive for.
The MySQL community, on the other hand, started in the commercial world. MySQL was started in Sweden in 1994 out of a need to have a high-speed database behind Websites. It was released in the open-source world a few years later under the control of MySQL AB. This commercial control helped MySQL become one of the most widely-used databases in the world (which led to Sun acquiring MySQL AB in 2008). The popularity of MySQL also became a major factor in the EU acceptance of the Oracle acquisition of Sun, which finalized earlier this year.
PostgreSQL Features and Functionality
PostgreSQL features and functionality
PostgreSQL is billed as the most advanced open-source database in the market-a reputation it has built through decades of development. As a full-featured, open-source relational DBMS (RDBMS), PostgreSQL boasts many characteristics designed to support high-transaction, mission-critical applications.
The core strength of PostgreSQL is to safely and securely hold the data it manages. This starts with controlling connection access to the database through the use of enterprise authentication mechanisms such as LDAP or Kerberos. Once authenticated, all communication to the database can be over a Secure Sockets Layer (SSL) connection for highly secure environments.
When adding or modifying data, PostgreSQL enforces a number of constraints defined by the user to ensure data quality according to business rules. This ranges from simple check constraints to more complex foreign key checks. Once the data is stored on disk, the ability to back up and, more importantly, restore from a disaster is critical. PostgreSQL has a simple online backup facility that works in conjunction with a strong Point-In-Time Recovery (PITR) mechanism, providing administrators with the flexibility to quickly recover from a loss.
For example, the core architecture of PostgreSQL allows for other community groups to build more advanced features into PostgreSQL through add-on modules. A perfect example of this is PostgreSQL's geospatial support. This functionality comes from a module called PostGIS, a simple extension to PostgreSQL that arguably makes it the strongest spatial, open-source or commercial database.
Another extension to PostgreSQL is the ability to have many different types of stored procedure languages. This allows developers to build server-side code using the best language for their needs. For example, a trigger that needs to perform complex text processing can be written in Perl in order to utilize its strong regular expression functionality.
MySQL Features and Functionality
MySQL features and functionality
MySQL has the reputation for being the most popular open-source database-a reputation resulting from its legacy of performance and simplicity. From the onset, MySQL was designed to be a fast indexed sequential access method (ISAM) data store for Websites. This type of work load-which is characterized as a read-mostly load with many small queries-has led to features such as a query cache that improves MySQL's performance even further. This concentration on performance has inspired features such as MySQL Cluster, which allows the database to scale beyond a single physical server.
PostgreSQL is not the only open-source database that allows external extensions to add to the functionality of the database. One of the greatest strengths of MySQL is its pluggable storage engines. MyISAM, the default storage engine of MySQL, provides the performance for read-mostly environments, and the InnoDB storage engine provides the transaction robustness necessary for write-intensive applications.
Additionally, there are a number of third-party storage engines such as Brighthouse and DB2 that add even more capabilities to MySQL. This flexibility allows administrators to tune a MySQL instance based on the needs of the individual tables. For example, a read-mostly table such as a country code table can use a MyISAM storage engine, while a transactional table such as a sales order table can use InnoDB.
Although PostgreSQL and MySQL have distinct reputations for supporting specific types of applications, both databases are frequently used across a wide spectrum of applications. For example, PostgreSQL is known for its strength behind transactional enterprise applications, but it is also used to support many Web applications. Conversely, MySQL, the traditionally strong database for Web applications, is also used by applications requiring transactional support. The key is that either option has the flexibility to handle a large range of uses, but one may be a better technical or business choice based on individual circumstances.
Community is King
Community is king
The most striking difference between PostgreSQL and MySQL may not be technical at all. Open-source projects are, by definition, all about the community. The community of developers that create and contribute to open-source projects make them come to life. Essentially, there are two different kinds of open-source communities.
First, there are pure open-source database projects that are organized around independent and self-funding communities. PostgreSQL is the oldest and largest independent open-source database community of this kind. The benefit of such a community is true vendor independence.
On the other hand, the second kind of open-source community may have a vendor that controls the project and the project could potentially be "purchased." MySQL is one such example. The MySQL project was initially funded and controlled by MySQL AB, a commercial company that employed all the key developers and architects of MySQL. The MySQL community has more recently been controlled by Sun and now Oracle.
License to Share
License to share
Licensing, or how the source code is allowed to be modified and shared, can have a real impact on choosing an open-source database. The PostgreSQL license is modeled after the BSD license, which allows modifications of the code to be released into the open-source world at the author's discretion.
This open license is ideal for software vendors that want to use PostgreSQL as part of their solution. Since the PostgreSQL license does not force the derived solution to also become open source, the vendor can choose to open-source their code if or when it meets their business model.
MySQL is shared via GNU General Public License (GNU GPL) and controlled by Oracle. The GNU GPL is more liberal in its views of open-sourcing derived works in that it is in place to encourage the free sharing of code. This protects the original authors of the code by forcing new solutions based on the original project to take on the GNU GPL license as well.
While the comparisons of MySQL and PostgreSQL start with the fact that both are open-source relational databases, the similarities generally stop there. Each database has its own distinctive use cases, with only a small amount of overlap. In either case, they are high-quality databases that should be seriously considered over the more expensive proprietary databases-especially in light of today's tough economy.
Jim Mlodgenski is Chief Architect at EnterpriseDB. Jim is one of EnterpriseDB's first employees, having joined the company in May 2005. Over the years, Jim has been responsible for key activities such as sales engineering, professional services, strategic technology solutions delivery and customer education. Prior to joining EnterpriseDB, Jim was a partner and architect at Fusion Technologies, a technology services company. For nearly a decade, Jim developed early designs and concepts for Fusion's consulting projects, and specialized in Oracle application development, Web development and open-source information architectures. He can be reached at firstname.lastname@example.org.