How to Integrate Large-Scale Databases with Perl

Organizations dealing with large data sets should be looking at mature and robust solutions to handle their data. Though so-called NoSQL databases have been gaining popularity with some organizations facing scaling problems on the Web, they're still new and largely untested. Perl can work with NoSQL databases, but if your problems are more traditional, Knowledge Center contributor Jeff Hobbs explains here why Perl paired with a mature relational DBMS is your best solution.


It's all too tempting today to always look to the latest technologies as a way to solve problems, without looking at tried-and-true methods that have been working for years. There's nothing wrong with adopting new tools, of course, but there's a tendency to throw the technological baby out with the bathwater because of the perception that new equals better. Sometimes it does, but when working with mission-critical systems and data, it's quite likely that the mature solutions are going to do better by you.

What I have learned is that Perl is one of the best ways to tackle integration of large-scale databases. Perl, while no longer the hot new thing, is mature and still thriving. Perl 5 has a long history of successfully working with open-source and commercial relational DBMS (RDBMS) such as MySQL, PostgreSQL, Oracle, SQL Server and many others. Thanks to that long history, I have learned a few things you ought to know.

Use the right tools

A common mistake is for developers to reinvent the wheel when working with databases. Don't! Use Perl's standard DBI module and its database driver (DBD) modules to connect to your database. They've been tested hard, probably by companies with even larger data sets and more traffic to and from the database than yours.
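As a minimal sketch of what that looks like in practice, the script below connects through DBI, inserts a few rows with placeholders and reads some of them back. It assumes the DBI and DBD::SQLite modules are installed and uses an in-memory SQLite database only so the example is self-contained; the orders table and its columns are made up for illustration.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is installed. The in-memory database is only
# there to keep this sketch self-contained.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });

# Hypothetical table, for illustration only.
$dbh->do('CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)');

# Placeholders (?) let the driver handle quoting of the values.
my $insert = $dbh->prepare('INSERT INTO orders (customer, total) VALUES (?, ?)');
$insert->execute(@$_) for (['alice', 19.50], ['bob', 42.00], ['alice', 5.25]);

# Fetch every order for one customer, oldest first.
my $select = $dbh->prepare(
    'SELECT id, total FROM orders WHERE customer = ? ORDER BY id');
$select->execute('alice');

my @rows;
while (my ($id, $total) = $select->fetchrow_array) {
    push @rows, [$id, $total];
    print "order $id: $total\n";
}

$dbh->disconnect;
```

With RaiseError set, a failed statement dies with the driver's error message, so the script does not need to check the return value of every call.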

Perl's DBI module provides a standard database interface that defines a consistent set of methods, variables and conventions, while the DBD::* driver modules implement that interface for each database. This means that working with databases is not only well-documented and tested, it also gives great flexibility down the road. You might build your application on MySQL for testing but deploy on PostgreSQL or Oracle. You might need to migrate away from SQL Server at some point. Perl's DBI module lets you avoid lock-in on the application side.
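One way to picture that flexibility: with DBI, the database in use is named in the data source name (DSN) string passed to connect, so switching from MySQL to PostgreSQL can be largely a configuration change. The sketch below only builds the DSN strings. The build_dsn helper, the host names and the database name are hypothetical; the DSN formats follow the forms documented for DBD::mysql and DBD::Pg.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a settings hash into a DBI data source name.
sub build_dsn {
    my (%cfg) = @_;
    my %template = (
        mysql => 'dbi:mysql:database=%s;host=%s;port=%d',
        Pg    => 'dbi:Pg:dbname=%s;host=%s;port=%d',
    );
    my $format = $template{ $cfg{driver} }
        or die "No DSN template for driver '$cfg{driver}'\n";
    return sprintf $format, @cfg{qw(database host port)};
}

# Example settings (assumptions): MySQL for testing, PostgreSQL for production.
my $test_dsn = build_dsn(driver => 'mysql', database => 'app',
                         host => 'localhost', port => 3306);
my $prod_dsn = build_dsn(driver => 'Pg', database => 'app',
                         host => 'db.example.com', port => 5432);

print "$test_dsn\n$prod_dsn\n";

# The application code would stay the same for either DSN, for example:
#   my $dbh = DBI->connect($dsn, $user, $password, { RaiseError => 1 });
```

The DBI calls stay the same across drivers, but SQL dialects still differ between databases, so vendor-specific SQL in your statements needs attention during a migration.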

Another tip for using the right tools is to use package managers, rather than the Comprehensive Perl Archive Network (CPAN) module, to manage your Perl modules. There are package managers available that offer a good way to manage binary modules without having to build from CPAN for updates. If you're using Perl from a Linux distribution, the best bet is to use the packaged Perl modules from the distribution so you're getting testing and updates from the Linux vendor.
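Whichever package manager you use, it helps to confirm which module versions it actually installed. The short script below is one possible check that relies only on core Perl. The module_version helper is hypothetical, and the list of modules is just an example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the installed version of a module,
# 'unknown' if it loads but declares no version, or undef if it
# cannot be loaded at all.
sub module_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return undef unless eval { require $file; 1 };
    return $module->VERSION // 'unknown';
}

# Example module list (an assumption); adjust it to the drivers you use.
for my $module (qw(DBI DBD::Pg DBD::mysql DBD::SQLite)) {
    my $version = module_version($module);
    printf "%-12s %s\n", $module, defined $version ? $version : 'not installed';
}
```

Running this before and after a package update shows whether the update changed the modules your application loads.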