For CIOs such as Priceline.com Inc.’s Ron Rose, few things are as important as having a Web site that is always on. Leveraging database replication and testing tools from Quest Software Inc., Rose and his staff have been able to boost the availability and scalability of the Priceline.com site.
Rose eschewed the clustering environments popular with e-commerce sites for a solution that scales horizontally, built on database replication. This has allowed the travel service site to improve performance and functionality, as well as to maintain close to 99.99 percent availability, since upgrading the Quest solution last year.
“The key design element was to be able to do rolling upgrades of our database infrastructure,” Rose said. “The clustering approaches are very good at creating additional scalability and creating some resilience, but for very high availability you need a replication approach. Our design stretch target was four nines availability.”
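To put the “four nines” target in concrete terms, the short calculation below converts an availability percentage into an allowed-downtime budget. It is an illustrative sketch only: the function name and the 365-day year are assumptions made for the example, not figures supplied by Priceline.com.

```python
def downtime_budget_minutes(availability_percent, period_days=365):
    """Return the minutes of downtime allowed over the period.

    Hypothetical helper for illustration; assumes downtime is measured
    against every minute in the period, maintenance windows included.
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.9, 99.99):
        budget = round(downtime_budget_minutes(target), 1)
        print(f"{target}% availability allows about {budget} minutes of downtime per year")
```

Under those assumptions, 99.99 percent availability leaves a budget of a little under 53 minutes of downtime in a year, scheduled or not.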
To attain the high-speed replication needed to keep up with the site’s heavy workloads, Rose and his staff used Quest Software’s SharePlex for Oracle database replication solution. The SharePlex system is also used for disaster recovery and business continuity, ensuring that database servers at Priceline.com’s two physical sites are load-balanced.
The Priceline.com database infrastructure is built on Sun Microsystems Inc. servers and storage systems running an Oracle Corp. Oracle database. The Quest tools are used to replicate data and monitor performance.
“SharePlex provided us with high-speed replication,” said Rose. “Oracle had forms of replication that were very good, but the Quest tool was more lightweight and could support greater velocities for [database] inserts, based on our estimations.”
Using this replication technology, Rose and his staff were able to set up multiple replicas on secondary servers to share the load with the primary database servers.
Priceline.com has been using SharePlex technology since 2000, said Rose, but SharePlex upgrades developed in mid-2004 have allowed the Norwalk, Conn., company to offer real-time response for its Name Your Own Price feature. Before the upgrade, by contrast, a response took about an hour.
Further, with the SharePlex system, Priceline.com can complete an average of 300 to 400 transactions per second and—to accommodate occasional bursts in traffic—up to 1,000 transactions per second, said Rose.
SharePlex’s use of replicas has allowed Rose to implement secondary and tertiary database servers that can fill in if a primary server goes down.
More important, Rose said, since each replica functions independently within the environment, he and his staff can run tests and apply patches on replica servers without disrupting the core business.
In addition, because patches and software upgrades can be rolled out incrementally from one replica to another, Priceline.com has all but eliminated the need for downtime due to maintenance.
“Many people don’t include their maintenance windows when they talk about their downtime, but we do,” said Rose. “Our database availability [since the SharePlex upgrade was implemented] has been competitive with mainframe databases. We do upgrades regularly, where we take secondaries out of the hunt group, apply patches and bring the servers back online without any downtime.”
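The rolling-upgrade pattern Rose describes, taking one secondary out of the pool at a time, patching it and returning it to service, can be sketched in a few lines. This is a simplified illustration, not Priceline.com’s procedure or any SharePlex interface: the server names, the `apply_patch` callback and the `min_in_service` threshold are all hypothetical.

```python
def rolling_upgrade(pool, apply_patch, min_in_service=1):
    """Patch each server in turn while the others keep serving traffic.

    pool: list of server names (hypothetical).
    apply_patch: callable invoked with a server name while that server
        is out of the pool (stands in for the real patch step).
    min_in_service: servers that must remain in the pool at all times.
    Returns the servers in the order they were patched.
    """
    in_service = set(pool)
    patched = []
    for server in pool:
        # Refuse to proceed if pulling this server would leave too few online.
        if len(in_service) - 1 < min_in_service:
            raise RuntimeError("not enough servers left to keep serving traffic")
        in_service.remove(server)   # take the server out of the pool
        apply_patch(server)         # patch it while it receives no traffic
        in_service.add(server)      # bring it back online
        patched.append(server)
    return patched


if __name__ == "__main__":
    servers = ["secondary-1", "secondary-2", "secondary-3"]
    rolling_upgrade(servers, lambda name: print(f"patching {name}"))
```

Because only one server is ever out of the pool, the remaining replicas continue to carry the load, which is why this style of maintenance does not have to count against availability.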
Testing, monitoring
In addition to replication, Priceline.com relies on database performance testing and monitoring tools to keep its availability numbers up, said Rose. Through testing, Priceline.com can find flaws in SQL code before they lead to outages, and the information unearthed during testing helps developers tidy up code.
Rose said he and his staff have often observed that database problems are self-inflicted, caused by underperforming servers. They decided they needed a way to detect and diagnose issues in a real-time environment before the issues became problems. Testing is a core part of Priceline.com’s “win by not losing” availability philosophy, said Rose.
Rose and his staff again turned to Quest Software, and its Quest Central tool, to proactively test and monitor the health of database servers.
In the company’s stress labs, Priceline.com IT staffers can find potential problems and bottlenecks before new code is sent into production. In addition, when a server runs abnormally slowly, Rose and his staff examine the server and try to detect any changes that might have been made in the days leading up to the slowdown. Using Quest Central, the test lab can find out why databases slow down and can help database developers see firsthand the consequences of writing bad SQL code.
“If we weren’t using these tools,” said Rose, “we would just be paying more in Oracle licenses and CPUs to keep up with our workload since the developer code would be inefficient.”
Senior Analyst Henry Baltazar can be reached at [email protected].