Avokia's Cluster: the Most Far-Flung?

The live, active-active setup links San Francisco, Toronto.

Avokia is claiming to have strung out the most far-flung database cluster ever with its user, Espressocode. The cluster spans the 2,266 miles between Toronto and San Francisco.

Espressocode, a maker of software for the freight and customs industries, is using ApLive technology, which Avokia rolled out at Demo in Phoenix in February, to pull IBM DB2 databases together in this far-flung, active and load-balanced cluster in its multisite environment.

ApLive provides redundancy and backup to mission-critical applications by clustering, replicating and load balancing virtualized databases. The databases can be geographically dispersed. Oracle's RAC (Real Application Clusters) can do similar work, but only on LANs. To cover far-flung locations, RAC needs a helping hand from Oracle's Streams and Data Guard software products, which provide active or passive failover between sites in a WAN.

IBM also offers a product, HADR (High Availability Disaster Recovery), that provides a high level of availability if a second node is located in the same site. HADR offers disaster recovery if the second node is located in a remote site. According to Alan Kriss, Avokia's director of marketing, that doesn't help Espressocode with its scalability needs, since HADR is limited to two nodes, and the backup node is not available for reporting purposes.

"More standard replication products are also available with DB2," said Kriss. "Those would provide offline copies of the production database [that are] useful for reporting but not for high availability or load balancing by [Espressocode's] online or production application, Exdocs."

Alan McMillan, CEO of Toronto-based Avokia, told eWeek that his company works with Espressocode to provide the middleware, which sits at the application layer to virtualize the database layer. McMillan said the middleware provides 24/7 support to Espressocode users.

Regarding the difference between RAC and ApLive, McMillan said that, with RAC, "You'll be down while Data Guard recovers."

That's because ApLive replicates at the SQL statement level, McMillan said. "It's the write statement," he said. "When you're accessing data out of the database, you're grabbing it in the read state. Our technology's smart enough to know it only needs to replicate changes to remote databases. It's 1/1,000th of the size with which typical replication technology works. Because it's so much smaller, it can fly faster through the Internet."
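The idea McMillan describes, forwarding only write statements to remote sites while reads stay local, can be illustrated with a minimal Python sketch. This is not Avokia's implementation: the `StatementReplicator` class, the `is_write` helper and the `shipments` table are hypothetical names invented for illustration, two in-memory SQLite databases stand in for the two DB2 sites, and classifying a statement by its first keyword is a deliberate simplification.

```python
import sqlite3

# Statements whose leading keyword changes data or schema (simplified list).
WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "CREATE", "DROP", "ALTER"}


def is_write(sql):
    """Classify a statement by its first keyword (an assumption, not a parser)."""
    parts = sql.lstrip().split(None, 1)
    return bool(parts) and parts[0].upper() in WRITE_VERBS


class StatementReplicator:
    """Hypothetical middleware: reads hit one site, writes replay on every site."""

    def __init__(self, sites):
        self.sites = sites      # one connection per data center
        self.replicated = 0     # number of write statements sent to all sites

    def execute(self, sql, params=(), local=0):
        if is_write(sql):
            # Only the statement text and its parameters cross the wire,
            # not the resulting log records or data pages.
            for conn in self.sites:
                conn.execute(sql, params)
                conn.commit()
            self.replicated += 1
            return []
        # Reads never leave the local site.
        return self.sites[local].execute(sql, params).fetchall()


# Two in-memory databases stand in for the Toronto and San Francisco sites.
sites = [sqlite3.connect(":memory:") for _ in range(2)]
cluster = StatementReplicator(sites)
cluster.execute("CREATE TABLE shipments (id INTEGER PRIMARY KEY, status TEXT)")
cluster.execute("INSERT INTO shipments (status) VALUES (?)", ("cleared",))

rows_a = cluster.execute("SELECT status FROM shipments", local=0)
rows_b = cluster.execute("SELECT status FROM shipments", local=1)
print(rows_a, rows_b, cluster.replicated)
```

In this sketch both sites answer the same query with the same row, and only the two write statements were ever sent to the remote side. A production system would also have to order concurrent writes and handle a site that is unreachable, which the sketch leaves out.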

This statement-level approach contrasts with other technologies, which replicate the database log file between data centers that are typically located about 30 miles apart, McMillan said.

"Disasters are often greater than 30 miles when we're talking about hurricanes, the power outage in California or terrorist actions," he said. "Now you can have live-live data centers across the country."

Having active instead of passive backup data centers also means that users, in effect, get twice the work out of their data center infrastructure, McMillan said. "Instead of having a second data center on standby, waiting to be used, ours can be used at any time. It's not just insurance gear waiting to be used."

Andreas Antonopoulos, an analyst at Nemertes Research, in Frankfort, Ill., said that virtualizing the database provides the performance benefits of clusters, along with the ability to span large distances and the centralized load balancing of virtualized databases.

Regarding latency concerns, Antonopoulos said any downsides are "more than compensated by the flexibility and recoverability offered by database virtualization solutions."

However, the changeover to virtualized, load-balanced databases creates a need for solid planning in terms of physical distance and network optimization to reduce latency, he said.

"There are significant difficulties in extending or synchronizing databases across great distances," Antonopoulos said. "Distances of more than 50 to 100 [kilometers] are often reported as the upper limit for synchronous replication of storage and data. Greater distances create synchronization and concurrency technology challenges."
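A rough calculation shows why distance matters for synchronous replication, where each write waits for a remote acknowledgment. The Python sketch below estimates the minimum round-trip time over optical fiber for the distances mentioned in this article. It assumes light travels through fiber at about 200,000 kilometers per second (roughly two-thirds of its speed in a vacuum) and that the cable follows the straight-line distance; real routes are longer, and switching equipment adds further delay. The function and constant names are illustrative only.

```python
# Assumption: signal speed in fiber is about two-thirds of light speed in vacuum.
FIBER_KM_PER_S = 200_000
KM_PER_MILE = 1.609344


def round_trip_ms(distance_km):
    """Best-case time for a write to reach the remote site and be acknowledged."""
    return 2 * distance_km / FIBER_KM_PER_S * 1000


distances_km = {
    "50 km": 50,
    "100 km": 100,
    "2,266 miles (Toronto to San Francisco)": 2266 * KM_PER_MILE,
}

for label, km in distances_km.items():
    print(f"{label}: at least {round_trip_ms(km):.2f} ms per round trip")
```

Under these assumptions, a 100-kilometer link costs about 1 millisecond per round trip, while the Toronto-to-San Francisco span costs roughly 36 milliseconds, a delay that a synchronous scheme would add to every committed write.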

"IT executives are struggling to balance high demands for availability, compliance mandates for geographical separation and latency issues," Antonopoulos said. "Companies offering solutions that can replicate or virtualize databases over great distances are in a growing market."

One Avokia competitor in that growing market is Continuent, formerly known as Emic Networks, which started out as a provider of clustering for MySQL databases and Apache Web servers but now handles PostgreSQL, SQL Server, Sybase and Oracle databases.

Continuent offers what it calls a database-neutral solution in either an open-source or a commercial flavor. Like Avokia, Continuent said its solution eliminates single points of failure.