When evaluating disaster recovery (DR) technologies, tactics and processes, organizations often perceive the investment as analogous to a life insurance policy: DR will allow the company to get back on its feet if a disaster strikes its primary data center. The analogy is not entirely wrong. After all, that is one of the functions DR technologies provide.
However, DR can be much more than life insurance. A better analogy, and one that is more likely to lead to judicious DR investment decisions, is health insurance. While DR technologies do help organizations recover data and operations after a calamity strikes, they can also provide the means to maintain the health of a company’s information systems on an ongoing basis.
Virtually all companies maintain some level of IT disaster preparedness. In many cases, particularly in small and medium-size businesses, this consists solely of backing up data to tapes nightly and shipping the tapes off-site. Then, if the worst happens, data and operations can be restored from the tapes at a different location if necessary.
This traditional tape-based approach is responsible for the life insurance metaphor. Because of the time and human resources required to recover data from tape, tape-based recovery is typically used only when there is no alternative. Thus, tape-based backups generally offer little more than what the life insurance metaphor suggests: they normally pay out their benefit only when the worst happens.
Limitations of Tape-Based Recovery
Although the “premium” of tape-based DR is low enough that every company can afford it, this life insurance policy does little to protect the ongoing health of a business. First of all, tape-based backups don’t offer complete protection. Data added or updated since the last backup tape was created will not be recoverable if a disaster destroys both the production database and all on-site journals. If the most recent backup tapes are still on-site when a disaster strikes, they may be destroyed as well, forcing the organization to recover from tapes that may be as much as a week old.
Furthermore, recovering a full data center from tapes can take anywhere from several hours to a few days, which does not meet most organizations’ business continuity objectives. In fact, an outage that long would doom some companies.
Tape-based backups are particularly problematic for organizations that require 24×7 system availability because, in general, applications must be stopped while the data they use is backed up. Even when it is technically possible to “save-while-active,” it is often not practical. That’s because backup jobs put such heavy demands on disk I/O channels and CPU resources that applications may slow to a crawl while the jobs are running.
Possibly even more important, tape-based DR does not protect the continuity of operations during frequent maintenance operations, such as hardware and software upgrades. Thus, traditional tape-based backups don’t even do a particularly good job of fulfilling their life insurance role.
Another failing of tapes is that they can be used to recover data to only a particular point in time, namely the point at which the tape was created. This typically occurs just once every 24 hours and usually at night. If a data item is corrupted in the middle of the day, as when an operator accidentally deletes a file or a computer virus destroys some data, the best that tape can offer is to restore the data to its state as of the previous night. If the data was updated during the day, those updates will be lost and will have to be recreated manually if possible.
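The point-in-time limitation can be made concrete with a little arithmetic. The following is a minimal sketch, not a model of any particular backup product; the function names and the sample timestamps in the usage note are hypothetical and chosen only for illustration.

```python
from datetime import datetime

def restore_point(backup_times, incident_time):
    """Return the most recent backup taken at or before the incident, or None."""
    earlier = [t for t in backup_times if t <= incident_time]
    return max(earlier) if earlier else None

def lost_updates(update_times, backup_times, incident_time):
    """Return the updates made after the restore point and up to the incident.

    These are the updates a tape restore cannot bring back; they would have
    to be recreated manually, if that is possible at all.
    """
    point = restore_point(backup_times, incident_time)
    return [
        u for u in update_times
        if (point is None or u > point) and u <= incident_time
    ]
```

For example, with nightly backups at 23:00 on January 1 and January 2 and a file deleted at 14:00 on January 3, `restore_point` returns the January 2 backup, and `lost_updates` returns every update made on January 3 before the deletion.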
Preventive Medicine
Given enough time, data-destroying calamities will happen, but they don’t have to impair the availability of data or applications in any significant way. Thus, the ultimate in DR arises when companies do not have to recover from a disaster in the traditional sense at all, and that’s more than just wishful thinking.
Advanced DR strategies, tactics and technologies break the life insurance mold and deliver something more akin to health insurance. They do so by providing not only the ability to recover from a disaster, but also the ability to overcome or even avoid many problems that threaten operations much more frequently than disasters.
Modern high availability (HA) software can replicate, in real time, all production applications and data (including system data) to create and maintain hot-standby backup servers. Then, when the primary server is unavailable, regardless of whether that is due to a disaster or simply planned maintenance, users can be switched to the backup server. If a sufficient distance separates the two servers, a disaster that strikes one will not affect the other.
A DR environment that uses redundant servers and replicated data and applications makes it possible to keep operations healthy and running under almost any circumstances. If, for example, a disaster strikes, production hardware or software needs to be upgraded, or a database needs to be reorganized, users can be switched to the backup server with little or no downtime.
This DR approach is something of a departure from classic AIX HA, which typically depends on shared disks. In a shared disk environment, replication and geographic separation are not necessarily inherent components, and that means the solution may not provide protection against disasters.
Guaranteed Benefits
A “health insurance” perspective turns the economics of DR on its head. It is no longer necessary to justify a DR investment based on an unpredictable event, a disaster, that is unlikely to happen in any given year. Instead, the benefits are guaranteed to accrue every time it is necessary to perform maintenance that would otherwise shut down operations.
However, while an HA solution may be the ultimate in DR, it does not provide a way to recover from data corruption or accidental deletion. The problem is that the HA replicator will likely immediately replicate any data corruption or accidental deletion to the backup system. The result is that there will be two copies of the corrupted data and no copies of the data before the corruption or deletion.
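The replication problem described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the `ReplicatedStore` class is hypothetical, the two dictionaries stand in for the primary and backup servers, and real HA replicators work at a much lower level than this.

```python
class ReplicatedStore:
    """Toy model of a primary server mirrored to a hot-standby in real time."""

    def __init__(self):
        self.primary = {}
        self.standby = {}

    def put(self, key, value):
        # Every write is applied to the primary and immediately replicated.
        self.primary[key] = value
        self.standby[key] = value

    def delete(self, key):
        # A deletion is replicated just as faithfully as a legitimate write,
        # whether or not the operator intended it.
        self.primary.pop(key, None)
        self.standby.pop(key, None)
```

After `store.put("orders.db", "good data")` followed by an accidental `store.delete("orders.db")`, the file is missing from both `store.primary` and `store.standby`. The standby protects against losing a server, not against a mistaken or malicious change.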
Continuous Data Protection (CDP), which may be included in an AIX HA solution or implemented as a separate product, provides the missing piece of the puzzle. Rather than maintaining a replica server, it saves individual data updates. Those updates can then be used to restore an individual data item to a particular point in time should the need arise.
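The idea of saving individual updates, rather than only the latest state, can be sketched as a simple timestamped journal. This is an illustrative assumption about how such a journal might look, not a description of any CDP product; the `UpdateJournal` class, its method names, and the numeric timestamps are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple

_DELETED = object()  # sentinel recorded in the journal when an item is deleted

@dataclass
class UpdateJournal:
    """Toy CDP-style journal that keeps every update to every data item."""

    entries: List[Tuple[float, str, Any]] = field(default_factory=list)

    def record(self, timestamp: float, key: str, value: Any) -> None:
        """Save an individual update instead of overwriting prior state."""
        self.entries.append((timestamp, key, value))

    def delete(self, timestamp: float, key: str) -> None:
        """Journal a deletion; earlier values remain recoverable."""
        self.entries.append((timestamp, key, _DELETED))

    def value_at(self, key: str, timestamp: float) -> Optional[Any]:
        """Return the item's value as of the given time, or None if absent."""
        result = None
        # Replay the journal in time order up to the requested point.
        for ts, k, v in sorted(self.entries, key=lambda e: e[0]):
            if k == key and ts <= timestamp:
                result = None if v is _DELETED else v
        return result
```

If an item is recorded as `"draft"` at time 9.0, updated to `"approved"` at 11.0, and accidentally deleted at 14.0, then `value_at(key, 13.0)` still returns `"approved"`, which is the state a nightly tape would have missed and a replica would have discarded.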
DR is too complex a topic to discuss in its entirety here. The point is that it is much easier to justify an investment in full-spectrum business continuity “health insurance” than the costs of basic disaster recovery “life insurance.”
Rich Krause is a Senior Product Marketing Manager at Vision Solutions. Rich has more than 20 years of experience in product marketing, product management and product development in the enterprise software marketplace. Rich is a frequent speaker and writer on topics related to data protection and business continuity. Prior to joining Vision Solutions, Rich was director of technical services for Clear Communications. He can be reached at Richard.Krause@visionsolutions.com.