Enterprises have used tape as a backup medium for as long as they have been using computers. Apart from storing hard-copy source documents, tape backup was initially the only data protection mechanism available. Many organizations still rely exclusively on paper and tape backups today, but that needs to change.
First, paper documents are no longer a recovery option because many transactions are fully electronic. Second, as 24×7 operations become more prevalent, and as regulators and other stakeholders require more stringent safeguarding of critical data, relying solely on tape is no longer sufficient. Finally, paper and tape provide inadequate recovery options in an age when systems connect to the world and are highly vulnerable to attack. Today, seamless data protection and instant recovery are absolute necessities.
Despite great increases in tape speeds, tape drives are still slow compared to disks. Recovering data and applications after a disaster can take several hours or even days, particularly if the tapes must be retrieved from an offsite vault. In addition, although tape recovery is much more automated than in the past, it still requires some manual processing that can result in errors and delays. Moreover, long recovery times mean that tape backups cannot serve to keep operations running during brief maintenance shutdowns.
More limitations of tape backup
Another problem with tape backups is that they allow the recovery of data to only one point in time, usually sometime during the previous night. Updates applied later are lost if a disaster destroys the data center. Worse, if last night’s tape has not yet been shipped offsite, it may be destroyed as well, forcing the company to revert to an even older backup version. The same is true if the most recent tape is unreadable. In addition, if a virus attacked the systems at, say, 4:03 p.m., one might want to recover data to its state at 4:02 p.m. But since tapes allow restoration of data only to the point in time when the backup was created, they do not provide that option.
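The size of the gap is easy to work out. The sketch below is illustrative only: the 1:00 a.m. nightly schedule, the dates and the function name are assumptions made for the example, not details of any particular backup product.

```python
from datetime import datetime


def latest_restorable_backup(backup_times, failure_time):
    """Return the most recent backup taken at or before the failure, or None."""
    candidates = [t for t in backup_times if t <= failure_time]
    return max(candidates) if candidates else None


# Assumed schedule: one tape backup every night at 1:00 a.m.
nightly_backups = [
    datetime(2024, 3, 10, 1, 0),
    datetime(2024, 3, 11, 1, 0),
    datetime(2024, 3, 12, 1, 0),
]
virus_time = datetime(2024, 3, 12, 16, 3)  # the 4:03 p.m. attack

# Best case: last night's tape is available and readable.
restore_point = latest_restorable_backup(nightly_backups, virus_time)
lost_window = virus_time - restore_point

# Worse case: last night's tape is destroyed or unreadable.
fallback_point = latest_restorable_backup(nightly_backups[:-1], virus_time)
fallback_window = virus_time - fallback_point

print("Updates lost with last night's tape:", lost_window)
print("Updates lost with the older tape:   ", fallback_window)
```

Under these assumptions, even the best case discards more than 15 hours of updates, and losing a single tape adds a full day to that figure.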
At one time, operators had to stop production applications while running a backup job. Today, backup-while-active features theoretically allow online applications to function during backup operations, but that is often not practical. Transactions caught in mid-flight during a backup operation may corrupt the data on the tape. For example, one half of a transaction may be saved while the balancing debit or credit is omitted. Furthermore, because backup jobs grab every byte of data on the disks as quickly as possible, online applications may slow almost to a stop when backups are running. This wasn’t an issue when backups could run during “off hours.”
Today, when e-commerce and global supply chains require around-the-clock availability, running tape backup jobs on a production server is unacceptable at any time.
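The mid-flight problem described above can be shown with a deliberately simplified sketch. The account names, the amounts and the point at which the backup copies the data are all assumptions chosen to make the effect visible. A real backup job interleaves with many transactions at once.

```python
def transfer_with_backup_midway(accounts, source, target, amount):
    """Move money between accounts while a backup copies the data halfway through.

    Returns the copy the backup job would have written to tape.
    """
    accounts[source] -= amount        # debit is applied
    tape_copy = dict(accounts)        # backup job reads the data at this moment
    accounts[target] += amount        # balancing credit is applied after the copy
    return tape_copy


accounts = {"checking": 500, "savings": 500}
tape_copy = transfer_with_backup_midway(accounts, "checking", "savings", 200)

print("Live total:", sum(accounts.values()))
print("Tape total:", sum(tape_copy.values()))
```

The live data still balances, but the copy on tape holds the debit without the credit, so a restore from that tape would bring back books that do not add up.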
Benefits of high availability technology
High availability (HA) technology, which replicates data and objects to a secondary server (replica server) in real time, avoids the pitfalls of tape-based backups. It has the following four benefits:
Benefit #1: All data is continuously backed up. Consequently, there is no “orphan data” (that is, updates applied since the last backup job and therefore not yet backed up).
Benefit #2: HA software replicates data and objects in the background with no need for operator intervention.
Benefit #3: Some HA products can detect when the primary server is unavailable and automatically fail over to the backup server. They can also automate a manually initiated switchover to accommodate maintenance.
Benefit #4: A geographically remote replica server can serve as a disaster recovery solution, although the phrase “disaster recovery” is misleading as there is no need to recover anything. Instead, users are simply switched to the remote replica server.
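The automatic failover in Benefit #3 depends on some way of deciding that the primary server is down. One common approach, assumed here purely for illustration, is a heartbeat with a timeout. The function name and the 10-second threshold are hypothetical, and commercial HA products typically apply more elaborate checks before switching users.

```python
def choose_active_server(last_heartbeat, now, timeout_seconds=10.0):
    """Return 'primary' while heartbeats are fresh, otherwise 'replica'.

    Both times are in seconds on the same clock. The timeout is an
    assumed value, not a recommendation.
    """
    if now - last_heartbeat <= timeout_seconds:
        return "primary"
    return "replica"


# Heartbeat received 5 seconds ago: stay on the primary server.
print(choose_active_server(last_heartbeat=100.0, now=105.0))

# No heartbeat for 11 seconds: switch users to the replica server.
print(choose_active_server(last_heartbeat=100.0, now=111.0))
```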
In organizations that use HA technology but choose to continue to produce backup tapes as a failsafe option, the backup jobs can be run on the replica server, eliminating any impact on production operations.
Limitations of high availability technology
HA is often perceived as the ultimate in data protection, particularly when the primary and backup servers are separated geographically. But there is a gap in this protection. The types of threats that HA and tape backup solutions address are not the only data risks companies face today.
For example, if someone accidentally deletes a file, the HA replicator will immediately repeat the deletion on the backup server. Likewise, if a virus or a malicious individual alters data in a way that misrepresents the truth (while still adhering to all the rules enforced for that field by the system), the replicator will blindly replicate the destructive update to the backup server. In these and similar cases, there should be a way to rapidly revert to the prior state of the data. HA software alone does not offer this option.
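A toy model makes the gap concrete. The class below is a minimal sketch of a replicator that copies every change to a replica as it happens. The class and file names are invented for the example, and real HA software replicates over a network, not within one process.

```python
class ReplicatedStore:
    """Toy primary/replica pair that mirrors every change immediately."""

    def __init__(self):
        self.primary = {}
        self.replica = {}

    def write(self, name, contents):
        self.primary[name] = contents
        self.replica[name] = contents   # the update is replicated in real time

    def delete(self, name):
        self.primary.pop(name, None)
        self.replica.pop(name, None)    # the deletion is replicated just as faithfully


store = ReplicatedStore()
store.write("invoices.dat", "Q1 invoices")
store.delete("invoices.dat")            # accidental deletion on the primary

print("invoices.dat" in store.replica)  # the replica has lost the file too
```

Because the replicator treats a deletion like any other change, the replica offers no earlier version to fall back on.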
Tape-based backups are not a solution to this problem. Although they make it possible to revert to the state of the data at the time of the last backup job, that moment might be well before the desired recovery point. The database’s journaling feature might allow rolling the data forward to a point immediately before it was corrupted, but that is typically a time-consuming, labor-intensive and error-prone process.
Continuous data protection offers another option
The solution is Continuous Data Protection (CDP), which is available in both standalone products and as a feature in some HA products. While various CDP technologies differ in their specific features and options, viewed from a high level, CDP comes in two flavors: True CDP and Near CDP. True CDP captures all data writes, transfers them to a secondary disk and stores each update independently. Using True CDP, one can “undo” data updates and additions by recovering the data to any point in time.
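The idea behind True CDP can be sketched as a journal that stores every write independently and replays it up to a chosen moment. This is a simplified illustration, not the design of any specific product: the class name is hypothetical, timestamps are plain numbers (minutes since midnight in the example), and writes are assumed to arrive in time order.

```python
class TrueCdpJournal:
    """Keeps every write as a separate entry so any point in time can be rebuilt."""

    def __init__(self):
        # (timestamp, key, value) in arrival order; value None records a deletion.
        self._entries = []

    def record(self, timestamp, key, value):
        self._entries.append((timestamp, key, value))

    def state_at(self, timestamp):
        """Replay all writes made at or before the given timestamp."""
        state = {}
        for written_at, key, value in self._entries:
            if written_at > timestamp:
                break  # entries are assumed to be in time order
            if value is None:
                state.pop(key, None)
            else:
                state[key] = value
        return state


journal = TrueCdpJournal()
journal.record(15 * 60, "balance", 100)       # 3:00 p.m. legitimate update
journal.record(16 * 60 + 2, "balance", 120)   # 4:02 p.m. legitimate update
journal.record(16 * 60 + 3, "balance", -999)  # 4:03 p.m. destructive update

print(journal.state_at(16 * 60 + 2))  # state just before the damage
```

Recovering to 4:02 p.m. simply means replaying the journal and stopping before the 4:03 p.m. entry, which is the "undo" capability described above.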
Near CDP differs from True CDP in that one can recover only to specific points in time. For example, Near CDP may copy data to the secondary disk only when a file is saved or closed. Two benefits of this approach are that it reduces the amount of data transmitted and every available recovery point is “clean” (that is, no transactions will be caught in mid-flight). The drawback is that, in some cases, this can result in recovery points being spaced at intervals of several hours or more. This may not be adequate in many environments.
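By contrast, a Near CDP scheme can be sketched as a series of snapshots taken only when a file is closed. The class name, the timestamps (minutes since midnight) and the file contents below are assumptions for the example. Recovery returns the most recent snapshot at or before the requested time, not the exact moment asked for.

```python
import bisect


class NearCdpStore:
    """Keeps one snapshot per file-close event, ordered by time."""

    def __init__(self):
        self._times = []       # close times, assumed to arrive in ascending order
        self._snapshots = []   # file contents captured at each close

    def on_file_closed(self, timestamp, contents):
        self._times.append(timestamp)
        self._snapshots.append(contents)

    def recover(self, timestamp):
        """Return (snapshot_time, contents) for the latest snapshot at or before timestamp."""
        index = bisect.bisect_right(self._times, timestamp)
        if index == 0:
            return None  # no snapshot existed yet
        return self._times[index - 1], self._snapshots[index - 1]


store = NearCdpStore()
store.on_file_closed(9 * 60, "ledger as of 9:00 a.m.")
store.on_file_closed(13 * 60, "ledger as of 1:00 p.m.")

wanted = 16 * 60 + 2                      # we would like the 4:02 p.m. state
snapshot_time, contents = store.recover(wanted)
print(contents, "| minutes of updates lost:", wanted - snapshot_time)
```

Each recovery point is clean because it was captured at a file close, but in this example the nearest one is about three hours older than the state actually wanted.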
It’s best to work together
The optimum choice of tape backup, HA or CDP options is often “all of the above.” Tape provides a reliable off-site backup should all else fail. Combining tape with HA and CDP goes further and addresses the data corruption and availability issues described above. HA ensures that operations will not be interrupted by disasters or maintenance activities, without the need to perform cumbersome, time-consuming recovery operations after a disaster. Finally, CDP enables the quick restoration of data to a point in time prior to its corruption. Thus, the combination of tape-based backups, HA and CDP is the best way to provide seamless protection against data loss, data corruption and system downtime.