A backup and recovery strategy should address both data growth and security issues. It should provide a similar level of service for all platforms, enabling a single database administrator to control both backup and recovery operations and the backup files themselves. Finally, the backup and recovery strategy should maintain data security and minimize storage needs, backup times and recovery times.
Here are seven steps to consider as you build a sound database backup and recovery strategy:
1. Never back up databases to local disk
We back up SQL Servers so we can restore data in the event of a server crash. If the SQL Server crashes, especially if there is a hardware or severe operating system failure, you are going to need to be able to restore as fast as possible and the local drives may not be available. It’s much safer to back up to a network share, which will allow you to begin restoring your backups to another server immediately and get back online faster.
2. After you back up the databases to a file share, back up the share to tape
Tape drives these days are fast enough that vendors like to say DBAs should go straight to tape. Technically, they’re right: tape backup and restore speed is not a bottleneck. However, a limited number of drives are available and, when the DBA needs a restore right away, the tape drives aren’t always just sitting idle. Disk backups, on the other hand, are always available.
3. Justify the cost of the network share by lower licensing costs and simpler backups
The storage area network (SAN) and backup administrators will need to have cost justification for a new dedicated array just for SQL backups. Here it is: the dedicated array will pay for itself by eliminating the need for backup agents on each SQL Server. It will allow them to have just one backup policy: back up everything on that network share once a day and get it off-site ASAP. They won’t have to worry about peak loads or which servers are on which schedules; they’ll only need to back up that one network share.
Do Regular Fire Drill Rebuilds and Restores
4. Do regular fire drill rebuilds and restores
Do a restore from tape at least once a quarter. Treat it like a real restore need, as though a developer lost something and needs it back from tape. Then, restore it.
5. Keep management informed on restore time estimates
Every quarter, I look at the backup sizes and speeds and estimate how long it will take to restore a server from tape. I put that information into a written report to management that includes, for each server, a cost estimate for decreasing that outage window. Sometimes, the answer is third-party backup compression software; other times it may be more CPU power for the server or adding more memory. The primary goal is to give management enough information to make a good business decision about whether the method for reducing the outage window is worth the money. The secondary goal is to have documentation so, when a restore takes too long, you can show why it may be time to revisit a previous business decision.
6. Trust no one
Once a month, bring an external FireWire or USB into the office, hook it up to the server hosting the network share, and copy the entire backup share to that drive. This will take some time but that’s okay. In today’s security-conscious world, it’s a good idea to then put it in a bank safe deposit box. Nobody should ever need to do this, but sometimes even the best backup administrators make mistakes.
7. Have solid backup and disaster recovery plans in place before you move to the cloud
It can be dangerous to put all your data in someone else’s hands without having your own backup and recovery plan in place. It’s possible for everything to be temporarily lost from a hosted service. If you don’t have your own backup plan to protect your organization’s data, a cloud service’s problems with data loss can result not just in annoying inconvenience for users, but it can have serious consequences for the business when business-critical information disappears. Don’t just assume your data is safe in the cloud-back it up yourself.
Database backup and recovery is becoming more difficult and complicated, especially with today’s explosion of data growth and the continually increasing need for data security. These steps do not represent an exhaustive list, but they are a good start to forming a sound database backup and recovery strategy. Most importantly, they allow for a single DBA to perform a complete backup and recovery operation, minimizing the chance of error that can occur when multiple people are involved in the process.
Brent Ozar is Quest Software’s SQL Server domain expert and a Microsoft SQL Server MVP. Brent has a decade of broad IT experience that includes systems administration, project management and database administration. In his current role, Brent specializes in database performance tuning, SANs and data warehousing. Previously, Brent spent six years at UniFocus, a hospitality metrics company. Brent conducts training sessions, has written several technical articles, and blogs at www.brentozar.com. He is a regular speaker at PASS events, editor-in-chief of SQLServerPedia.com, and co-author of the book, “Professional SQL Server 2008 Internals and Troubleshooting.” He can be reached at brent.ozar@quest.com.