How to Improve Microsoft Exchange Data Protection
E-mail systems in general, and Microsoft Exchange in particular, have moved well beyond being an important application in today's enterprise. Messaging is now a mission-critical application for most enterprises, as important as a dial tone when it comes to communication. As a result, Microsoft Exchange is storing much more than the text of messages; it stores files and multiple versions of files, as well as calendar and contact information. In many cases, the last known good copy of a file is stored within the Microsoft Exchange environment.
Today's data protection and management tools for Exchange have not kept pace. As a result, many point solutions, each with its own data store and management interface, are entering the market to fill the gap left by traditional solutions. The challenge is that, while many of these solutions add value and make life easier for the Exchange or backup administrator, they add complexity to the already complex problem of data protection, because they introduce separate subprocesses that require special care and feeding to make sure they are working. These stand-alone utilities must then be integrated back into the backup application, which is yet another step to be monitored for failure.
The new requirements for Exchange involve not only improving the backup process but also improving the recovery and retention capabilities of the Exchange environment. For a solution to meet these requirements effectively, there must be a much higher level of integration than in the past, so as to minimize the administrator's workload.
Most e-mail systems in a large corporation receive thousands, if not hundreds of thousands, of e-mails throughout the day, every day. Yet most Exchange data protection efforts are a once-per-day occurrence. This simply is not granular enough; there must be multiple protection instances throughout the day.
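To make the granularity argument concrete, the short sketch below estimates the worst-case number of messages that exist only on primary storage between protection points. The figure of 100,000 messages per day and the assumption that mail arrives at a uniform rate are illustrative simplifications, not measurements from any real environment.

```python
def messages_at_risk(messages_per_day: int, backups_per_day: int) -> float:
    """Worst-case count of messages received since the last backup.

    Assumes mail arrives at a uniform rate across the day, which is a
    simplification; real traffic peaks during business hours.
    """
    if backups_per_day < 1:
        raise ValueError("backups_per_day must be at least 1")
    return messages_per_day / backups_per_day


# Hypothetical daily volume, chosen only for illustration.
DAILY_VOLUME = 100_000

for label, per_day in [("once per day", 1), ("every 6 hours", 4), ("hourly", 24)]:
    at_risk = messages_at_risk(DAILY_VOLUME, per_day)
    print(f"{label}: up to {at_risk:,.0f} messages unprotected")
```

Under these assumptions, moving from a single nightly backup to hourly protection cuts the worst-case exposure by a factor of 24.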
SAN-based snapshots with Exchange integration can quiesce the Exchange store to get a clean snapshot, but even those are often not taken frequently enough; most often, they are taken a few times per day. SAN-based snapshots also depend on the primary storage system being up and running. While they can protect against a data corruption issue, they cannot protect against a storage system failure.
New solutions from companies such as InMage or Syncsort that can provide continuous or near-continuous (once-per-hour) protection of the Exchange environment are answering the call for improved Exchange availability. These solutions perform block-level backups of the Exchange environment, so the impact on the server is minimal and the amount of data transferred across the network to the backup destination is small. The block-level backups are sent across the network to a separate storage device, so they are immune to a failure in the primary storage array. Syncsort has the added benefit that this process is integrated into the rest of the backup process. Typically, an Exchange image can be updated within 5 minutes of the backup job being completed.
Once this data arrives at the backup target, it is snapshotted, which provides versioning. Since only the blocks that changed between backups need to be stored, weeks' or even months' worth of Exchange backups can be kept on disk. Integration with tape or some other secondary storage device is also critical when dealing with Exchange data: long-term retention of e-mail is going to be increasingly important, so the ability to move these backups to a less expensive, long-term storage medium is essential.
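The following is a minimal sketch of why changed-block storage with versioning uses so little disk. It is a toy model, not how InMage, Syncsort, or any other product is implemented: the `BackupTarget` class, the 4-byte block size, and the position-by-position comparison are all assumptions made for illustration (real products typically track changed blocks at the volume or driver level).

```python
BLOCK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class BackupTarget:
    """Hypothetical backup target that keeps one delta per backup version."""

    def __init__(self):
        # Each version is (block_count, {block_index: block_bytes}).
        self.versions = []

    def _blocks(self, version: int) -> list:
        """Rebuild the full block list for a version by replaying deltas."""
        blocks = []
        for count, changed in self.versions[:version + 1]:
            # Shrink or grow to this version's size, then apply its changes.
            blocks = blocks[:count] + [b""] * (count - len(blocks))
            for index, block in changed.items():
                blocks[index] = block
        return blocks

    def backup(self, data: bytes) -> int:
        """Record a new version; return how many blocks had to be stored."""
        blocks = split_blocks(data)
        previous = self._blocks(len(self.versions) - 1) if self.versions else []
        changed = {
            i: block for i, block in enumerate(blocks)
            if i >= len(previous) or previous[i] != block
        }
        self.versions.append((len(blocks), changed))
        return len(changed)

    def restore(self, version: int) -> bytes:
        """Return the complete data set as it looked at a given version."""
        return b"".join(self._blocks(version))


target = BackupTarget()
first = target.backup(b"AAAABBBBCCCC")   # initial full copy: every block stored
second = target.backup(b"AAAAXXXXCCCC")  # only the middle block differs
print(first, second)
```

Each version remains fully restorable even though the second backup stored a single block, which is the property that lets many versions fit on disk.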
The requirements for recovery of Exchange are similar to those for backup: speed and granularity. Speed of recovery matters most after a storage system failure or a corruption of the data set. And, no matter the size of your Exchange environment, recovery can be a time-consuming process. Despite all the data reduction technologies available today, in a recovery, all the data has to be sent back across the network, so the network is a bottleneck. All of that data then has to be written to disk, and RAID parity has to be recalculated, so disk writes are a bottleneck as well.
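A rough back-of-the-envelope model shows how these two bottlenecks bound a full restore. The sketch below assumes the restore streams at the slower of the network rate and the disk write rate and ignores all other overhead; the function name and the throughput figures are hypothetical and chosen only for illustration.

```python
def restore_hours(data_gb: float, network_mb_s: float, disk_write_mb_s: float) -> float:
    """Estimate full-restore time in hours.

    Assumes the restore runs at the slower of the two rates (network
    transfer or disk write) and ignores protocol and RAID overhead
    beyond what is already reflected in disk_write_mb_s.
    """
    if min(data_gb, network_mb_s, disk_write_mb_s) <= 0:
        raise ValueError("all inputs must be positive")
    effective_mb_s = min(network_mb_s, disk_write_mb_s)
    seconds = data_gb * 1024 / effective_mb_s
    return seconds / 3600


# Hypothetical environment: 1 TB of Exchange data, a network path that
# sustains about 100 MB/s, and a RAID set that sustains about 80 MB/s of writes.
hours = restore_hours(data_gb=1024, network_mb_s=100, disk_write_mb_s=80)
print(f"Estimated restore time: {hours:.1f} hours")
```

Even with generous throughput figures, a restore of this kind runs for hours, which is the motivation for avoiding the data movement altogether.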
The best solution is not to have to move data at all. Ideally, you would mount it directly from the backup target. Because a block-level, incremental backup system stores data on the disk backup target in its native format, it can also provide the ability to mount that data directly. These systems can create a read/write snapshot of the backup image and serve that up via iSCSI to the Exchange Server. The Exchange Server can be brought back online and can connect to this active data set in mere minutes, while the older data is restored back across the network.
The second, more common recovery request is to have an individual contact, mail message, attachment or calendar entry restored to the Exchange store. In the past, this required special, time-consuming backups, often called brick-level backups. These were never widely deployed because the backups themselves ran slowly and were slow to fulfill a recovery request. Often, after some initial testing, the module was never used or only used for a few mailboxes. Since most customers did not use brick-level backups, the real-world solution was a painful process of recovering to a standby Exchange Server and then manually pulling out the mailboxes required by the user.
The Active Target capability of block-level incremental backups provides a real-time view of the backup images from outside the Exchange environment. This allows a stand-alone utility to search the different backup versions of the environment for a specific message or attachment and, once it is found, restore it instantly into the active Exchange environment. Users see no performance impact; the recovered message simply reappears in their in-box. This capability can be used to restore entire mailboxes, individual messages, individual contacts or calendar entries.
Data Retention and Search
Ninety-five percent of legal discovery requests now involve e-mail. By combining the capabilities discussed above, an effective discovery response system can be devised. First, storage is used efficiently because only changed blocks are stored on the backup disk. Disk space is consumed more slowly, allowing e-mail backups to be kept on disk longer. If the solution allows different retention policies to be set for different data types, backups of other data can be sent to tape much sooner, reserving disk capacity for the longer retention requirements of the Exchange data.
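A per-data-type retention policy of this kind can be sketched in a few lines. The data type names and the retention periods below are hypothetical, picked only to show Exchange backups staying on disk far longer than other data; they are not recommendations or the defaults of any product.

```python
from datetime import date

# Hypothetical policy: days a backup stays on disk before moving to tape.
DISK_RETENTION_DAYS = {
    "exchange": 180,     # kept on disk longer to serve discovery requests
    "file_server": 14,   # moved to tape sooner to free disk capacity
}


def should_move_to_tape(data_type: str, backup_date: date, today: date) -> bool:
    """Return True once a backup has outlived its on-disk retention period."""
    age_days = (today - backup_date).days
    return age_days > DISK_RETENTION_DAYS[data_type]


today = date(2024, 2, 1)
for data_type in DISK_RETENTION_DAYS:
    move = should_move_to_tape(data_type, date(2024, 1, 1), today)
    print(f"{data_type}: move 31-day-old backup to tape? {move}")
```

Under this example policy, a 31-day-old file server backup is due for tape while the Exchange backup of the same age stays on disk and remains directly searchable.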
Second, with an interface into this backup data as if it were a live Exchange environment, queries can be built based on keywords, users and date ranges. The matching data can then be exported out of the environment as a PST file to be sent to the opposing legal counsel for review. The whole process would take minutes, as opposed to the hours involved in standing up a second Exchange Server, doing the restores to it and then trying to search for the data. This solution meets many of the retention and search requirements that e-mail archive solutions address, without the expense of purchasing a stand-alone system.
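The shape of such a discovery query can be illustrated with a small in-memory filter. This is only a sketch: the `Message` record and the `discovery_search` function are hypothetical stand-ins, the sample messages are invented, and the step of reading messages from a backup image or writing a PST file is deliberately left out because it depends on the product in use.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Message:
    """Hypothetical, simplified view of a message found in a backup image."""
    sender: str
    sent: date
    subject: str
    body: str


def discovery_search(messages, keywords, users, start, end):
    """Return messages from the given users, within the date range
    (inclusive), whose subject or body contains any keyword."""
    wanted = [k.lower() for k in keywords]
    allowed = {u.lower() for u in users}
    hits = []
    for message in messages:
        if message.sender.lower() not in allowed:
            continue
        if not (start <= message.sent <= end):
            continue
        text = f"{message.subject} {message.body}".lower()
        if any(keyword in text for keyword in wanted):
            hits.append(message)
    return hits


# Invented sample data for illustration only.
mailbox = [
    Message("alice@example.com", date(2024, 3, 1), "Contract draft", "See attached."),
    Message("bob@example.com", date(2024, 3, 2), "Lunch", "Noon works."),
    Message("alice@example.com", date(2023, 1, 5), "Old contract", "Superseded."),
]

results = discovery_search(
    mailbox,
    keywords=["contract"],
    users=["alice@example.com"],
    start=date(2024, 1, 1),
    end=date(2024, 12, 31),
)
for message in results:
    print(message.sent, message.subject)
```

Only the first sample message satisfies all three criteria; the others fail on sender and keyword or on date range.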
Today, Microsoft Exchange has become a mission-critical operation. When something goes wrong, it must be recovered as quickly as possible and with as little data loss as possible. These new solutions deliver those capabilities, plus the added benefit of meeting most legal retention and search requirements.
George Crump is the founder of Storage Switzerland, an analyst firm focused on the virtualization and storage marketplaces. An industry veteran of over 25 years, he has held engineering and sales positions at various IT industry manufacturers and integrators. Prior to founding Storage Switzerland, George was CTO at one of the nation's largest integrators. He can be reached at firstname.lastname@example.org.