Wednesday, June 5, 2013

Thinking Out of the Box: Exchange 2013 and backup

What else would you want to do on a sunny Wednesday afternoon than write an article about Exchange Server 2013 and backup ;). No really, it has been quite a while since I posted a useful article about Exchange, so I thought: why not write something about backup?

Over the last few weeks I received a lot of questions from colleagues and customers about backup and disaster recovery in the new Exchange Server 2013. These questions showed that many organizations still have a rather dated understanding of backup and recovery. Customers still want item level backup even though their data volume keeps growing.

So I thought, this is a good opportunity to write an article about backup and disaster recovery (DR) with Exchange Server 2013 (Exchange).

Introduction

First of all, you can divide backup into two main concerns:

  1. You'll probably need backup to perform a point-in-time restore of a single item or a complete mailbox.
  2. In any enterprise production environment you'll need a solution that lets you recover your data in case of an emergency.

In the old days the answer to the first concern in Exchange was to buy and implement a backup solution that provided single item backup and recovery. This feature enabled the IT organization within a company to restore one or more items back into a user's mailbox in case the user accidentally deleted them.

The demand for this solution was high, so everybody implemented it, and it performed well within the requirements. A few years ago, however, the volume of mail data began to grow and backup windows began to shrink because of hypes like "The new way to work" and/or "Work/Life integration". These hypes created more flexible working hours and therefore shorter backup windows. Users also kept their e-mail in their mailboxes until the end of time.

These developments made it challenging for IT organizations to back up mail data within the time available.

As the years went by, Microsoft optimized its database structure and implemented new features in Exchange to cope with these problems. This resulted in even bigger mailbox databases, but the mindset of organizations concerning the backup of mail data did not change. Even today customers want single item backup in their Exchange environment. And when you ask them how many times they used this functionality in the past year, they can't give you a real answer.

The second concern is how to cope with an outage or emergency and get your data back (Disaster Recovery or Emergency Recovery). To describe this concern I'll give you a short explanation of DR.

DR can best be divided into two objectives:

  1. RPO (Recovery Point Objective) and
  2. RTO (Recovery Time Objective).
 
RPO
RPO is the maximum tolerable period in which data might be lost from an IT service due to a major incident. In other words: how much data (measured in time) is an acceptable loss in case of an emergency?

RTO
RTO is the duration of time within which a business process must be restored after a disaster (or disruption) in order to avoid unacceptable consequences associated with a break in business continuity. In other words: how quickly do the services need to be restored in case of an emergency?
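
To make the two objectives a bit more concrete, here is a minimal Python sketch that checks a single incident against an RPO and an RTO. The function name, the timestamps and the 15 minute / 4 hour targets are all made up for illustration; none of them are Exchange settings.

```python
from datetime import datetime, timedelta


def meets_objectives(last_recoverable, failure, restored, rpo, rto):
    """Return (rpo_met, rto_met) for one incident.

    last_recoverable: newest point in time whose data survived the incident
    failure:          moment the service went down
    restored:         moment the service was usable again
    """
    data_loss = failure - last_recoverable   # compared against the RPO
    downtime = restored - failure            # compared against the RTO
    return data_loss <= rpo, downtime <= rto


# Hypothetical incident: the last recoverable data is 10 minutes older than
# the failure, and the service is back after 5 hours.
failure = datetime(2013, 6, 5, 14, 0)
result = meets_objectives(
    last_recoverable=failure - timedelta(minutes=10),
    failure=failure,
    restored=failure + timedelta(hours=5),
    rpo=timedelta(minutes=15),
    rto=timedelta(hours=4),
)
print(result)  # (True, False): the data loss is acceptable, the downtime is not
```

The point of the sketch is that the two objectives are measured independently: a design can protect the data well enough and still take too long to bring the service back.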

So how does this all relate to Exchange Server 2013? I will try to explain this in the following paragraphs.

Backing up Exchange Server 2013

Third party backup solutions

At the time of writing, support for backing up Exchange Server 2013 among third party backup solutions is marginal. The following table gives an overview of the most common ("enterprise ready") backup solutions and their support for Exchange Server 2013.

Note: According to Microsoft, all backup solutions need to use the Volume Shadow Copy Service (VSS) in order to create a successful and consistent backup. More information about these requirements is available from Microsoft.




 

| #  | Solution                   | Supported?                                              | Level    |
|----|----------------------------|---------------------------------------------------------|----------|
| 1. | Symantec NetBackup         | Supported from version 7.5.0.6                          | Database |
| 2. | Symantec Backup Exec       | Supported from version 2012 Service Pack 2              | N/A      |
| 3. | NetApp SnapManager         | Supported in version 7 and higher                       | Database |
| 4. | CommVault                  | Supported in version 9 and higher                       | Database |
| 5. | Veeam                      | No support; planned for version 7, release date unknown | N/A      |
| 6. | HP Data Protector          | Supported from version 8                                | Database |
| 7. | EMC Avamar/NetWorker       | No support                                              | N/A      |
| 8. | IBM Tivoli Storage Manager | No support                                              | N/A      |
 
As you can see, there isn't much support from third party products for Exchange Server 2013 yet. Why the suppliers of backup software don't have a solution yet is unclear. But the question is: is this a potential problem when you want your organization to move forward with implementing Exchange Server 2013? Personally, I think not. Better said, I don't think you'll need a third party backup solution at all! And why is that, you ask?

Well, the explanation is pretty simple. Exchange (if you design it properly, of course) has all the features needed to address both backup concerns built in. In the next paragraphs I will go deeper into this, so keep on reading ;).
 
Exchange Item Restore
When you ask your customers or the management of your organization whether it is really necessary to get single items back from backup in case of a user error, they will probably say yes. But if you ask them how far back in time, they usually don't have a direct answer. If you then ask them whether they are comfortable with a restore period of, let's say, 1 month for recoverable items, they will probably say that is OK. Keep in mind that restoring single items from backup has limitations: item level backup (not yet possible in combination with Exchange Server 2013) brings long backup times and probably performance loss.

Exchange, however, has the ability to keep deleted items for a specific period of time. This is called deleted item retention. By default all deleted items (meaning items that have been removed from the user's "Deleted Items" folder) are kept for 14 days. This means that users can restore them themselves from within Outlook during those 14 days.

So to fulfill the need to restore single items you can simply use or extend the retention period for recoverable items. This is configured on the mailbox database. Keep in mind, however, that you'll need to account for this in your mailbox storage design.
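
As a back-of-the-envelope illustration of that storage impact, the Python sketch below estimates how much extra database space the retained deleted items take up. The function name and all numbers (1000 mailboxes, 5 MB deleted per mailbox per day) are hypothetical, and the model is deliberately crude; it is not a substitute for a proper sizing exercise.

```python
def recoverable_items_overhead_gb(mailboxes, deleted_mb_per_day, retention_days):
    """Rough extra database space (in GB) needed to keep deleted items.

    Assumes every mailbox deletes the same volume per day and that each
    deleted item stays in the database for the full retention period.
    """
    total_mb = mailboxes * deleted_mb_per_day * retention_days
    return total_mb / 1024


# Hypothetical environment: 1000 mailboxes, 5 MB deleted per mailbox per day.
default_gb = recoverable_items_overhead_gb(1000, 5, 14)   # 14 day default
extended_gb = recoverable_items_overhead_gb(1000, 5, 30)  # extended to 30 days
print(round(default_gb, 1), round(extended_gb, 1))  # 68.4 146.5
```

Under these assumptions, going from 14 to 30 days roughly doubles the space reserved for deleted items, which is why the retention period belongs in the storage design and not only in the policy discussion.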

The advantages of this approach are numerous:

  • It saves you the time needed to back up single items with any software;
  • It saves you storage in case of snapshot backups at the storage level;
  • It saves you storage on your backup tapes;
  • It saves your IT helpdesk the burden of answering calls about restoring single items;
  • And last but probably most important, users don't have to call the IT department anymore. They can do it themselves! And that means one step forward in no longer pissing off users ;).

Exchange HA and Site Resiliency
Great! And what about Disaster Recovery, I hear you say? Well, Exchange has a built-in solution for that too. It requires you to think carefully about your design, so I will only describe the features and technologies needed to achieve the goal.

Since Exchange Server 2010 there is a feature called Database Availability Groups, or DAGs. DAGs are the successor of the pain in the ass Cluster Continuous Replication (CCR), which was available in Exchange Server 2007. In Exchange Server 2013 the use of DAGs is continued and improved. With a DAG you can create highly available passive copies of your mailbox databases across up to 16 Exchange Mailbox servers. The advantage of a DAG is that (although Microsoft Cluster Services is still used in the background) the configuration is relatively simple. You will, however, need extra storage for every copy of the database. It is also possible to stretch your DAGs across separate data centers to ensure that services remain available and data loss is kept to a minimum. This tackles your direct HA requirement.

But what if, for whatever reason, your active database gets corrupted? Are the passive copies then also affected? Uhhh, yes, they probably are. The reason is that each copy of an active database in a DAG is kept up to date by transaction log shipping. If corruption is introduced in a database, the corresponding logs will simply be replayed into the copies too.

But don't worry, there is a solution for this, and it's called "lagged copies". In every DAG you can create a lagged copy next to the regular HA copies. A lagged copy simply means that you tell Exchange to insert a lag (a delay in time) before it commits changes to that copy of the database. Therefore, if data gets corrupted in a database, the lag ensures the corruption does not immediately reach the lagged copy.
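
The idea can be illustrated with a small toy model in Python. This is only a sketch of the principle: the `DatabaseCopy` class, the tick-based clock and the log names are invented for the example, whereas in a real DAG the lag is a time span configured on the database copy.

```python
from dataclasses import dataclass, field


@dataclass
class DatabaseCopy:
    """Toy model of a passive copy that replays shipped transaction logs."""
    name: str
    replay_lag: int = 0                           # delay in ticks before replay
    pending: list = field(default_factory=list)   # (arrival_tick, log) not yet replayed
    replayed: list = field(default_factory=list)  # logs already applied to this copy

    def receive(self, tick, log):
        self.pending.append((tick, log))

    def replay_due(self, now):
        """Replay every shipped log that is at least replay_lag ticks old."""
        still_waiting = []
        for arrived, log in self.pending:
            if now - arrived >= self.replay_lag:
                self.replayed.append(log)
            else:
                still_waiting.append((arrived, log))
        self.pending = still_waiting


regular = DatabaseCopy("regular copy")
lagged = DatabaseCopy("lagged copy", replay_lag=3)

# The active database ships one log per tick; the log at tick 2 is corrupt.
logs = ["log0", "log1", "CORRUPT", "log3"]
for tick, log in enumerate(logs):
    for copy in (regular, lagged):
        copy.receive(tick, log)
        copy.replay_due(tick)

print(regular.replayed)  # ['log0', 'log1', 'CORRUPT', 'log3']
print(lagged.replayed)   # ['log0']
```

The regular copy has already applied the corrupt log, while the lagged copy still holds it unreplayed. That window is what gives you the chance to act before the corruption reaches every copy.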

Lagged copies have been around since Exchange Server 2007, and therefore also exist in Exchange Server 2010. However, lagged copies were a bit hard to handle in Exchange Server 2010. Also, if the organization needed a 0 day RPO, it was simply not possible, because once all "normal" copies of the database were gone, the logs were gone too and the mail queue was empty.

In Exchange Server 2013 this issue is solved by a feature called Safety Net. Safety Net is the successor of the Transport Dumpster and is a layer that is not part of the databases or the DAG. What Safety Net does is this: when a message is processed (incoming or outgoing mail, for example), it holds on to the message until the message has been delivered to all copies (including the lagged copy) of the database in a DAG.
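
A highly simplified Python sketch of that idea: a store at the transport level keeps the messages around, so a copy that is behind can get the missing ones delivered again. The `ToySafetyNet` class, its methods and the sequence numbers are hypothetical and only illustrate the principle; this is not how Exchange implements Safety Net internally.

```python
class ToySafetyNet:
    """Toy transport-level store that keeps copies of delivered messages."""

    def __init__(self):
        self._delivered = []  # (sequence_number, message) in delivery order

    def record_delivery(self, seq, message):
        self._delivered.append((seq, message))

    def resubmit_after(self, last_seq_in_copy):
        """Return the messages a database copy is still missing."""
        return [msg for seq, msg in self._delivered if seq > last_seq_in_copy]


safety_net = ToySafetyNet()
for seq, msg in enumerate(["mail A", "mail B", "mail C", "mail D"], start=1):
    safety_net.record_delivery(seq, msg)

# Hypothetical situation: the lagged copy only contains messages 1 and 2
# at the moment it has to be activated.
missing = safety_net.resubmit_after(last_seq_in_copy=2)
print(missing)  # ['mail C', 'mail D']
```

Because the store lives outside the databases, losing every up-to-date copy does not mean losing the messages the lagged copy had not seen yet.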




This all basically means that, without any backup software, you can tackle item level restore and reach a 0 day RTO and RPO at the same time. Of course your design needs to be right, and you'll need enough data centers and servers to do the job for you.

Accreditations
A special thanks to Martijn Moret (Data Management Consultant at PQR, @MMMoret) for providing me with a table of all backup providers and their support for Exchange Server 2013.

Updates

09-07-2013: Updated support matrix for Symantec NetBackup and HP Data Protector
09-08-2013: Updated support matrix for Symantec Backup Exec