Designing your Exchange
organization for availability and recovery scenarios can protect against
database corruption, hardware failures, accidental loss of user
messages, and even natural disasters. As an administrator, your job is
to make sure that servers are available and that data can be recovered.
1. Ensuring Data Availability
With Exchange Server 2010, it is
easier than ever to design a highly available (HA) solution that ensures
the availability of most messaging services. Simply by deploying multiple
Hub Transport, Edge Transport, and Client Access servers and placing
the additional servers within the appropriate Active Directory sites,
you can ensure availability of key messaging services if a primary Hub
Transport, Edge Transport, or Client Access server fails.
When it comes to
Mailbox servers, you can use several techniques to improve availability
and avoid having to restore from backups:
Database copies
Each member server in a database availability group (DAG) can have one
copy of a database that is hosted and active on another member server.
Exchange uses continuous replication to create and maintain copies of
databases.
Deleted item retention
Deleted item retention allows users to restore a single item or an
entire folder in Microsoft Office Outlook.
Deleted mailbox retention
Deleted mailbox retention allows administrators to restore deleted
mailboxes without having to restore the mailboxes from backups.
Archive mailboxes
Archive mailboxes are used to store users' old messages, such as may be
required to comply with company policy, government regulations, or
legal requirements.
Retention policy
Retention policy is applied to enforce message retention settings. When
messages reach a retention limit, they are processed by Exchange based
on the actions defined in retention tags that have been applied. This
allows messages to be archived, deleted, or flagged for user attention.
Multiple mailbox databases
By using multiple mailbox databases, configuring storage appropriately,
and distributing users across these databases, you can significantly
reduce the impact of the loss of a single database and allow for
faster restores when needed.
With all of these features in
place and configured appropriately, you might not need traditional
point-in-time backups of Mailbox servers or long-term data storage on
tape. Keep the following in mind:
For high
availability, Microsoft recommends having at least three highly
available database copies, which means one active copy and at least two
highly available passive copies. A highly available copy is a copy that
does not have a replay lag time and is not blocked for activation by an
administrator. This means you'll need at least three Mailbox servers in
each highly available DAG.
For disaster
recovery, I recommend having at least four database copies, which means
one active copy, at least two highly available passive copies, and at
least one lagged copy. A lagged copy is a copy that has a replay lag
time. This means you'll need at least four Mailbox servers in each DAG
optimized for disaster recovery.
As part of your
Exchange organization planning, you should have at least one Mailbox
server in each Active Directory site. Because any database availability
group can be extended to multiple sites, you don't necessarily need to
have multiple Mailbox servers in each site, and having multiple site
locations for Mailbox servers can help protect you against data center
failures. For example, if Site A goes offline but Site B remains
available, users who normally access their mailboxes in Site A would be
redirected automatically to the appropriate Mailbox server in Site B, as
long as you've configured your database copies appropriately.
To eliminate the need for
point-in-time backups, you need at least two highly available copies in
addition to the active copy. I also recommend having a lagged copy of a
database. Having a lagged
database copy can help safeguard against data corruption that is
replicated to the databases in a group, resulting in a need to return to
a previous point in time. Keep in mind that deleted item retention and
hold policy are your first line of defense in case of accidental
deletion of mailboxes and mailbox data.
If you have multiple
copies of an active database, you'll likely want one of these copies to
have a long lag time. What is sufficiently long is subject to
interpretation and the needs of your organization. Ideally, the lag time
would be sufficient to allow someone to identify a problem and for an
administrator to begin recovery. In a 24x7 environment, where
administrators are always available, a sufficient lag time may be 12,
24, or 48 hours, depending on your needs. In other environments, a
sufficient lag time likely would be measured in multiple days.
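The copy-count guidance above can be checked mechanically. The following is a minimal Python sketch, not an Exchange API: the `DatabaseCopy` class, its field names, and the server names are hypothetical, and the rules encode only the definitions given in the text (a highly available copy has no replay lag and is not blocked for activation; a lagged copy has a replay lag time).

```python
from dataclasses import dataclass


@dataclass
class DatabaseCopy:
    """Hypothetical model of one database copy in a DAG (not an Exchange type)."""
    server: str
    active: bool = False
    replay_lag_hours: float = 0.0
    activation_blocked: bool = False

    @property
    def highly_available(self) -> bool:
        # Per the text: no replay lag and not blocked for activation.
        return self.replay_lag_hours == 0 and not self.activation_blocked

    @property
    def lagged(self) -> bool:
        return self.replay_lag_hours > 0


def meets_ha_guidance(copies):
    """At least three highly available copies; the active copy counts as one."""
    return sum(c.highly_available for c in copies) >= 3


def meets_dr_guidance(copies):
    """The HA guidance plus at least one lagged copy (four copies or more)."""
    return meets_ha_guidance(copies) and any(c.lagged for c in copies)


# Hypothetical four-server layout with one copy lagged by 72 hours.
layout = [
    DatabaseCopy("MBX1", active=True),
    DatabaseCopy("MBX2"),
    DatabaseCopy("MBX3"),
    DatabaseCopy("MBX4", replay_lag_hours=72),
]
print(meets_ha_guidance(layout), meets_dr_guidance(layout))
```

Dropping the lagged copy from this layout still satisfies the high-availability check but not the disaster recovery check, which matches the three-server and four-server minimums described earlier.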
Note:
REAL WORLD
Lagged copies are designed for disaster recovery purposes and
specifically to protect against store logical corruption. The
combination of mailbox database copies, hold policy, and Extensible
Storage Engine's (ESE) single page restore leaves only the extremely
rare but catastrophic store logical corruption case to deal with. The
decision on whether to use a lagged copy should depend primarily on
which third-party applications you use and your organization's history
with store logical corruption.
Some examples of strategies for small, medium, and large organizations
are shown in Table 1.
When you create or configure database copies using
Add-MailboxDatabaseCopy or Set-MailboxDatabaseCopy,
respectively, you can use the -ReplayLagTime parameter to specify how
long the Exchange Replication service should wait before replaying
log files, and use the -TruncationLagTime parameter to specify how long
the Exchange Replication service should wait before truncating logs that
have been inspected on all copies. The maximum replay lag time is 14
days, as is the maximum truncation lag time.
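As an illustration of the 14-day limit, here is a small Python sketch that validates a lag value and formats it as a dd.hh:mm:ss time span before building an Add-MailboxDatabaseCopy command line. The helper names, the database name DB1, and the server name MBX4 are hypothetical; the sketch only assembles text and does not call Exchange.

```python
from datetime import timedelta

MAX_LAG = timedelta(days=14)  # maximum replay and truncation lag per the text


def to_timespan(lag: timedelta) -> str:
    """Format a lag as dd.hh:mm:ss, rejecting values outside 0 to 14 days."""
    if lag < timedelta(0) or lag > MAX_LAG:
        raise ValueError("lag must be between 0 and 14 days")
    total = int(lag.total_seconds())
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{days}.{hours:02d}:{minutes:02d}:{seconds:02d}"


def add_copy_command(database, server, replay_lag, truncation_lag=timedelta(0)):
    """Build the command text for a new (possibly lagged) database copy."""
    return (
        f"Add-MailboxDatabaseCopy -Identity {database} -MailboxServer {server} "
        f"-ReplayLagTime {to_timespan(replay_lag)} "
        f"-TruncationLagTime {to_timespan(truncation_lag)}"
    )


# Hypothetical example: a copy of DB1 on MBX4 with a 3-day replay lag.
print(add_copy_command("DB1", "MBX4", timedelta(days=3)))
```

Validating the value before it reaches the shell keeps an out-of-range lag from failing partway through a scripted deployment.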
Table 1. Database Copy Strategies for Small, Medium, and Large Organizations
ORGANIZATION SIZE | RECOVERY NEEDS | RECOMMENDATION |
---|---|---|
SMALL | Low | At least 2 HA database copies. |
SMALL | High | At least 2 HA database copies with at least one lagged copy (3 to 7 days). |
MEDIUM | Low | At least 2 HA database copies with at least one lagged copy (3 to 7 days). |
MEDIUM | High | At least 2 HA database copies with at least one lagged copy (24 to 72 hours). |
LARGE | Low | At least 2 HA database copies with at least one lagged copy (3 to 7 days). |
LARGE | High | At least 3 HA database copies with at least one lagged copy (24 to 72 hours). |
While database copies may be
able to eliminate the need for point-in-time database snapshots,
database copies alone won't eliminate the need for long-term storage of
backups. To reduce the need for long-term storage of backups, you must
properly implement deleted item retention and deleted mailbox retention.
In most cases, you'll want deleted items and mailboxes to be retained
for at least 30, 60, or 90 days. In addition, you'll want to apply
retention policy to enforce message retention settings and configure
archiving as appropriate for your organization. If you want to eliminate
the need for long-term storage of backups, you'll want to supplement
these strategies with database copies written to Mailbox servers in
multiple geographic locations. For example, if you have offices in Los
Angeles and Sacramento, your Los Angeles office could act as the
failsafe recovery location for your Sacramento office and vice versa. As
such, each office location would have a Mailbox server with a copy of
each active mailbox database. If your organization doesn't have multiple
geographic locations, you'll likely want to continue to create archival
backups and rotate them to offsite storage as appropriate.
2. Backing Up Exchange Server: The Basics
To create a complete backup of an Exchange server, you must back up the following:
Exchange user
data, which includes Exchange mailbox databases, public folder
databases, and transaction logs. If you want to be able to recover
mailbox and public folder databases from backups, you must back up this
data. User data doesn't contain Exchange configuration settings.
Folders
and drives that contain Windows and Exchange files. Normally, this
means backing up the root drive C, which includes the special partition
for Exchange Server.
System
state data for the operating system, which includes essential system
files needed to recover the local system. All computers have system
state data, which you must back up in addition to other files to restore
a complete working system. To be clear, system state data is needed for
Windows recovery but is not needed for Exchange recovery.
Because Exchange Server 2010 supports only VSS-based backups, volumes are the units of
backup. You back up a volume and then you restore the data, get the
portion of the data you want to work with, and recover that. Although
you can recover an individual database from backup, you should know
about some fundamental issues before you try to do so. These issues
pertain to transactions, transaction logs, and transaction logging modes.
The Exchange Information
Store service creates transaction logs. Exchange Server uses
transactions to record database changes. You can think of a transaction
as a logical unit of work that contains one or more operations that
affect the Exchange store. If Exchange Server executes all of the
operations in a transaction successfully, it marks the transaction as
successful and permanently commits the changes. If one or more of the
operations in a transaction fails to complete, Exchange Server marks the
transaction as failed and removes any changes that the transaction
created. The process of removing changes is referred to as rolling back the transaction.
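The all-or-nothing behavior described above can be sketched in a few lines of Python. This is a conceptual illustration only, not how the Exchange store is implemented: the `run_transaction` and `deliver` helpers and the mailbox names are hypothetical, and staging changes on a copy stands in for a real rollback mechanism.

```python
import copy


def run_transaction(store, operations):
    """Apply every operation to the store, or none of them."""
    working = copy.deepcopy(store)   # stage the changes on a private copy
    try:
        for operation in operations:
            operation(working)
    except Exception:
        return False                 # roll back: the staged copy is discarded
    store.clear()
    store.update(working)            # commit: every operation succeeded
    return True


def deliver(mailbox, message):
    """Return an operation that appends a message to a mailbox."""
    def operation(store):
        store[mailbox].append(message)   # raises KeyError for unknown mailboxes
    return operation


store = {"alice": [], "bob": []}
committed = run_transaction(store, [deliver("alice", "hi"), deliver("bob", "hi")])
rolled_back = run_transaction(store, [deliver("alice", "again"), deliver("carol", "again")])
print(committed, rolled_back, store)
```

In the second call the first operation succeeds but the second fails, so neither change reaches the store.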
Transaction logs are
units of storage for transactions. Exchange Server writes each
transaction to a log file and maintains the log files according to the
logging mode. With standard
logging, Exchange Server reserves 1 megabyte (MB) of disk space for the
active transaction log. Exchange Server commits or rolls back
transactions based on their success or failure. When the contents of the
log reach 1 MB, Exchange Server creates a new log file. Because
Exchange Server maintains the transaction logs until the next full
backup, you can recover Exchange Server to the last transaction (as long
as Exchange is using standard logging).
Note:
The active transaction log is
named E##.log, where ## is the unique identifier for the database.
Additional transaction logs are named E##00000001.log, E##00000002.log,
and so on.
Exchange can use standard logging or circular
logging. With standard logging, each database transaction is written to
a log file and then to the database. When a log file reaches 1 megabyte
(MB) in size, the Exchange Information Store service renames the log
and creates a new log file. If Exchange stops unexpectedly, you can
recover databases to the last transaction. You do this by replaying the
data from the log files into the database.
In Exchange 2010, circular
logging is disabled by default. With circular logging, Exchange
overwrites and reuses the first log file after the data it contains has
been written to the database. This saves disk space but is not a
recommended best practice. Why? One of the reasons is that when circular
logging is enabled, you can recover data only up until the last full
backup.
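To make the recovery difference concrete, the following Python sketch models a restore under each logging mode. It is a simplified illustration under stated assumptions: the database is a plain dictionary, each log entry is a hypothetical (key, value) change, and circular logging is treated as leaving no usable log sequence since the last full backup.

```python
def recover(backup, logs_since_backup, circular_logging):
    """Return the database state after restoring the last full backup."""
    database = dict(backup)
    if circular_logging:
        # Earlier log files were overwritten, so nothing can be replayed.
        return database
    for key, value in logs_since_backup:   # replay changes in logged order
        database[key] = value
    return database


backup = {"msg1": "delivered"}
logs = [("msg2", "delivered"), ("msg1", "read")]

print(recover(backup, logs, circular_logging=False))  # state at the last transaction
print(recover(backup, logs, circular_logging=True))   # state at the last full backup
```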
3. Creating a Disaster Recovery Plan Based on Exchange Roles
With Exchange Server 2010, you
need to tailor your recovery plan to the roles installed on your
Exchange servers. Because most configuration data for Exchange Server
2010 is stored in Active Directory, you can fully restore some server
roles by running the Exchange Setup program with the /mode:recoverserver
command on a server. With other roles, running this command restores
the Exchange configuration, but you need to recover the critical
Exchange data from backup.
Note:
Recoverserver
mode is only for recovering a server or moving a server to new hardware
while maintaining the same server name. When you run Setup in this
mode, Setup reads configuration data from Active Directory for a server
with the same name as the server from which you are running Setup. This
mode doesn't recover custom settings stored locally or in databases; it
recovers only settings stored in Active Directory.
Use the following guidelines for your recovery planning:
Mailbox servers
You can restore the Mailbox server role and its configuration by running
the Exchange Setup program with the /mode:recoverserver
command. However, you can't recover databases with this mode. You can
restore mailbox data from a backup that includes the necessary mailbox
data. Mailbox servers store Exchange database files, including both
mailbox and public folder databases, and Exchange transaction log files
specific to each database. You can rebuild database copies by
re-creating them. You can also rebuild replicated public folder data
through the normal replication process if there are available replicas.
Mailbox servers also store full-text indexing information specific to
each mailbox database.
Hub Transport servers
You can restore the Hub Transport server role and make it fully
functional by running Exchange Setup with the /mode:recoverserver
command. Hub Transport servers store all essential configuration data in
Active Directory. In addition to configuration data, Hub Transport
servers store queues in database files and any logs you've enabled,
including message tracking, protocol, and connectivity logs. Queues
store messages actively being processed, and logs are primarily used for
historical reference and troubleshooting. Queues and logs are not
essential to restoring Hub Transport server functionality. Because of
shadow redundancy, any queued messages are automatically resent and do
not need to be recovered (unless you have only one Hub Transport
server).
Edge Transport servers
The Edge Transport server role is designed for perimeter network
deployment. You can restore the Edge Transport server role and make it
fully functional by using a cloned configuration. Edge Transport servers
store configuration data, queues, replicated data from Active
Directory, and any logs you've enabled, including message tracking,
protocol, and connectivity logs. Replicated data from Active Directory
is stored in Active Directory Lightweight Directory Services (AD LDS).
Queues store messages actively being processed, and logs are primarily
used for historical reference and troubleshooting. Replicated data,
queues, and logs are not essential to restoring Edge Transport server
functionality. Replicated data can be resynchronized as necessary, and
both queues and logs are created automatically as necessary.
Client Access servers
You can restore the Client Access server role to its initial default
state by running Exchange Setup with the /mode:recoverserver command.
However, any custom changes you've made to Web sites running on a Client
Access server are not restored. Changes to Web sites are stored in the
Internet Information Services (IIS) configuration data. Although you can
restore the IIS configuration data from backup to recover the custom
settings, this is not recommended because you might experience errors on
the Client Access server if the IIS configuration data and the
recovered Active Directory settings aren't exactly in sync. To restore a
Client Access server, you can build a new server with a new name by
running Exchange Setup, or you can restore the old server with the same
name by running Exchange Setup with the /mode:recoverserver command.
When Setup finishes, you then need to apply the same customizations that
you had on the server before, re-creating additional Web sites and
virtual directories as necessary. To apply the setting changes, you
should restart IIS.
Unified Messaging servers
The Unified Messaging server role stores all of its essential
configuration data in Active Directory, and you can restore a server to
its initial default state by running the Exchange Setup program with the
/mode:recoverserver command. In addition, you can restore any custom
audio files used for prompts automatically through replication if you
have other UM servers in the organization.
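For planning purposes, the role-by-role guidance above can be condensed into a lookup table. The Python sketch below is one hypothetical way to do that; the dictionary layout and the `recovery_steps` helper are illustrative, and the step text paraphrases the guidelines rather than reproducing exact commands.

```python
RECOVERSERVER = "Exchange Setup with /mode:recoverserver"

# role -> (how the role's configuration is restored, follow-up work if any)
ROLE_RECOVERY = {
    "Mailbox": (RECOVERSERVER, "restore databases from backup or re-create database copies"),
    "Hub Transport": (RECOVERSERVER, None),
    "Edge Transport": ("a cloned configuration", None),
    "Client Access": (RECOVERSERVER, "reapply Web site customizations, then restart IIS"),
    "Unified Messaging": (RECOVERSERVER, "let custom audio prompts replicate from other UM servers"),
}


def recovery_steps(roles):
    """List the recovery steps for a server holding the given roles."""
    steps = []
    for role in roles:
        method, follow_up = ROLE_RECOVERY[role]
        step = f"Restore configuration using {method}"
        if step not in steps:        # Setup runs once for all roles on the server
            steps.append(step)
        if follow_up:
            steps.append(f"{role}: {follow_up}")
    return steps


for step in recovery_steps(["Mailbox", "Hub Transport", "Client Access"]):
    print(step)
```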
4. Finalizing Your Exchange Server Disaster Recovery Plan
As you've seen, creating a disaster
recovery plan for Exchange Server 2010 requires forethought on your
part. As part of your planning, you also need to take a close look at the
overall architecture of your Exchange organization and make any changes
required to ensure that the architecture meets the availability and
recoverability expectations of your bosses. You need to review the following:
The number of Exchange servers to use in your organization.
Do you need multiple servers to ensure high availability? Do you need
multiple servers to improve performance? Do you need multiple servers
because the organization spans several geographic areas?
The number of databases for each Exchange server, as well as how database availability groups are organized.
Do you need to create databases for each department or division in the
organization? Do you need to create databases for different business
functions? Do you need to create separate databases for public folders and other types of data?
After you've reviewed
the architecture of the Exchange organization and implemented any
necessary changes, you can create an availability and disaster
recovery plan to support that organization. If your plan includes
point-in-time backups, archival backups, or both, you need to figure out
what data you need to back up, how often you should back up the data,
and more. To help you create a plan, consider the following:
How important is the mailbox or public folder database you're backing up?
The importance of the data can go a long way in helping you determine
when and how you should back up the database. For critical data, such as
a department's mailbox database, you'll want to have redundant backup
sets that extend back for several backup periods. For less important
data, such as public folders for nonessential documents, you won't need
such an elaborate backup plan, but you'll need to back up the data
regularly and ensure that you can recover the data easily.
How available does the data need to be or how quickly do you need to recover the data?
Time is an important factor in creating a backup plan. You might need
to get critical data, such as the primary mailbox database, back online
swiftly. To do this, you might need to alter your recovery plan. For
example, you might need to create multiple mailbox databases and then
create a copy of each on multiple servers in a database availability
group. You can then recover individual databases or individual servers
as the situation warrants.
Do you have the equipment to perform backups?
If you don't have backup hardware, you can't perform backups. To
perform timely backups, you might need several backup devices and
several sets of backup media. Backup hardware includes tape drives, tape
library systems, storage arrays, and removable disk drives.
Who will be responsible for the recovery plan?
Ideally, someone should be the primary contact for the Exchange
recovery plan. This person might also be responsible for performing the
actual backup and recovery of Exchange Server.
What is the best time to schedule backups?
Scheduling
backups when system use is as low as possible may speed up the backup
process. However, because you can't always schedule backups for off-peak
hours, you need to carefully plan when you back up data. For example,
you may want to back up a passive copy of a database rather than the
active copy.
Do you need to store backups off-site?
Storing copies of backup tapes off-site is essential to recovering
Exchange Server in a variety of situations, especially in the case of a
natural disaster. In your off-site
storage location, you should also include copies of all the software
you might need to recover Exchange Server and change management records
so that you can re-create custom settings after recovery.
5. Choosing Backup and Recovery Options
As you'll find when you work
with Exchange backup and recovery, there are many techniques for backing
up data. The techniques you use depend on the type of data you're
backing up, how convenient you want the recovery process to be, and
other factors.
To back up Exchange, you must use an Exchange-aware Volume Shadow Copy Service (VSS)–based backup program. You cannot use streaming Extensible Storage Engine–based backup programs with Exchange.
Exchange Server includes a plug-in for Windows Server Backup that makes it possible for you to create VSS-based
backups of Exchange data. This plug-in runs as a service named
Microsoft Exchange Server Extension for Windows Server Backup and is
configured by default for manual startup. To use the plug-in, you must
install the Windows Server Backup feature on Exchange servers you want
to back up. The related command-line tools are not compatible with
Exchange Server 2010, however, and you should not install them.
Windows Server Backups
should be performed at the volume level. To back up a database and its
transaction logs, you must back up the entire volume containing the
database and logs. You cannot back up only the data. Although you can
create a backup on a local drive or a remote network share, you must be
logged on to the server to perform backups. You can log on locally at
the keyboard or via a remote desktop connection.
When you are using Windows
Server Backup to back up Exchange Server, you can perform only full or
copy backups. Both approaches back up all Exchange data that has been
selected, including the related databases and the current transaction
logs. However, only a full backup tells Exchange Server you've performed
a complete backup, which allows Exchange Server to clear out the
transaction logs.
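The difference between the two backup types can be modeled in a short Python sketch. This is a conceptual illustration, not Windows Server Backup behavior in detail: the `DatabaseVolume` class and the log file names are hypothetical, and clearing the whole log list stands in for Exchange removing logs it no longer needs.

```python
class DatabaseVolume:
    """Hypothetical model of a volume holding a database and its logs."""

    def __init__(self):
        self.log_files = []
        self.backups = []

    def write_log(self, name):
        self.log_files.append(name)

    def backup(self, kind):
        if kind not in ("full", "copy"):
            raise ValueError("only full or copy backups are supported")
        # Both backup types capture the current transaction logs.
        self.backups.append((kind, list(self.log_files)))
        if kind == "full":
            # Only a full backup signals Exchange to clear out the logs.
            self.log_files.clear()


volume = DatabaseVolume()
volume.write_log("E0000000001.log")
volume.write_log("E0000000002.log")

volume.backup("copy")
logs_after_copy = len(volume.log_files)

volume.backup("full")
logs_after_full = len(volume.log_files)
print(logs_after_copy, logs_after_full)
```

If every backup in a rotation were a copy backup, the logs in this model would never be cleared, which is why a schedule built on Windows Server Backup typically needs periodic full backups.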
In your backup plan, you'll
probably want to perform full or copy backups on a weekly basis. You
might also want to create an extended backup set for monthly and
quarterly backups that you rotate to off-site storage. This will ensure
that you have recent data for recovery as well as older data for
recovery.
When you restore data, you can
restore only Exchange data. The Exchange data can be restored to its
original location or an alternate location. Keep the following in mind:
If you restore
Exchange data to its original location, Windows Server Backup and the
backup plug-in automatically handle the recovery process. This means
they dismount any existing databases, replay logs into recovered
databases, and mount databases for you. All backed-up databases must be
restored together. You cannot restore a single database.
If you restore Exchange data to an alternate location, Windows Server Backup
and the backup plug-in do not handle the recovery process. This means
you need to manually work with the data and restore it. While this
requires more work, you can restore a single database as well as
individual mailboxes.
You can move manually restored data from the alternate location to a recovery
database (RDB). A recovery database is a special-purpose database that
allows you to mount a restored mailbox database and select data for
recovery. You create a recovery database using New-MailboxDatabase with the -Recovery parameter. You use Restore-Mailbox to extract data from a restored database.
Restore-Mailbox works
only with disconnected mailboxes. The disconnected mailboxes are
specified as the recovery sources, and the recovery targets are
connected mailboxes in an active mailbox database. Thus, you extract
data from a disconnected source mailbox and move it to a connected
target mailbox. The source and target mailboxes must be in the same
Active Directory forest.
A recovery database is completely disconnected. As a result, a recovery database:
Applies only to mailbox databases and does not apply to public folder databases.
Does not have system or mailbox management policies.
Is not enabled for logging or maintenance.
Cannot be accessed by or connected to by users.
A recovery database enables the following recovery scenarios:
Recovering or repairing a database. One recovery approach is to create a new empty database, called a dial-tone
database, to replace the failed database. This allows users to continue
sending and receiving mail while you are recovering a database. You can
then merge the recovery database with the dial-tone database.
Recovering
a database to a different server. One recovery approach is to restore a
database on a server other than the original server for that database.
As necessary, you can then merge the recovered data back to the original
database on the original server.
Recovering deleted
mailboxes or deleted items after the retention period has expired. One
recovery approach is to restore the database in an alternate location.
Then you would extract the required data and merge it into the existing
data.
Only one
recovery database can be mounted on a Mailbox server at one time. The
recovery database does not count toward the maximum number of allowed
databases per Mailbox server.