6. Storage in Exchange Server 2010
There have been notable changes in database storage
and architecture in Exchange Server 2010. These are real changes that
have a far-reaching impact on Exchange deployments and the overall
strategy that an organization takes with its messaging infrastructure.
Let's outline some of those changes, or at least the ones most relevant
to Exchange administrators:
Most write transactions to the Exchange databases are now performed as sequential writes rather than the traditional random
writes. Why should you care? Well, it means that the hard disk
drive's read/write head rarely has to seek; the platter is spinning, but the head stays
largely in place. This change, which may seem like a detail at first, is
significant in reducing the number of required disks or IOPS for
Mailbox servers and improving overall database access performance.
(Note that some write transactions are still performed as random writes
to handle database space compactness or for other specific
architectural reasons.)
The database
itself has a brand-new schema, and this is the first major change in
schema since Exchange 2000. (By the way, this new database schema is
the main reason why you cannot perform an in-place upgrade from
Exchange Server 2007 to Exchange Server 2010.)
Well,
it can't all be good. Single Instance Storage (SIS) was the sacrificial
lamb and is no longer a database feature. (Keep in mind that SIS was
already scaled back in Exchange Server 2007, where it applied only to
email attachments and no longer to message bodies.)
The database page size
has increased from 8 KB to 32 KB. (Each transaction resulting in new
data creates at least one database page.) In essence, what this
means is that a 5 KB message will require a 32 KB block of space in the
database. However, a 16 KB message is now stored in a single page,
rather than two pages as in previous versions of Exchange Server.
To
mitigate the risk of an increased database size, potentially caused by
the new larger page size and other new database architectural changes,
database pages are compressed.
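The page-size arithmetic above can be sketched in a few lines of Python. This is only an illustration, assuming a message occupies whole pages and ignoring page headers and the page compression just mentioned; the function names are made up for this example and are not part of any Exchange tool.

```python
import math

OLD_PAGE_KB = 8   # page size before Exchange Server 2010
NEW_PAGE_KB = 32  # page size in Exchange Server 2010


def pages_needed(message_kb, page_kb):
    """Whole pages needed for a message (simplified: no headers, no compression)."""
    return max(1, math.ceil(message_kb / page_kb))


def allocated_kb(message_kb, page_kb):
    """Space taken up by those whole pages, in KB."""
    return pages_needed(message_kb, page_kb) * page_kb


for size_kb in (5, 16):
    for page_kb in (OLD_PAGE_KB, NEW_PAGE_KB):
        print(
            f"{size_kb} KB message, {page_kb} KB pages: "
            f"{pages_needed(size_kb, page_kb)} page(s), "
            f"{allocated_kb(size_kb, page_kb)} KB allocated"
        )
```

Under these simplifications, the 5 KB message takes one page either way but occupies 32 KB instead of 8 KB, while the 16 KB message drops from two pages to one.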
Wouldn't it be great if you could walk into your
boss's office and ask for the budget to give every user a 20 GB mailbox
so they would never (well, not for a while at least) have to delete
anything? Then you could create as many databases on your Exchange
server as you could create before your fingers went numb and let the
users go to town.
Unfortunately, we all have constraints within which
we have to live; that goes for system administrators, end users, and
our VIP users. So, thinking about adding more storage and allowing
larger mailboxes or databases, what are some of the constraints that we
face? Some of these are technological in nature and some are budgetary
or political. We're hoping that you already know most of these and can
skim right through them:
Exchange Server 2010 Standard Edition supports a maximum of five mailbox databases.
Exchange Server 2010 Enterprise Edition supports a maximum of 100 mailbox databases.
For previous editions of Exchange Server, the disk I/O limitations affected storage design. In Exchange Server 2010, this limitation now has a lessened impact on storage design.
The bigger a mailbox is, the longer it takes to back up and restore. For typical backups of Exchange databases, the restore time will be twice as long as the backup time.
Microsoft recommends a maximum Exchange database of 2 TB when you have two or more copies of your databases. If you have a single copy of your database, the recommended maximum size is 200 GB.
You need to plan for 7 to 10 days' worth of transaction logs; a good starting point for estimating how much space transaction logs will consume is about 9 GB of transaction logs for each 1,000 average users. However, we will discuss later in "Managing Mailbox Databases" how some organizations will want to enable Circular Logging and therefore not require additional disk space to store transaction logs.
If you implement database replication with a database availability group (DAG) and multiple replication partners, remember that log files will only purge after a successful replication (even when Circular Logging is enabled). Therefore, you must account for network outages where replication will fail and transaction log files can queue on your physical disks. Enough disk space must be available to hold the queued logs for as long as it takes to troubleshoot or repair the problem that is preventing successful replication; otherwise, databases will begin to shut down.
You should assume that each database needs to contain 10 to 15 percent additional space for deleted items (known as the database dumpster) and for database whitespace. Also, note that whitespace can continue to grow if the Online Maintenance process does not get to complete during its scheduled interval. Make sure that your Online Maintenance window is long enough to allow the process to complete.
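To make the sizing rules of thumb above concrete, here is a rough Python sketch. It uses only the figures quoted in this list (about 9 GB of logs per 1,000 average users, 10 to 15 percent extra for the dumpster and whitespace, and the 2 TB and 200 GB recommended maximums). The function names are illustrative, 2 TB is treated as 2,048 GB, and the result is a starting estimate rather than a replacement for a proper sizing exercise.

```python
LOG_GB_PER_1000_USERS = 9.0      # starting-point figure quoted in the text
MAX_DB_GB_MULTIPLE_COPIES = 2048  # 2 TB, assuming 1 TB = 1,024 GB
MAX_DB_GB_SINGLE_COPY = 200


def estimate_log_space_gb(user_count, gb_per_1000_users=LOG_GB_PER_1000_USERS):
    """Transaction log space to plan for, scaled linearly by user count."""
    return user_count / 1000 * gb_per_1000_users


def database_size_with_overhead_gb(mailbox_data_gb, overhead_fraction=0.15):
    """Mailbox data plus the dumpster/whitespace allowance (0.10 to 0.15)."""
    return mailbox_data_gb * (1 + overhead_fraction)


def exceeds_recommended_max(db_size_gb, copies):
    """True if the database is larger than the recommended maximum."""
    limit = MAX_DB_GB_MULTIPLE_COPIES if copies >= 2 else MAX_DB_GB_SINGLE_COPY
    return db_size_gb > limit


if __name__ == "__main__":
    users = 2500
    data_gb = 250  # hypothetical amount of mailbox data in one database
    size_gb = database_size_with_overhead_gb(data_gb)
    print(f"Log space for {users} users: {estimate_log_space_gb(users):.1f} GB")
    print(f"Database size with overhead: {size_gb:.1f} GB")
    print(f"Too large for a single copy? {exceeds_recommended_max(size_gb, 1)}")
    print(f"Too large with two copies?   {exceeds_recommended_max(size_gb, 2)}")
```

The user count and data volume in the usage section are invented for the example; substitute your own numbers.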
7. An Additional Factor: The Personal Archive
A personal archive is what we call the "Siamese"
mailbox to a user's primary mailbox. It's a secondary mailbox that is
"joined at the hip" to a user's primary mailbox and provides a second
location for storing older, rarely accessed emails. Let's look at some of the
features unique to the personal archive:
A personal archive is created by using the Enable-Mailbox <mailbox> -Archive command.
A personal archive and a primary mailbox for a user must be stored in the same mailbox database.
The personal archive cannot be cached locally on an Outlook client through an offline store (OST).
A personal archive can only be accessed by Outlook 2010 or Outlook Web App 2010.
Personal archives allow administrators to provide larger storage solutions for users, while still providing access to all email.
We want to point out two features that have brought
about a lot of discussion. First, we often get the question, "Why are
Personal Archives relevant if they must be stored in the same
mailbox database as the user's primary mailbox?" At first glance, one
would think that an organization could benefit from Personal Archives
by having them stored in a separate mailbox database, but as you now
know, that is not possible. There would have been obvious benefits from
this, such as separate backup schedules for Personal Archives and
smaller databases for primary mailboxes. Microsoft's customers have
been vocal about the ability to separate the two, and we would not be
too surprised if this functionality becomes available through a future
Service Pack release.
So this leads us to the other feature, which comes
directly from the other question we often get, "If I can't store the
Personal Archive in a separate mailbox database, what is my immediate
benefit from implementing Personal Archives?" In our opinion, the
biggest benefit of using Personal Archives is the reduction in OST file
size. Since the Personal Archive is not available offline, it will
reduce OST file bloat, while still providing remote access through
Outlook Web App 2010. So our opinion of Personal Archives is that the
feature is great now, and it will likely get even better.
8. Disk Size vs. I/O Capacity
Historically, Exchange has been limited by the
performance of its disks, rather than by the space available on
those disks. In Exchange Server 2010, there has been somewhat of a role
reversal between those two characteristics. The improvements and
reductions in I/O requirements permit administrators to use lower-cost
SATA disks (or equivalent) to handle storage.
For many Exchange Server administrators (these
authors included), an understanding of disk I/O
capacity constraints came slowly. For some reason, we kept thinking
that the disk technology far outperformed the database capacity. But as
Exchange servers got more heavily loaded with more simultaneous users
and larger databases, the demands on the disk grew.
Let's take a look at a quick example. Say you have
an 18 GB SCSI disk from the olden days; that disk may be able to
support 100 reads and/or writes to the disk each second. That's not a
big deal if you have 50 users, but what if you have 500 users? Can the
disk subsystem service the I/O requests that those 500 users will put
on it? If the disk system is not properly sized—both for capacity and
for the required I/O load—then users will see performance problems.
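The question in this example comes down to simple arithmetic, sketched below in Python. The 100 IOPS disk comes from the example above; the per-user figure of 0.5 IOPS is purely a hypothetical placeholder, since the real value depends on the user profile and the Exchange version, and the function names are illustrative only.

```python
import math

DISK_IOPS = 100               # what the example disk can sustain per second
ASSUMED_IOPS_PER_USER = 0.5   # hypothetical value, not a Microsoft figure


def required_iops(user_count, iops_per_user=ASSUMED_IOPS_PER_USER):
    """Total I/O load the users are expected to generate."""
    return user_count * iops_per_user


def disk_can_serve(user_count, iops_per_user=ASSUMED_IOPS_PER_USER,
                   disk_iops=DISK_IOPS):
    """True if a single disk can keep up with the expected load."""
    return required_iops(user_count, iops_per_user) <= disk_iops


def disks_needed(user_count, iops_per_user=ASSUMED_IOPS_PER_USER,
                 disk_iops=DISK_IOPS):
    """Minimum number of such disks for I/O alone (ignores RAID overhead)."""
    return max(1, math.ceil(required_iops(user_count, iops_per_user) / disk_iops))


for users in (50, 500):
    print(
        f"{users} users need {required_iops(users):.0f} IOPS; "
        f"one disk is enough: {disk_can_serve(users)}; "
        f"disks needed: {disks_needed(users)}"
    )
```

With the assumed per-user load, 50 users fit comfortably on one disk while 500 users would need several, regardless of how much free space the disk has.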
This load is normally measured (and planned) in
terms of the IOPS profile of the users who will use the system. The
Exchange team at Microsoft has done much research into the type of load
that users place on an Exchange server; they have broken that down
based on different types of users, from a light user who may send 5 messages per day and receive 20, to an extra heavy user who may send 40 messages per day and receive 160.
Note that the reductions in IOPS between recent
Exchange versions are significant. Initial testing at Microsoft IT
demonstrates a reduction in IOPS of 70 percent when compared to
Exchange Server 2007.
9. What's Keeping Me Up at Night?
We spend quite a bit of time wondering if we have
our storage configuration optimized. Ask yourself these questions about
your own environment:
Am I giving my users enough mailbox space to store enough historical information to do their jobs? Or (shudder) too much?
Are users wasting mail storage on personal or non–work-related content such as MPG files of cats playing the piano (http://www.youtube.com/watch?v=npqx8CsBEyk)?
Should
I employ an email archival solution to move older content off the
mailbox database and on to alternative storage? Should I use the
built-in personal archive solution or should I use a third-party
solution? If I do, how much "recent" content should be left on the
Exchange server versus moved out to the archive?
Do I need to be keeping copies of certain types of messages (such as for regulatory, legal, or business reasons)?
Are my databases growing so fast that I may run out of disk space before I notice?
Do I have the right balance of databases, size of disk, frequency of backups, and deployment of redundancy?