Exchange Server 2010 Administration Essentials : Understanding Data Storage in Exchange Server 2010

8/7/2013 6:01:18 PM

Depending on its role, Exchange Server stores information in several locations, including:

Active Directory data store
Exchange Server store
Exchange Server queues

Working with the Active Directory Data Store

The Active Directory data store contains most directory information for Exchange Server 2010 configurations and Exchange Server 2010 recipients as well as other important directory resources. Domain controllers maintain the data store in a file called Ntds.dit. The location of this file is set when Active Directory is installed and should be on an NTFS file system drive formatted for use with Windows Server. Domain controllers save some directory data separately from the main data store.

Two key concepts to focus on when looking at Active Directory are multimaster replication and Global Catalog servers.

Using Multimaster Replication

Domain controllers replicate most changes to the data store by using multimaster replication, which allows any domain controller to process directory changes and replicate those changes to other domain controllers. Replication is handled automatically for key data types, including the following:

Domain data: Contains information about objects within a domain, such as users, groups, and contacts
Configuration data: Describes the topology of the directory, and includes a list of important domain information
Schema data: Describes all objects and data types that can be stored in the data store

Using Global Catalogs

Active Directory information is also made available through global catalogs. You use global catalogs for information searches and, in some cases, domain logon. A domain controller designated as a Global Catalog server stores a full replica of all objects in the data store (for its host domain).

By default, the first domain controller installed in a domain is designated as the Global Catalog server. Consequently, if there is only one domain controller in the domain, the domain controller and the global catalog are on the same server. Otherwise, the global catalog is on the domain controller configured as such.

Information searches are one of the key uses of the global catalog. Searches in the global catalog are efficient and can resolve most queries locally, thus reducing the network load and allowing for quicker responses. With Exchange, the global catalog can be used to execute Lightweight Directory Access Protocol (LDAP) queries for dynamic distribution groups. Here, the members of the distribution group are based on the results of the query sent to the Global Catalog server rather than being fixed.

Why use LDAP queries instead of a fixed member list? The idea is to reduce administrative overhead by determining dynamically what the members of a distribution group should be. Query-based distribution is most efficient when the member list is relatively small (fewer than 25 members). If the member list has potentially hundreds or thousands of members, however, dynamic distribution can be inefficient and might require a great deal of processing to complete.

Here's how dynamic distribution works:

1. When e-mail is received that is addressed to the group, the Exchange Categorizer (a transport component) sends the predefined LDAP query to the Global Catalog server for the domain.
2. The Global Catalog server executes the query and returns the resulting address set.
3. The Exchange Categorizer then uses the address list to generate the recipient list and deliver the message. If the Categorizer is unable to generate the list for any reason (for instance, if the list is incomplete or an error was returned), the Categorizer might start the process over from the beginning.
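
The steps above can be sketched in Python. This is a minimal simulation under stated assumptions, not Exchange code: the in-memory directory, the predicate standing in for the LDAP query, and the names run_query, expand_dynamic_group, and max_attempts are all hypothetical. The retry cap in particular is an assumption; the text says only that the Categorizer might start over.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory stand-in for the global catalog: address -> attributes.
Directory = Dict[str, Dict[str, str]]


def run_query(directory: Directory,
              predicate: Callable[[Dict[str, str]], bool]) -> List[str]:
    """Stand-in for the Global Catalog server executing the stored query."""
    return sorted(addr for addr, attrs in directory.items() if predicate(attrs))


def expand_dynamic_group(query: Callable[[], List[str]],
                         max_attempts: int = 3) -> List[str]:
    """Stand-in for the Categorizer: run the query, start over if it fails."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return query()
        except RuntimeError as err:  # simulated catalog error
            last_error = err
    raise RuntimeError("could not expand group") from last_error


directory = {
    "ann@example.com": {"department": "Sales"},
    "bob@example.com": {"department": "IT"},
    "cho@example.com": {"department": "Sales"},
}
recipients = expand_dynamic_group(
    lambda: run_query(directory, lambda attrs: attrs["department"] == "Sales")
)
print(recipients)  # ['ann@example.com', 'cho@example.com']
```

Because membership is computed at delivery time, adding a new Sales user to the directory changes the recipient list without anyone editing the group.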

Note

To make the process more efficient, large organizations can use a dedicated expansion server. Here, LDAP queries are routed to the expansion server, which processes the query and returns the results.

Working with the Exchange Store

The Microsoft Exchange Information Store service (Store.exe) hosts the Exchange store. The Exchange store is the core storage repository for managing Exchange databases, which can include both mailbox databases and public folder databases. Mailbox databases contain the data, data definitions, indexes, flags, checksums, and other information that comprise mailboxes in your Exchange organization. Public folder databases contain the data, data definitions, indexes, flags, checksums, and other information that comprise any public folders in your Exchange organization.

Exchange Server uses transactions to control changes in databases. As with traditional databases, these transactions are recorded in a transaction log. Exchange Server then commits or rolls back changes based on the success of the transaction. The facility that manages transactions is the Microsoft Exchange Information Store service.

When working with databases, you should keep the following in mind:

Each Mailbox server can have up to 100 databases (including both active databases and passive databases), with a maximum size per database of 64 terabytes (TB)—limited only by hardware.
Each Mailbox server can be a member of only one database availability group and can host only one copy (either the active copy or a passive copy) of a database. Because each group can have up to 16 copies of a database, up to 16 different servers can be part of a database availability group.
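
The limits above can be expressed as a simple validation check. The following is a sketch only: the layout dictionary (server name mapped to the databases it hosts) and the function name check_layout are assumptions made for illustration, and the check covers a single database availability group.

```python
from collections import Counter
from typing import Dict, List

MAX_DATABASES_PER_SERVER = 100  # active plus passive databases, per the text
MAX_COPIES_PER_DATABASE = 16    # copies of one database within a group


def check_layout(layout: Dict[str, List[str]]) -> List[str]:
    """Return rule violations for a server -> hosted-databases mapping."""
    problems = []
    copies = Counter()
    for server, databases in layout.items():
        if len(databases) > MAX_DATABASES_PER_SERVER:
            problems.append(f"{server}: hosts {len(databases)} databases")
        # A server can host only one copy of any given database.
        for name, count in Counter(databases).items():
            if count > 1:
                problems.append(f"{server}: hosts {count} copies of {name}")
        # Count each database once per server toward the group-wide total.
        for name in dict.fromkeys(databases):
            copies[name] += 1
    for name, count in copies.items():
        if count > MAX_COPIES_PER_DATABASE:
            problems.append(f"{name}: {count} copies in the group")
    return problems


layout = {"MBX1": ["DB1", "DB2"], "MBX2": ["DB1", "DB1"]}
print(check_layout(layout))  # ['MBX2: hosts 2 copies of DB1']
```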

To create a new mailbox or public folder database, you need about 50 megabytes (MB) of free disk space. The files required by the database use a minimum of 23 MB of disk space, and you'll need the extra space during creation and for read/write operations.
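
As a quick pre-check before creating a database, you could confirm that the target volume has the free space mentioned above. This sketch uses Python's standard shutil.disk_usage call; the function name has_room_for_new_database is hypothetical, and treating 1 MB as 1,048,576 bytes is an assumption.

```python
import shutil

# About 50 MB, per the text; 1 MB is assumed to mean 1,048,576 bytes.
REQUIRED_FREE_BYTES = 50 * 1024 * 1024


def has_room_for_new_database(path: str,
                              required: int = REQUIRED_FREE_BYTES) -> bool:
    """Return True if the volume holding `path` has enough free space."""
    return shutil.disk_usage(path).free >= required


print(has_room_for_new_database("."))
```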

The Exchange store uses Extensible Storage Engine (ESE) databases for message storage. Key concepts to focus on when working with the Exchange store and databases are the following:

Which Exchange server data files are used
Which files are associated with databases
How data is stored in Exchange database files

Which Exchange Server Data Files Are Used?

With Exchange Server 2010, Mailbox servers have a single database file for each mailbox or public folder database. Unlike Exchange Server 2003, Exchange Server 2010 does not use a streaming Internet content file with the .stm file extension. Although the .stm file was used to store MIME-formatted messages in Exchange Server 2003 and earlier versions, Exchange Server 2007 and 2010 store all messages and attachments in the primary data file. Exchange doesn't use an .stm file because content conversion is no longer performed on Mailbox servers but is instead performed on Client Access servers. Since Mailbox servers no longer convert the data, they no longer need to store it.

Because attachments are encapsulated and written in binary format, you don't need to convert them to Exchange format. Exchange Server uses a link table within the database to reference the storage location of attachments within it.

Two types of databases are available:

Mailbox databases: Contain mailboxes
Public Folder databases: Contain public folders

Which Files Are Associated with Databases?

As Figure 1 shows, each database has a primary data file and several other types of shared working files and transaction logs.

Figure 1. The Exchange data store has primary data files for each database as well as working files.

These files are used as follows:

Primary data file (Database.edb): A physical database file that holds the contents of the data store. By default, the name of the data file is the same as the name of the associated data store with the .edb file extension added. However, you can rename a database without renaming the database file.
Checkpoint file (E##.chk): A file that tracks the point up to which the transactions in the log file have been committed to the database. (Exchange Server 2010 no longer uses storage groups.) Generally, the name of the checkpoint file is derived from the database prefix.
Temporary data (Tmp.edb): A temporary workspace for processing transactions.
Current log file (E##.log): A file that contains a record of all changes that have yet to be committed to the database. Generally, the name of the log file is derived from the database prefix.
Secondary log files (E##00000001.log, E##00000002.log, …): Additional log files that are used as needed. Up to a billion unique log files can be created for each database.
Reserve log files (E##Res00001.jrs, E##Res00002.jrs, …): Files that are used to reserve space for additional log files if the current log file becomes full.
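
To make the naming scheme concrete, the following sketch classifies file names found in a database folder. The patterns are derived from the names listed above; treating "##" as two decimal digits and the sequence numbers as hexadecimal digits are assumptions, and the function name classify is hypothetical.

```python
import re
from typing import Optional

# Order matters: Tmp.edb must be tested before the generic .edb pattern.
# "E##" is assumed to be "E" plus two decimal digits (for example, E00);
# the sequence digits are assumed to be hexadecimal.
PATTERNS = [
    ("checkpoint", re.compile(r"^E\d{2}\.chk$", re.IGNORECASE)),
    ("current log", re.compile(r"^E\d{2}\.log$", re.IGNORECASE)),
    ("secondary log", re.compile(r"^E\d{2}[0-9A-F]{8}\.log$", re.IGNORECASE)),
    ("reserve log", re.compile(r"^E\d{2}Res[0-9A-F]{5}\.jrs$", re.IGNORECASE)),
    ("temporary data", re.compile(r"^Tmp\.edb$", re.IGNORECASE)),
    ("primary data", re.compile(r"^.+\.edb$", re.IGNORECASE)),
]


def classify(filename: str) -> Optional[str]:
    """Return the role of a database file, or None if the name is unknown."""
    for role, pattern in PATTERNS:
        if pattern.match(filename):
            return role
    return None


for name in ["Sales.edb", "E00.chk", "E00.log", "E0000000001.log",
             "E00Res00001.jrs", "Tmp.edb"]:
    print(name, "->", classify(name))
```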

By default, the primary data file, working files, and transaction logs are all stored in the same location. On a Mailbox server, you'll find these files in a per-database subfolder of the %SystemDrive%\Program Files\Microsoft\Exchange Server\V14\Mailbox folder. Although these are the main files used for the data store, Exchange Server uses other files, depending on the roles for which you have configured the server.

How Is Data Stored in Exchange Database Files?

Exchange uses object-based storage. The primary data file contains several indexed tables, including a data table that contains a record for each object in the data store. Each referenced object can include object containers, such as mailboxes, and any other type of data that is stored in the data store.

Think of the data table as having rows and columns; the intersection of a row and a column is a field. The table's rows correspond to individual instances of an object. The table's columns correspond to folders. The table's fields are populated only if a folder includes stored data. The data stored in fields can be a fixed length or a variable length.

Records in the data table are stored in data pages that have a fixed size of 32 kilobytes (KB), or 32,768 bytes. The 32-KB page size represents a change from the 8-KB data pages used with Exchange Server 2007. This change was made to improve performance.

In an Exchange database, each data page has a page header, data rows, and free space that can contain row offsets. The page header uses the first 96 bytes of each page, leaving 32,672 bytes for data and row offsets. Row offsets indicate the logical order of rows on a page, which means that offset 0 refers to the first row in the index, offset 1 refers to the second row, and so on. If a row contains long, variable-length data, the data might not be stored with the rest of the data for that row. Instead, Exchange can store an 8-byte pointer to the actual data, which is stored in a collection of 32-KB pages that are written contiguously. In this way, an object and all its stored values can be much larger than 32 KB.
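
The page arithmetic above can be worked through in a few lines. This is a rough sketch: it assumes that every page holding part of a long value gives up only the 96-byte header and ignores the space taken by row offsets, so the real page count may be higher. The function name pages_for_long_value is hypothetical.

```python
PAGE_SIZE = 32 * 1024              # 32,768 bytes per data page
PAGE_HEADER = 96                   # bytes used by the page header
USABLE = PAGE_SIZE - PAGE_HEADER   # bytes left for data and row offsets


def pages_for_long_value(length_in_bytes: int) -> int:
    """Pages needed for a long value, rounding up to whole pages."""
    return (length_in_bytes + USABLE - 1) // USABLE


print(USABLE)                         # 32672
print(pages_for_long_value(100_000))  # 4
```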

The current log file has a fixed size of 1 MB. The 1-MB log file size represents a change from the 5-MB log files used with Exchange Server 2003. This change was made so that Exchange Server 2007 could support continuous replication. When this log file fills up, Exchange closes and renames the log file (except when you are using circular logging). The secondary log files are also limited to a fixed size of 1 MB. Exchange uses the reserve log files to reserve disk space for log files that it might need to create. Because several reserve files are already created, this speeds up the transactional logging process when additional logs are needed.
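
The rollover behavior can be illustrated with a small simulation. It is a sketch under several assumptions: the class name LogSet and the E00 prefix are hypothetical, sequence numbers are formatted as eight hexadecimal digits, circular logging is off, and a write is simply split at the file boundary, which glosses over how records are actually laid out.

```python
from typing import List

LOG_FILE_SIZE = 1024 * 1024  # fixed 1-MB log file size, per the text


class LogSet:
    """Toy model of a current log file that is renamed when it fills up."""

    def __init__(self, prefix: str = "E00", size: int = LOG_FILE_SIZE) -> None:
        self.prefix = prefix
        self.size = size
        self.used = 0                 # bytes written to the current log
        self.generation = 1           # sequence number for the next rename
        self.closed: List[str] = []   # names of closed, renamed log files

    def write(self, nbytes: int) -> None:
        while nbytes > 0:
            chunk = min(self.size - self.used, nbytes)
            self.used += chunk
            nbytes -= chunk
            if self.used == self.size:
                # Current log is full: close it and rename it by sequence.
                self.closed.append(f"{self.prefix}{self.generation:08X}.log")
                self.generation += 1
                self.used = 0


log = LogSet()
log.write(2 * 1024 * 1024 + 512 * 1024)  # 2.5 MB of changes
print(log.closed)  # ['E0000000001.log', 'E0000000002.log']
print(log.used)    # 524288
```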

Working with the Exchange Server Message Queues

Exchange Server message queues are temporary holding locations for messages that are waiting to be processed. Two general types of queues are used:

Persistent: Persistent queues are always available even if no messages are waiting to be processed.
Nonpersistent: Nonpersistent queues are available only when messages are waiting to be processed.

With Exchange Server 2010, both Hub Transport and Edge Transport servers store messages waiting to be processed in persistent and nonpersistent queues. Table 1 provides an overview of the queues used. In Exchange Management Console, you can view top-level queues by selecting Toolbox in the left pane and then clicking Queue Viewer.

Table 1. Queues Used with Transport Servers

QUEUE NAME	SERVER ROLE	NUMBER OF QUEUES	QUEUE TYPE
Mailbox delivery	Hub Transport	One for each unique destination Mailbox server	Nonpersistent
Poison message	Hub Transport, Edge Transport	One	Persistent
Remote delivery	Hub Transport	One for each unique remote Active Directory site	Nonpersistent
Remote delivery	Edge Transport	One for each unique destination SMTP domain and smart host	Nonpersistent
Shadow redundancy	Hub Transport, Edge Transport	One for each hop to which the server delivered the primary message	Nonpersistent
Submission	Hub Transport, Edge Transport	One	Persistent
Transport dumpster	Hub Transport, Edge Transport	One for each Active Directory site	Nonpersistent
Unreachable	Hub Transport, Edge Transport	One	Persistent
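
For scripting or documentation purposes, Table 1 can be captured as data and filtered by server role and queue type. The structure below is only an illustration: the names QueueInfo and queues_for are hypothetical, the "number of queues" column is left out, and the two Remote delivery rows are merged into one entry.

```python
from typing import List, NamedTuple, Optional, Tuple

HUB, EDGE = "Hub Transport", "Edge Transport"


class QueueInfo(NamedTuple):
    name: str
    roles: Tuple[str, ...]
    persistent: bool


# Rows transcribed from Table 1 (Remote delivery rows merged).
QUEUES = [
    QueueInfo("Mailbox delivery", (HUB,), False),
    QueueInfo("Poison message", (HUB, EDGE), True),
    QueueInfo("Remote delivery", (HUB, EDGE), False),
    QueueInfo("Shadow redundancy", (HUB, EDGE), False),
    QueueInfo("Submission", (HUB, EDGE), True),
    QueueInfo("Transport dumpster", (HUB, EDGE), False),
    QueueInfo("Unreachable", (HUB, EDGE), True),
]


def queues_for(role: str, persistent: Optional[bool] = None) -> List[str]:
    """List queue names for a role, optionally filtered by queue type."""
    return [
        q.name
        for q in QUEUES
        if role in q.roles and (persistent is None or q.persistent == persistent)
    ]


print(queues_for(EDGE, persistent=True))
# ['Poison message', 'Submission', 'Unreachable']
```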

The transport dumpster was introduced with Exchange Server 2007. The transport dumpster queues messages that are being delivered to recipients whose mailboxes are stored in replicated mailbox databases. When a message has been replicated to all mailbox database copies, the message is removed from the transport dumpster. This ensures that the transport dumpster stores only nonreplicated data.

In addition to the transport dumpster, Exchange Server 2010 implements shadow redundancy for queued messages. In the event of an outage or server failure, this feature works to prevent the loss of messages that are in transit by storing queued messages until the next transport server along the route reports a successful delivery of the message. If the next transport server doesn't report successful delivery, the message is resubmitted for delivery.

Shadow redundancy eliminates the reliance on the state of any specific hub or edge server and eliminates the need for storage hardware redundancy for transport servers. As long as redundant message paths exist in your routing topology, any transport server is replaceable. If a transport server fails, you can remove it and don't have to worry about emptying its queues or losing messages. If you want to upgrade or replace a hub or edge server, you can do so at any time without the risk of losing messages.

Tip

Shadow redundancy uses less bandwidth than creating duplicate copies of messages on multiple servers. The only additional network traffic is the exchange of discard status between transport servers. Discard status indicates when a message is ready to be discarded from the transport database.
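
The hold-until-confirmed behavior of shadow redundancy can be modeled in a few lines. This is a conceptual sketch, not the Exchange implementation: the class ShadowQueue and its method names are hypothetical, and a single next hop is assumed.

```python
from typing import Dict, List


class ShadowQueue:
    """Toy model of one server's shadow copies for a single next hop."""

    def __init__(self) -> None:
        self.held: Dict[str, str] = {}    # message id -> message body
        self.resubmitted: List[str] = []  # ids sent again after a failure

    def relay(self, message_id: str, body: str) -> None:
        # Keep a shadow copy while the next hop handles the message.
        self.held[message_id] = body

    def on_discard_status(self, message_id: str) -> None:
        # The next hop reported successful delivery; drop the shadow copy.
        self.held.pop(message_id, None)

    def on_next_hop_failure(self) -> None:
        # No confirmation arrived: resubmit everything still held.
        self.resubmitted.extend(self.held)
        self.held.clear()


shadow = ShadowQueue()
shadow.relay("m1", "first message")
shadow.relay("m2", "second message")
shadow.on_discard_status("m1")   # m1 was delivered by the next hop
shadow.on_next_hop_failure()     # m2 was never confirmed
print(shadow.resubmitted)        # ['m2']
```

Only the small discard notifications travel back between servers in this model, which mirrors the bandwidth point made in the Tip.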

As Figure 2 shows, the various message queues are all stored in a single database. Like the Exchange store, the message queues database uses the ESE for message storage as well as for data pages.

Figure 2. The Exchange message queues are all stored in a single database.

The database has a single data file associated with it and several other types of working files and transaction logs. These files are used as follows:

Primary data file (Mail.que): A physical database file that holds the contents of all message queues.
Checkpoint file (Trn.chk): A file that tracks the point up to which the transactions in the log file have been committed to the database.
Temporary data (Tmp.edb): A temporary workspace for processing transactions.
Current log file (Trn.log): A log file that contains a record of all changes that have yet to be committed to the database.
Reserve log files (TRNRes00001.jrs, TRNRes00002.jrs, …): Files that are used to reserve space for additional log files if the current log file becomes full.

The facility that manages queuing transactions is the Microsoft Exchange Transport service (MSExchangeTransport.exe). Because logs used with message queues are not continuously replicated, these log files have a fixed size of 5 MB. When the current log file for message queues fills up, Exchange closes the current log file, commits it, and continues using the same named log file. Exchange uses the reserve log files to reserve disk space for log files that might need to be created. Because several reserve files are already created, this speeds up the transactional logging process when additional logs are needed.

By default, the data file, working files, and transaction logs are all stored in the same location. On a Hub Transport or Edge Transport server, you'll find these files in the %SystemDrive%\Program Files\Microsoft\Exchange Server\V14\TransportRoles\data\Queue folder.
