With the shift of mailbox access over to
the Client Access server in Exchange Server 2010, the Mailbox server's
role now essentially encompasses only data storage and retrieval. The
primary focus of troubleshooting Mailbox servers rests on two things:
database replication health and server performance. These aren't the
only things Mailbox servers do, of course, but they're probably the two
most common troubleshooting topics. However, before we get into those,
let's recap some of the standard troubleshooting techniques you should
apply to a Mailbox server before diving into the situation-specific
tasks.
1. General Mailbox Server Health
Although a Mailbox server is essentially useless
without Client Access and Hub Transport servers to provide access and
deliver mail, it's still the most important role in an Exchange
environment. This is, of course, because the data is stored on the
server—in the databases on the associated storage, to be precise. So
when dealing with Mailbox server issues, you'll want to perform the
basic checks we covered in the general troubleshooting section earlier:
From the client workstation, can you ping the Mailbox server by NetBIOS name, FQDN, and IP address?
Are all required Exchange services able to start as necessary?
Do you see any errors in the event log relating to MSExchangeDatabase, MSExchangeDatabase => Instances, or MSExchangeIS Mailbox?
Are there any Active Directory issues that might have a negative impact on Exchange?
The Test-SystemHealth and Test-ServiceHealth
cmdlets are useful for detecting basic problems, such as a dismounted
database or a stopped service. They should be the first two
cmdlets you run when troubleshooting a Mailbox server, simply
because they group together so many common checks.
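The basic checks above can be strung together in the Exchange Management Shell. What follows is a minimal sketch, not a definitive procedure. The server name HNLMBX05 is borrowed from the examples later in this section, and the "MSExchange*" event source filter and the 20-entry limit are assumptions you should adjust for your environment. Note that Test-Connection tests reachability from wherever the shell runs, which is not necessarily the client workstation.

```powershell
# Hypothetical target; substitute your own Mailbox server name.
$server = "HNLMBX05"

# Basic reachability from the machine where this shell is running.
Test-Connection -ComputerName $server -Count 2

# Which required services are running (or not) for each installed role?
Test-ServiceHealth -Server $server |
    Format-List Role, RequiredServicesRunning, ServicesNotRunning

# Recent Exchange-related errors in the Application log.
# The "MSExchange*" source filter is an assumption; narrow it as needed.
Get-EventLog -LogName Application -ComputerName $server -EntryType Error -Source "MSExchange*" -Newest 20 |
    Format-Table TimeGenerated, Source, EventID -AutoSize
```

If Get-EventLog reports that no matches were found, that simply means no error entries matched the filter.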
2. Using Test-MapiConnectivity
Like its close cousin, Test-OutlookConnectivity, Test-MapiConnectivity
helps you diagnose problems accessing a specific mailbox. However,
unlike the other cmdlet (which tests the end-to-end process), Test-MapiConnectivity focuses only on the Mailbox server. It logs into one of three targets:
a specific mailbox (which you specify with the –Identity parameter),
the system mailbox in a specific database (which you specify with –Database), or
the system mailbox in every active database on a server (which you specify with –Server).
The output for all three variants looks like the following:
Test-MapiConnectivity -Server HNLMBX05

MailboxServer    Database              Result     Error
-------------    --------              ------     -----
HNLMBX05         MailboxDatabase...    Success
HNLMBX05         MailBoxDatabase...    Success

Test-MapiConnectivity jmcbee

MailboxServer    Database              Result     Error
-------------    --------              ------     -----
HNLMBX05         MailBoxDatabase...    Success

Test-MapiConnectivity -Database MailboxDatabase-001

MailboxServer    Database              Result     Error
-------------    --------              ------     -----
HNLMBX05         MailBoxDatabase...    Success
This is a useful (and quick) cmdlet for narrowing down the possible scope of a problem. Test-MapiConnectivity
essentially tests not only the Exchange information store but also
ADAccess and RPCs, so a successful test against any mailbox on a server
proves that those three components are at least functioning. If you can
log into the system mailbox for a database, but not into a user mailbox
in that same database, the problem is most likely specific to that
user's mailbox or account.
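The comparison described above can be run as two back-to-back tests. This is only a sketch: the database and mailbox names are taken from the sample output and are assumed to belong together (that is, jmcbee is assumed to be homed in MailboxDatabase-001). Piping to Format-List is a convenience to show the full database name and any error text that the default table view truncates.

```powershell
# Hypothetical names borrowed from the sample output; substitute your own.
$database = "MailboxDatabase-001"
$user     = "jmcbee"

# Step 1: the system mailbox in the database. Is the store itself reachable?
Test-MapiConnectivity -Database $database | Format-List

# Step 2: a user mailbox assumed to live in that same database.
# If step 1 succeeds and this step fails, focus on the user's mailbox.
Test-MapiConnectivity -Identity $user | Format-List
```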
3. Checking Poison Mailboxes
One new feature that might lead to confusion for
users (and more than a few administrators!) is poison mailbox
detection. By default, Mailbox servers will tag any mailbox that causes
a thread in the Exchange Information Store service to crash or that is
connected to five or more "hung" threads. If a mailbox is tagged three
times in two hours, Exchange Server 2010 will block access to that
mailbox for up to six hours or until the administrator unblocks it,
whichever comes first. If a user reports that she cannot connect to a
mailbox, but other users have no difficulty, check to see if there are
any quarantined mailboxes on the server. You can do this either through
Performance Monitor (through the MSExchangeIS Mailbox\Quarantined Mailbox Count performance counter) or through the Get-MailboxStatistics cmdlet. For example, to find out if mailbox JanieN is quarantined, simply use this command:
Get-MailboxStatistics JanieN | Format-List DisplayName, IsQuarantined
Exchange Server 2010 will also write an event to the Application log when it quarantines a mailbox.
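When you do not yet know which mailbox is affected, the same IsQuarantined property can be used to scan a whole server. The following is a sketch under the assumption that HNLMBX05 (a name borrowed from the other examples in this section) is the Mailbox server in question.

```powershell
# Hypothetical server name; substitute your own Mailbox server.
Get-MailboxStatistics -Server "HNLMBX05" |
    Where-Object { $_.IsQuarantined } |
    Format-Table DisplayName, Database, IsQuarantined -AutoSize
```

An empty result means that no mailboxes on that server are currently quarantined.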
4. Checking Database Replication Health
The introduction of continuous replication in
Exchange Server 2007 dramatically changed the face of disaster
recovery, as administrators could deploy two separate copies of a
single database, each on a physically separate server. There were a few
limitations, of course; end users still connected to the server,
not just the database, so problems with the underlying cluster would
render both database copies inaccessible. Standby continuous
replication (introduced in Exchange Server 2007 Service Pack 1)
provided another disaster recovery option, but this had its limits as
well—it was purely manual and, depending on the configuration, would
require at least a setup "trick" (setup /recovercms) or even
wholesale "rehoming" of users. A successful activation of a standby
copy was also heavily dependent on replication of both DNS and Active
Directory information, so users might still be unable to connect even
after the issue was resolved.
Database availability groups (DAGs) in Exchange
Server 2010 provide multiple copies of a single database on different
servers, even in different datacenters, so a single server failure
should have a significantly smaller impact on an Exchange deployment.
Other architectural changes—namely RPC Client Access—effectively hide
the server object from the end user, so the actual location of the
active database is immaterial from the end user's perspective.
Database replication health is, loosely speaking,
a measure of how successfully Exchange keeps database copies in sync. This
depends on server configuration, network health, and a few other things
(most of which Exchange checks automatically as part of the Test-SystemHealth and Test-ServiceHealth cmdlets). However, you can check the health of the replication infrastructure quite easily with two cmdlets. The first cmdlet, Test-ReplicationHealth,
checks the health of the replication services and alerts you to any
errors it finds. The output is extremely easy to read, as shown here:
Test-ReplicationHealth

Server      Check                   Result      Error
------      -----                   ------      -----
HNLMBX05    ClusterService          Passed
HNLMBX05    ReplayService           Passed
HNLMBX05    ActiveManager           Passed
HNLMBX05    TasksRpcListener        Passed
HNLMBX05    TcpListener             Passed
HNLMBX05    DagMembersUp            Passed
HNLMBX05    ClusterNetwork          Passed
HNLMBX05    QuorumGroup             Passed
HNLMBX05    DBCopySuspended         *FAILED*    Failures:...
HNLMBX05    DBCopyFailed            Passed
HNLMBX05    DBInitializing          Passed
HNLMBX05    DBDisconnected          Passed
HNLMBX05    DBLogCopyKeepingUp      Passed
HNLMBX05    DBLogReplayKeepingUp    Passed
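Because the table view truncates the Error column, it can help to show only the checks that did not pass and expand them in list form. This is a sketch with two assumptions: the server name is borrowed from the sample output, and the Result value is assumed to convert to the string "Passed" for successful checks, as it appears in the output above.

```powershell
# Hypothetical server name; substitute a DAG member of your own.
# Assumption: a passing check's Result converts to the string "Passed".
Test-ReplicationHealth -Identity "HNLMBX05" |
    Where-Object { "$($_.Result)" -ne "Passed" } |
    Format-List Server, Check, Result, Error
```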
Once you've validated the replication services, you can check the replication status for the databases themselves with Get-MailboxDatabaseCopyStatus.
You can focus on a particular database by using the –Identity
parameter, or check the status for all mailbox database copies on a
specific server by using –Server. To check
the status of one specific database on one specific server, pass the
copy's name in Database\Server form (for example, MDB001\HNLMBX05) to –Identity. Here is an example of using the Get-MailboxDatabaseCopyStatus cmdlet:
Get-MailboxDatabaseCopyStatus | Format-Table Name,Status,LastInspectedLogTime,ContentIndexState

Name               Status     LastInspectedLogTime     ContentIndexState
----               ------     --------------------     -----------------
MDB001\HNLMBX05    Healthy    11/13/2009 8:44:03 AM    Healthy
MDB002\HNLMBX05    Healthy    11/15/2009 8:03:24 PM    Healthy
MDB003\HNLMBX05    Healthy    11/15/2009 8:12:56 PM    Healthy
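The queue lengths are often a quicker gauge of how far behind a copy is than the timestamps. As a sketch, assuming MDB001 from the sample output is the database of interest, the following lists every copy of that database along with its CopyQueueLength (logs waiting to be copied to the passive copy) and ReplayQueueLength (logs copied but not yet replayed).

```powershell
# MDB001 comes from the sample output above; substitute your own database.
Get-MailboxDatabaseCopyStatus -Identity "MDB001" |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength, ContentIndexState -AutoSize
```

The active copy of a database normally reports a status of Mounted rather than Healthy, so a Mounted row in this output is expected.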
There are many possible causes for replication errors, among them:
With the reduction in functionality,
Mailbox servers have become significantly easier to troubleshoot than
in the past. There are a number of useful cmdlets for validating
mailbox database availability and mailbox access, among them Test-SystemHealth, Get-MailboxStatistics, and Test-MapiConnectivity. Two additional cmdlets, Test-ReplicationHealth and Get-MailboxDatabaseCopyStatus, provide insight into the replication of those databases across member servers in the organization.