Searching mailbox content
The
content index catalogs that are maintained by Search Foundation for
mailbox databases are critical to the ability of Exchange to perform
searches. If the catalogs are unhealthy or not fully populated, search
results will be unpredictable or incomplete. Administrators don’t have
to do anything to configure or manage Search Foundation because it is
installed automatically on every Mailbox server.
Exchange
uses the same content indexes for searches by clients, including
Outlook Web App and Outlook. However, Outlook uses the Exchange content
indexes only when it is configured to work in online mode. Most Outlook
clients are now configured in cached Exchange mode, so they use local
search indexes created with Windows Desktop Search to conduct searches
even when connected to a server.
Items are typically indexed
within a few seconds of their creation on a server. The contents of
attachments are also indexed, with the caveat that some attachments
might fall into the unsearchable category. Indexing throttles back automatically
in periods when Mailbox servers experience heavy load. Again,
administrators don’t have to take any action for this to happen.
Exchange
can search across items stored in mailboxes, archive mailboxes, and the
Recoverable Items structure (and site mailboxes if this feature is
deployed). It cannot search through deleted mailboxes, even if the
mailbox content is still in the database because it hasn’t exceeded the
deleted mailbox retention period. The searching process typically
follows these steps:
Determine the material that needs to be found and identify the criteria that can be used for a search, such as keywords.
Create a mailbox search using the initial criteria.
Perform an estimate (a search without retrieval) to assess the effectiveness of the criteria.
Refine the criteria to generate search results more accurately; iterate until satisfied.
Retrieve information from user mailboxes by using the final search criteria and store it in a discovery mailbox.
Have the retrieved information reviewed by investigators.
Under the hood, Exchange uses the New-MailboxSearch cmdlet to create new
searches and its companions, the Set-MailboxSearch,
Start-MailboxSearch, and Stop-MailboxSearch cmdlets, to set search
criteria, start or resume a previously defined search, and stop a
search. Information is retrieved from the content indexes that Search
Foundation maintains. If Exchange captures information from Lync 2013
conversations such as the details of phone conversations (telephone
numbers of participants and so on), text for instant messaging
conversations, or files that are shared between participants, this
information is discoverable through mailbox searches. The same is true
if SharePoint 2013 and Exchange 2013 are connected to provide the
ability to conduct a single search across mailboxes and SharePoint
sites from the SharePoint eDiscovery portal.
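The search steps listed earlier can be sketched in EMS roughly as follows. This is a minimal sketch rather than a definitive procedure: the search name, source mailbox, queries, and discovery mailbox name are placeholders, and it assumes the account running the commands has been assigned the Discovery Management role.

```powershell
# All names and queries below are hypothetical; adjust them to your environment.

# Create the search as an estimate only (no items are copied).
$search = @{
    Name            = 'Patent Review'
    SourceMailboxes = 'Tony Redmond'
    SearchQuery     = 'patent AND invention'
    EstimateOnly    = $true
}
New-MailboxSearch @search
Start-MailboxSearch -Identity 'Patent Review'

# Searches run asynchronously; check the status and the estimated hit count.
Get-MailboxSearch -Identity 'Patent Review' |
    Format-List Name, Status, ResultNumberEstimate, ResultSizeEstimate

# Refine the criteria and run the estimate again; repeat until satisfied.
# (Restarting a search that already has results asks for confirmation.)
Set-MailboxSearch -Identity 'Patent Review' -SearchQuery 'patent AND (invention OR prototype)'
Start-MailboxSearch -Identity 'Patent Review'

# Switch from estimate to retrieval and copy the results to a discovery mailbox.
Set-MailboxSearch -Identity 'Patent Review' -EstimateOnly $false -TargetMailbox 'Legal Discovery Mailbox'
Start-MailboxSearch -Identity 'Patent Review'
```

Keeping the search in estimate mode until the query is settled avoids copying large volumes of irrelevant items into the discovery mailbox.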
As
mentioned earlier, EAC executes the New-MailboxSearch command when it
needs to search mailboxes. Exchange also supports the Search-Mailbox
command. Although EAC doesn’t use this cmdlet, you can run it in EMS to
perform a search. The big difference between the two cmdlets is that
Search-Mailbox can also delete content it finds, which is a useful
facility when dubious or unwanted material has arrived on a server and
been delivered to several mailboxes. If this happens, you can run
Search-Mailbox to locate the unwanted items and remove them in one
operation. For example, the following command scans all the mailboxes
on a server, looking for items received during a particular period that
contain the term “Great Bargain.” Any matching items are copied to a
discovery mailbox, where they are stored in a folder under the Search
Results root. The items are then deleted from the source mailboxes.
Get-Mailbox -Server ExServer2 | Search-Mailbox -SearchQuery "Received: > $('01/01/2013 00:00:00') AND Received: < $('01/31/2013 23:59:59') AND `"Great Bargain`"" -LogLevel Full -TargetMailbox 'Legal Discovery Mailbox' -TargetFolder 'Search Results' -DeleteContent
The Search-Mailbox command can wreak havoc on mailboxes if it is not
controlled. Before any attempt is made to remove mailbox content, it is
wise to run the command without the DeleteContent parameter so that the
results are reported and copied to a discovery mailbox without any
deletions. (The LogOnly switch is more conservative still: it writes only
a log of the matching items to the discovery mailbox.) When
you are satisfied that the search query locates the right content in
the right mailboxes, you can run the command again, this time including
the DeleteContent parameter, and the content will be removed. Depending
on whether a hold exists on the source mailboxes, some of the information
might be retained. However, this should not be an issue because the
offending information is no longer available to users.
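A cautious dry run might look like the following sketch. It reuses the server and discovery mailbox names from the earlier example and simplifies the query to the phrase alone. The first command only counts matches, and the second writes a log to the discovery mailbox without copying or deleting anything.

```powershell
# Count matching items per mailbox; nothing is copied or deleted.
Get-Mailbox -Server ExServer2 |
    Search-Mailbox -SearchQuery '"Great Bargain"' -EstimateResultOnly

# Write a detailed log of the matches to the discovery mailbox; still no deletions.
Get-Mailbox -Server ExServer2 |
    Search-Mailbox -SearchQuery '"Great Bargain"' `
        -TargetMailbox 'Legal Discovery Mailbox' `
        -TargetFolder 'Search Results' `
        -LogOnly -LogLevel Full
```

Only after reviewing this output would you rerun the command with the DeleteContent parameter.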
The litigation hold mechanism available in Exchange 2010 retains
mailbox data that might be required for litigation. However, a litigation
hold operates on an all-in basis that has a side effect of holding
everything in the mailbox, including information that is not necessary
for legal review. For instance, if you place a mailbox on litigation
hold because the mailbox’s owner is involved in a patent action about a
certain invention, the information the lawyers require is most likely
anything to do with the technology covered by the patent and any
interaction between the inventor and co-inventors or advisors as the
invention evolved from idea to practical implementation. Messages to
the inventor’s aunt asking her to come to tea on a rainy Saturday in
March 2013 are unlikely to add much to the knowledge of the lawyers who
seek to defend or attack the patent, but because Exchange retains
everything in a mailbox when it is on litigation hold, the possibility
exists that the item might be provided to the lawyers for review,
something that drives up legal costs without adding anything of value
to the discovery process.
The
mailbox search mechanism provided in Exchange 2010 is simple to
understand and execute. It is capable of uncovering vast amounts of
information gathered from user mailboxes on an on-demand basis.
However, like many first versions of solutions, these mailbox searches
are a blunt instrument.
A more evolved and comprehensive approach is taken in Exchange 2013, with the goal being to satisfy the following requirements:
Preserve
items on hold immutably. That is, neither users nor a computer process
can take action to alter the information stored in items that are on
hold. To meet this requirement, steps must be taken to ensure that
users cannot delete items from mailboxes that are on hold and that the
Managed Folder Assistant (MFA) cannot remove items either.
Provide
a method to target items that need to be held rather than having to
hold everything in a mailbox. Sometimes it is necessary to hold
everything, but more often, investigators are interested in specific
information.
Preserve items indefinitely or for precise periods.
Allow mailboxes to be governed by different hold conditions arising from multiple investigations.
Make
holds transparent to users; users should not have to alter the way they
work just because it is necessary to hold information in their mailbox.
Make sure that any information that is held is indexed and remains discoverable.
Exchange
2010 satisfies many of these requirements and preserves data immutably.
The biggest advance in Exchange 2013 is the introduction of the
in-place hold mechanism to enable a more granular form of information
preservation and more efficient searches. The idea is that instead of
forcing Exchange to retain everything in a mailbox, you can formulate a
precise query to identify the information in which the lawyers are
really interested, instruct Exchange to hold that information, and
ignore everything else. In-place holds make sure that information is
retained in mailboxes until it is required and that any deletions users
perform are captured so that the information can always be retrieved.
In effect, Exchange 2013 supports three distinct types of mailbox
searches:
Query-based in-place holds. These
searches identify a set of criteria that Exchange uses to retain
information in the mailbox. Items that meet the hold criteria remain in
their normal folders unless an attempt is made to delete them, in which
case Exchange retains the items in a special subfolder of the
Recoverable Items folder. Microsoft presents the close integration of
in-place holds with mailbox searches as a considerable advantage for
Exchange 2013.
In-place hold without criteria. These
searches require Exchange to retain information but don’t set specific
criteria. This forces Exchange to retain everything in a user mailbox.
Regular. These
searches do not require Exchange to hold any information in mailboxes.
They simply search whatever the mailboxes currently contain, using the
criteria provided by the person who creates the search. This is the
kind of search Exchange 2010 supports.
Up to five
in-place holds can be in effect on a mailbox to handle situations when
it is impossible to gather all required material based on one query or
when an individual comes under the aegis of multiple legal actions. You
can access details of the holds that apply to a mailbox through EAC.
The details pane shows the number of holds that apply for the selected
mailbox (Figure 2),
and you can click View Details to have EAC show you the names of the
holds. Whenever multiple holds are in effect for a mailbox, Exchange
combines the queries by using an OR operator to locate matching items.
You can apply more than five in-place holds to a mailbox, but when this
happens, Exchange applies a complete hold to the mailbox to reduce the
complexity of the combined search terms specified by the various holds.
In effect, when you ask Exchange to resolve search term OR search term
OR search term OR search term OR search term OR search term OR search
term, the resulting set of items would probably be the entire mailbox
anyway. The same is true if you put a mailbox in litigation hold
alongside in-place holds—all the information in the mailbox is kept.
The
information EAC shows about in-place holds comes from the mailbox
properties. You can retrieve the same information by using EMS as
follows:
Get-Mailbox -Identity 'Tony Redmond' | Format-List DisplayName, InPlaceHolds
The information in Figure 2
indicates that two holds are in place, each of which is identified by a
GUID that ties to the full information about the hold maintained in the
Exchange configuration data in Active Directory.
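To translate those GUIDs into hold names from EMS, one possible approach is to pass each value to Get-MailboxSearch through its InPlaceHoldIdentity parameter. The sketch below assumes the mailbox name used above and that every listed GUID corresponds to a search that still exists.

```powershell
# Fetch the mailbox and resolve each hold GUID to the search that defines it.
$mailbox = Get-Mailbox -Identity 'Tony Redmond'

foreach ($holdId in $mailbox.InPlaceHolds) {
    Get-MailboxSearch -InPlaceHoldIdentity $holdId |
        Format-List Name, InPlaceHoldEnabled, ItemHoldPeriod, SearchQuery
}
```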
When query-based
searches are used, all existing data that meets the hold criteria, plus
any new data created during the lifetime of the hold that meets the
criteria, are retained. Like retention holds, in-place holds can be
time-based and have start and end dates that control how long
information is kept. Date-controlled holds are useful because they
ensure that data is kept for precise periods and will be deleted
thereafter. Using the earlier patent example, you could create a
date-controlled hold combined with a query to ensure that any items
relating to patent applications are retained. It is also possible to
create a date-controlled hold that retains items for a specific number
of days based on their creation date. Unlike mailboxes under retention
holds, mailboxes under query-based holds are unlikely to run into quota
exhaustion because only certain items are retained rather than everything.
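A query-based, time-limited hold of this kind could be created along the following lines. The hold name, query, date range, and retention period are illustrative values only, and the date strings assume a US-style locale.

```powershell
# Hypothetical hold: keep items mentioning "patent" from 2013 for 365 days each.
$hold = @{
    Name               = 'Patent Hold'
    SourceMailboxes    = 'Tony Redmond'
    SearchQuery        = 'patent'
    StartDate          = '01/01/2013'   # only items dated on or after this date
    EndDate            = '12/31/2013'   # only items dated on or before this date
    InPlaceHoldEnabled = $true
    ItemHoldPeriod     = 365            # days each matching item is held
}
New-MailboxSearch @hold
```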
It is also possible to
emulate the way litigation holds work in Exchange 2010 by creating an
in-place hold that has no query parameters or start and end dates to
force Exchange to hold all data in the target mailboxes indefinitely or
until the in-place hold is removed.
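As a sketch of that approach, the following command creates a hold with no query and no dates. The hold name is a placeholder, and the example assumes that leaving out ItemHoldPeriod results in an unlimited hold period.

```powershell
# Hold everything in the mailbox indefinitely (no query, no date range).
New-MailboxSearch -Name 'Hold Everything' -SourceMailboxes 'Tony Redmond' -InPlaceHoldEnabled $true

# To release the hold later, disable it before removing the search.
Set-MailboxSearch -Identity 'Hold Everything' -InPlaceHoldEnabled $false
Remove-MailboxSearch -Identity 'Hold Everything'
```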
The
Exchange 2010 Exchange Control Panel (ECP) offers a Discovery
Management section from which administrators can create and execute
multimailbox searches. The equivalent place in EAC is the In-Place
eDiscovery And Hold option of the Compliance Management section (Figure 3),
which is where you formulate the queries that underpin in-place holds
and execute search options to estimate, preview, and retrieve
information specified by queries from user mailboxes.
Not all the searches shown in Figure 3
are based on in-place holds; some searches remain similar to the
simpler query-based type Exchange 2010 uses. For instance, if you run
the following command in EMS, you can see which searches use in-place
holds:
Get-MailboxSearch | Format-Table Name, InPlaceHoldEnabled
In addition to the InPlaceHoldEnabled property being set to $True if a
search uses an in-place hold, the ItemHoldPeriod property will be
set to either Unlimited (indicating that items should be retained
indefinitely) or a certain number of days to govern exactly how long
items are retained.
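To list only the searches that place holds, together with their hold periods, a filter such as the following could be used. It assumes the hold duration is exposed under the name ItemHoldPeriod, which is the parameter name New-MailboxSearch uses.

```powershell
# Show only searches with an in-place hold, and how long each holds items.
Get-MailboxSearch |
    Where-Object { $_.InPlaceHoldEnabled } |
    Format-Table Name, ItemHoldPeriod, SourceMailboxes -AutoSize
```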
In-place
holds work for mailboxes on Exchange 2013 servers only, so if you have
some mailboxes on Exchange 2007 or Exchange 2010 servers and need to
conduct a search to satisfy a discovery action, you will have to do one of the following:
Move the mailboxes to an Exchange 2013 server.
Arrange
to conduct searches on the different server platforms and then collate
the results. This is easy enough to do using multimailbox searches for
Exchange 2010 mailboxes; you will need a third-party product to do the
same for Exchange 2007 servers.
From the
preceding, it’s obvious that the best idea is to move any mailbox that
might be involved in a discovery action to Exchange 2010 or Exchange
2013 (preferably) as quickly as possible. Remember that deleted items
are not moved when mailboxes migrate from Exchange 2007 servers because
this version of Exchange does not use the enhanced Recoverable Items
structure found in Exchange 2010 and Exchange 2013.
It is not a
good idea to move mailboxes from Exchange 2013 to earlier versions of
Exchange if they contain information that might be of interest to
searches. As discussed earlier, retention and litigation holds remain
in place only if the mailbox moves to an Exchange 2010 server. In-place
holds are a new feature of Exchange 2013. Mailboxes that move to
Exchange 2010 or Exchange 2007 disappear from searches that use this
feature.