Microsoft Exchange Server 2013 : Preserving information (part 10) - Executing searches with EMS , What Exchange can search

11/21/2014 3:35:44 AM

Executing searches with EMS

EAC is a very convenient interface with which to create and initiate searches, but you can do the same through EMS by using a set of cmdlets that are exposed only if you are a member of the Discovery Management role group. These cmdlets are as follows:

  • New-MailboxSearch. Creates and initiates a new mailbox search.

  • Get-MailboxSearch. Retrieves details of a mailbox search.

  • Set-MailboxSearch. Changes the search criteria for a search that has already been created.

  • Start-MailboxSearch. Restarts a mailbox search.

  • Remove-MailboxSearch. Removes a mailbox search. This action also removes from the discovery mailbox all the items a search finds.

  • Search-Mailbox. An older cmdlet that is now mainly used to search and remove content from user mailboxes. EAC does not use it.

For example, a new search to look for information about potential illegal stock trading by company officers could be initiated with this command:

New-MailboxSearch -Name "Stock Trading Discovery 2" -SourceMailboxes 'Company Officers' `
  -TargetMailbox 'DiscoveryMailbox@contoso.com' -StartDate '10/01/2010' -EndDate '11/30/2010' `
  -SearchQuery "XXE Stock tip" -StatusMailRecipients 'LegalSearch@contoso.com' `
  -SearchDumpster -DoNotIncludeArchive -EstimateOnly -IncludeUnsearchableItems `
  -ExcludeDuplicateMessages:$False -LogLevel Full

Table 1 lists the most important parameters you are likely to use with the New-MailboxSearch cmdlet and their meaning.

Table 1. Important parameters for the New-MailboxSearch cmdlet

Parameter

Meaning

Name

A unique identifier for the search that should be something meaningful such as “Illegal stock trading review.”

SourceMailboxes

Specifies the mailboxes Exchange will search. If you have more than a few mailboxes to search, it is more convenient (and probably more accurate) to create a distribution group to identify the mailboxes to include in the search. If you don’t specify the –SourceMailboxes parameter, Exchange searches all mailboxes.

TargetMailbox

Specifies the SMTP email address of the discovery mailbox in which you want to store the search results. The default discovery mailbox has a rather long and complicated email address, so it helps to assign a shorter secondary email address to the mailbox that is easier to type. In fact, this mailbox doesn’t have to be a discovery mailbox; Exchange is happy to place search results in any mailbox you select.

SearchQuery

A KQL-format query that Exchange executes to locate items in the source mailboxes. In the example, Exchange matches any of the words in the search query. This search query is a very simple one, and some trial and error is probably required to arrive at the best query. If you omit the search query, Exchange finds every item in every mailbox you include in the search and stores copies of all those items in the discovery mailbox. This kind of search can inundate a server with work.

StatusMailRecipients

Tells Exchange the recipients who should be notified by email after the search is complete. No message is sent if you don’t provide a value for this parameter. You can provide one or more recipient SMTP addresses to receive notifications, separating each address with a comma. It’s often more convenient to use a distribution group for this purpose.

SearchDumpster

Forces Exchange to include the contents of the Recoverable Items folder in the search. All searches executed through EAC include this parameter.

DoNotIncludeArchive

Instructs Exchange to ignore items stored in any personal archives that are assigned to mailboxes.

EstimateOnly

Tells Exchange that it is to run a search estimate only rather than copy items that match the search criteria to the discovery mailbox.

ExcludeDuplicateMessages

Tells Exchange how to deal with duplicate items it encounters in mailboxes. Set the parameter to $True to force Exchange to de-duplicate (copy only a single instance of an item) or $False to copy every copy of an item it finds.

LogLevel

Dictates the level of logging Exchange performs for the search. Valid options are Suppress, Basic (default), and Full. If Basic or Full is chosen, Exchange creates a search report in the root folder for the search in the discovery mailbox.
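Once a search such as the one in the example exists, the other cmdlets listed earlier manage the rest of its life cycle. The following is a minimal sketch rather than a prescribed procedure. It assumes the "Stock Trading Discovery 2" search from the example has been created and that the session belongs to a member of the Discovery Management role group; the decision to turn the estimate into a copy search is purely illustrative.

```powershell
# Assumes the "Stock Trading Discovery 2" search from the earlier example exists.
$searchName = 'Stock Trading Discovery 2'

# Review the current state of the search and the results of its estimate.
Get-MailboxSearch -Identity $searchName | Format-List

# If the estimate looks reasonable, switch the search from estimate-only to copy mode.
Set-MailboxSearch -Identity $searchName -EstimateOnly:$false

# Run the search again so that the amended settings take effect.
Start-MailboxSearch -Identity $searchName

# When the investigation is over, remove the search. As noted in the cmdlet list,
# this also removes the items the search copied to the discovery mailbox.
Remove-MailboxSearch -Identity $searchName
```

Running the estimate first and copying items only afterward keeps the discovery mailbox free of results from queries that turn out to be too broad.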

Inside Out: Running concurrent searches

You can run concurrent searches as long as each search has a different name. The searches proceed a little more slowly because of contention when writing found items into the discovery mailbox. If you need to run concurrent searches on an ongoing basis, it would be a good idea to spread the load by creating several discovery mailboxes and locating them in different databases.
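As a rough sketch of how the load might be spread, an additional discovery mailbox can be created in a chosen database with New-Mailbox and its Discovery switch. The mailbox name, user principal name, and database name below are hypothetical placeholders, not values from the text.

```powershell
# Hypothetical names: adjust the mailbox name, UPN, and database to your environment.
New-Mailbox -Name 'Discovery Mailbox 2' -Discovery `
    -UserPrincipalName 'DiscoveryMailbox2@contoso.com' -Database 'DB2'

# List the discovery mailboxes and the databases that host them.
Get-Mailbox -RecipientTypeDetails DiscoveryMailbox |
    Format-Table Name, Database, PrimarySmtpAddress -AutoSize
```

Each concurrent search can then name a different mailbox in its TargetMailbox parameter.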

What Exchange can search

All item types stored in an Exchange mailbox database are discoverable, including voice messages, drafts, attached documents of various formats, and IM conversations (if stored in mailboxes). Exchange 2013 searches depend on Search Foundation to build and maintain content indexes extracted from mailbox databases. Although Search Foundation has no difficulty indexing the complete body text of messages because they are plaintext, rich text format, or HTML, some issues might be encountered with attachments, which can be in any format. Before Search Foundation can include the actual content for an attachment in its indexes rather than simply its metadata (such as the file name or author name), it must be able to extract the content. Search Foundation includes a large number of filters to enable it to deal with the most common file formats, including Microsoft Word, HTML, Microsoft PowerPoint PPTX files, and Adobe PDF. An additional set of IFilters, including those for Microsoft Excel, OneNote, older versions of PowerPoint, and Open Document files, is provided to Search Foundation when Exchange 2013 is installed on a Mailbox server (other IFilters are available from third-party vendors if you need to be able to index a specific format). Between Search Foundation and Exchange, a very large set of file formats can be indexed. To see the full set of searchable file formats on a server, you can run this command:

Get-SearchDocumentFormat | Format-Table Name, Extension, FormatHandler -AutoSize

The Set-SearchDocumentFormat command is also available to change the way Exchange processes particular formats. For example, you can disable indexing of particular types of files through the cmdlet's Enabled parameter, a step that requires some forethought because of its potential impact on indexing and subsequent discovery operations.
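A minimal sketch of such a change follows. The identity value 'zip' is an assumed example; take the real identity from the output of Get-SearchDocumentFormat on your own server before changing anything.

```powershell
# List the formats that are currently enabled for indexing.
Get-SearchDocumentFormat | Where-Object { $_.Enabled } |
    Format-Table Name, Extension, Enabled -AutoSize

# Disable indexing for one format. The identity 'zip' is an assumed example value.
Set-SearchDocumentFormat -Identity 'zip' -Enabled $false
```

Items in a disabled format are no longer content-indexed, so it is worth confirming that no current or expected discovery work depends on that format.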

Even though Search Foundation possesses out-of-the-box capabilities by which it can index the bulk of items encountered in an Exchange environment, it is possible for your company to create content in a format that Search Foundation does not know about. In this case, you must install an IFilter that supports the specific format on all Mailbox servers. Search Foundation then detects and uses the IFilter to include the items in that format in its indexes. If you do not install an IFilter, Search Foundation indexes the metadata for the items to allow searches to proceed, but these items will be deemed unsearchable and returned as such when you execute a search. Apart from application-specific files, other items Exchange deems unsearchable include items encrypted with Secure/Multipurpose Internet Mail Extensions (S/MIME). However, messages protected with Active Directory Rights Management Services (AD RMS) remain searchable for discovery purposes.

When you decide to copy search results to a discovery mailbox, you can include unsearchable items. Normally, it’s a good idea to do this because the person assigned to review the search results might be able to discover what these unsearchable items contain, perhaps by examining the context of where the item was discovered. (An item found in a folder called Videos is likely to contain video content, for instance.)

You can see a list of unsearchable items with the Get-FailedContentIndexDocuments cmdlet. When you run the cmdlet, you can pass it the name of a server to see all items on a server or just a mailbox database to see the unsearchable items in the content index for that database. For example, this command lists various issues that were encountered in a specific database:

Get-FailedContentIndexDocuments -MailboxDatabase DB2
DocID Database Mailbox    Subject Description
----- -------- -------    ------- -----------
77    DB2      SystemMai…         The document parser encountered a processing error.
78    DB2      SystemMai…         The document parser encountered a processing error.
1287  DB2      Rob Young          The document parser encountered a processing error.

You can see that a number of items in Rob Young’s mailbox have had an issue. Exchange assigns each of the items an identifier (DocId), but there’s no way to extract details for a specific item. Instead, you have to run the cmdlet again, this time using the mailbox parameter to restrict the output to just details for Rob Young’s mailbox. To see additional information, pipe the results to the Format-List cmdlet and then redirect the output to a text file you can then interrogate at your leisure to see what you can discover. The command might look like this:

Get-FailedContentIndexDocuments -Identity 'Rob Young' | Format-List > C:\temp\Docs.txt

You can then search through the output text file to see whether anything captured there provides an indication of why an item is unsearchable. For example, you can see in this extract that the parser used to extract the content from an item was unable to complete for some reason and that the item is a Word document.

DocID            : 1667
Database         : DB2
MailboxGuid      : 4e09fc34-e61a-4eea-87b8-d19b214a92ab
Mailbox          : Rob Young
SmtpAddress      : Rob.Young@dublin.contoso.com
Subject          : RE: 2003/2010 coexistence
ErrorCode        : 7
Description      : The document parser encountered a processing error.
AdditionalInfo   : 309003 Document 'exchange://localhost/Attachment/298057e9-43de-417b-a740-7ab58b6e48bb/eb22b1a1-b1c9-4972-a163-ba508f018d6b/919123003011.1/Transitioning Client Access to Exchange 2013.docx' was partially processed. The parser was not able to parse the entire document.
IsPartialIndexed : False
FailedTime       : 01/4/2013 11:27:52
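Reading a long text file entry by entry is slow when a database has many failures. As a sketch, the same output can be grouped on the Description and Mailbox properties shown above to see which errors and which mailboxes dominate. The database name DB2 is taken from the earlier example.

```powershell
# Collect the failed items for the DB2 database once, then summarize them.
$failed = Get-FailedContentIndexDocuments -MailboxDatabase DB2

# Which error descriptions occur most often?
$failed | Group-Object -Property Description |
    Sort-Object -Property Count -Descending |
    Format-Table Count, Name -AutoSize

# Which mailboxes hold the most failed items?
$failed | Group-Object -Property Mailbox |
    Sort-Object -Property Count -Descending |
    Format-Table Count, Name -AutoSize
```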

Should you be worried if many unsearchable items exist for your database? It depends. First, it depends on the percentage of unsearchable items. If 0.0002 percent of items are unsearchable, it’s probably acceptable because any search has a very high chance of discovering information that’s required. Second, it depends on the items that are failing to be indexed. If they are all of the same type and a filter is available, you can install that filter to solve the problem. However, if the items are of a type for which a filter is not available or that is known to be unsearchable (such as S/MIME encrypted items), you might have to live with the situation.
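To put a number on the first question, the percentage is simply the failed item count divided by the total item count. The helper below is a hypothetical illustration and not an Exchange cmdlet, and the counts passed to it are invented. In practice the failed count could come from Get-FailedContentIndexDocuments and the total from summing ItemCount across Get-MailboxStatistics output for the database, which is an assumption about how you choose to count.

```powershell
# Hypothetical helper: returns the percentage of items that are unsearchable.
function Get-UnsearchablePercent {
    param(
        [Parameter(Mandatory)] [long] $FailedCount,
        [Parameter(Mandatory)] [long] $TotalCount
    )
    if ($TotalCount -le 0) { throw 'TotalCount must be greater than zero.' }
    # Round to four decimal places so that very small percentages remain visible.
    [math]::Round(100.0 * $FailedCount / $TotalCount, 4)
}

# Invented counts: 3 failed items out of 1,500,000 gives 0.0002 percent.
Get-UnsearchablePercent -FailedCount 3 -TotalCount 1500000
```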

Inside Out: Controlling the search black box

In most respects, Search Foundation is a black box to Exchange. No user interface is available to control how Exchange interacts with Search Foundation. However, you can affect how Search Foundation operates by making some changes to its parameters held in the system registry (which therefore only apply to an individual server). These settings are found at HKLM\Software\Microsoft\ExchangeServer\V15\Search\SystemParameters and the most important are:

  • MaxAttachmentCount. Default is 10; controls how many attachments to an item will be indexed.

  • MaxAttachmentDepth. Default is 2; controls how many nested levels within a document structure will be indexed.

  • ProcessImages. Default is 0 (zero); controls whether images are indexed.
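The current values can be inspected on a server with the standard registry provider. This is a read-only sketch; the commented-out line shows how a change might look, but the value 20 and the assumption that a plain numeric value is expected are illustrative and should be tested on a lab server first.

```powershell
# Read the Search Foundation parameters on the local server (read-only).
$path = 'HKLM:\Software\Microsoft\ExchangeServer\V15\Search\SystemParameters'
Get-ItemProperty -Path $path |
    Select-Object MaxAttachmentCount, MaxAttachmentDepth, ProcessImages

# Illustrative only: raise the attachment limit. The value 20 is an assumption.
# Set-ItemProperty -Path $path -Name MaxAttachmentCount -Value 20
```

A property that comes back empty has no explicit value in the registry on that server. Because the settings are per server, any change has to be repeated on every Mailbox server where it should apply.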

Normally, a relatively small number of items turn out to be unsearchable. In addition, remember that item metadata (sender, recipients, subject, and so on) and message bodies are always indexed and searchable, so if a small percentage of attachments can’t be searched, it probably won’t be of great concern in a legal search. After all, if people are doing something they shouldn’t, they are likely to leave some trace of their activity in a searchable property that can be discovered. After this happens, the next step is often for investigators to take a complete copy of the suspect’s mailbox to conduct a detailed search to discover what it contains, and any lurking unsearchable items can be reviewed at that time.

Important

An in-place hold depends on the ability of Exchange to understand when an item might satisfy the criteria stated for the hold. Unsearchable items might not expose sufficient information to Exchange for it to assess whether these items should be retained and so create the potential for required items to be removed from mailboxes. If large numbers of unsearchable items are created because of an application you use or other reason, it’s best not to use a query-based hold. Instead, you can create a hold on everything in mailboxes to make absolutely sure that everything that might be required to satisfy a search is available and can be reviewed manually if necessary. This will add a little overhead to the way searches are performed, but it’s the best way to ensure that nothing slips through.
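A hold on everything can be sketched with New-MailboxSearch by enabling the hold and leaving out the search query. The hold name is hypothetical, and 'Company Officers' is the group used in the earlier search example.

```powershell
# Hypothetical hold name; 'Company Officers' is the group from the earlier example.
# No -SearchQuery is given, so the hold is not query-based and covers all items.
New-MailboxSearch -Name 'Company Officers Hold' -SourceMailboxes 'Company Officers' `
    -InPlaceHoldEnabled $true

# Confirm the hold setting on the new search.
Get-MailboxSearch -Identity 'Company Officers Hold' |
    Format-List Name, InPlaceHoldEnabled, SourceMailboxes
```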

Search syntaxes

Exchange 2010 uses AQS (Advanced Query Syntax) to construct its multimailbox searches. Exchange 2013 takes a different approach and uses KQL (Keyword Query Language). Why the change?

AQS is shared with other Windows search components such as Windows Desktop Search, which Outlook clients use. In fact, Exchange 2010 supports only a subset of the full AQS capabilities. However, KQL is shared with other Office 2013 applications, the most important of which is SharePoint 2013 because the two applications can form a single discovery domain across the email stored in Exchange and the documents held in SharePoint.

Giving Exchange and SharePoint a common search syntax makes great sense and is the driving force behind making the change to search syntax in Exchange 2013. Another advantage is gained in that KQL can perform proximity searches. When you want to search for items that mention the words “Azur project” and have the word “bribe” somewhere close to those words, AQS can certainly find anything that includes “Azur project” AND “bribe,” but it can’t find “Azur project” with “bribe” within 30 words. (In KQL syntax, the word “bribe” is NEAR(n=30) the other phrase.) This capability could be very useful in searches that start by being somewhat imprecise because you’re not quite sure about what you’re looking for. It’s true that searches like this might produce more results than you can deal with on a practical basis, but they could provide a hint about how searches might be refined to home in on the critical items. KQL also supports prefix wildcard searches, meaning that you could use a term such as cont* to find items relating to “Contoso.” A wildcard at the start of a term, such as *toso, is not supported.
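The proximity search described above can be sketched as a query string. The helper function is hypothetical and only formats text; it uses the NEAR(n=...) form documented for SharePoint KQL, so it is worth confirming against an estimate-only search that Exchange interprets the query as intended.

```powershell
# Hypothetical helper: builds a KQL proximity expression from a phrase and a term.
function New-KqlProximityQuery {
    param(
        [Parameter(Mandatory)] [string] $Phrase,
        [Parameter(Mandatory)] [string] $Term,
        [int] $Distance = 30
    )
    # Quote the phrase so that KQL treats its words as a single unit.
    '"{0}" NEAR(n={1}) {2}' -f $Phrase, $Distance, $Term
}

$query = New-KqlProximityQuery -Phrase 'Azur project' -Term 'bribe' -Distance 30
$query

# The string could then be passed to an estimate-only search, for example:
# New-MailboxSearch -Name 'Azur Proximity Test' -SearchQuery $query -EstimateOnly
```

Starting with a wide distance and narrowing it between estimates is one way to refine a search that begins imprecisely.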

KQL syntax is powerful. It will be interesting to see how it is used to frame search queries as Exchange 2013 is deployed. Even better, the Exchange community can learn KQL tips and techniques to improve searches from those who work with SharePoint and vice versa.

 