SharePoint 2010 : Planning Your Search Deployment - Performance (part 1) - Performance Reports

12/26/2013 1:30:35 AM

Although performance is rarely a major consideration during an initial software implementation, it is a critical one. Most often, performance is in the back of the administrator's mind, and assumptions about response times are made without considering environmental variables and infrastructure limitations. Administrators are often surprised when pages load slowly, documents take seemingly forever to download, or search results are returned only after several seconds.

Search performance also depends on different factors than content display does, and its requirements are often overlooked. Several areas are specific to search, and giving them proper consideration before deploying SharePoint Search may help avoid user dissatisfaction at a later stage.

In most cases, the target for good search performance is that search result pages finish rendering in less than one second, even under peak search load. A search should take no longer to complete and act on than any regular page load. Also, because users become accustomed to using the result page itself as an information source, or as a starting point from which they can check out, edit, or share documents, search should be seen as a core content display component.

There are two main areas of performance for a search engine:

Query latency: Query latency is the time it takes the query servers to receive a search request from the web server, check the index partitions for the given query terms, gather the result data from the search databases into a result set, and return it to the web server. Additionally, the web server must render this result set in the result page and display it with any special formatting, custom design, or custom Web Parts. Although rendering and display are not strictly part of the query servers' performance, users will not make that distinction when their results arrive more slowly than expected.
A number of factors can contribute to high query latency. These include slow hardware (or, more often, hardware with insufficient resources), a slow network connection, large indexes, and a high volume of queries. Monitoring these factors and pinpointing the bottlenecks can help identify and remedy the source of poor query times.
Indexing speed: How fast the crawl server can gather documents determines how fresh the content returned in search queries is. It also determines how quickly security changes are reflected in the result list. Users may assume that search represents the documents as they are at any given moment and expect the latest version of every document to be searchable. Luckily, SharePoint 2010 has an excellent incremental crawl mechanism that keeps refreshing the documents on schedule without search downtime. However, poor indexing performance could mean that initial crawls take days or even weeks to complete, and poor SharePoint performance in general could make incremental indexing during business hours undesirable. It is always preferable to scale the deployment for better performance and availability than to disable indexing or limit its frequency. Of course, infrastructure and budget limitations must be balanced with business needs.
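
As a rough illustration of why indexing speed matters for initial crawls, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate: the function name, the corpus size of 10 million items, and the sample rates are assumptions chosen for illustration, and a real crawl rarely sustains a constant rate.

```python
def estimate_crawl_hours(item_count: int, items_per_second: float) -> float:
    """Return the estimated hours needed to crawl item_count items.

    Assumes a constant, sustained crawl rate, which real crawls rarely achieve.
    """
    if items_per_second <= 0:
        raise ValueError("items_per_second must be positive")
    return item_count / items_per_second / 3600


if __name__ == "__main__":
    # Hypothetical corpus of 10 million items at three sustained crawl rates.
    for rate in (20, 200, 400):
        hours = estimate_crawl_hours(10_000_000, rate)
        print(f"{rate:>4} items/s -> {hours:7.1f} hours ({hours / 24:.1f} days)")
```

Under these assumptions, a crawler that sustains only a few dozen items per second needs days for the initial crawl, while one in the hundreds of items per second finishes within a day.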

1. Performance Reports

Measuring performance in SharePoint 2010 has been made much easier with the addition of SharePoint health reports. The reports are in the Administrative Report Library, which can be found in Central Administration under Manage Service Applications => Search Service Application => Search Administration => Administrative Reports. There are several reports to help diagnose performance issues. The key performance reports, and the best starting points for investigating performance issues, are as follows:

Query Latency: Query latency is the total time it takes for queries to be processed and returned to the web servers. The Query Latency report shows how long queries take to be returned. Not all queries are equal: queries for more common terms, which produce larger result sets, generally take longer to process, assemble, and return. The report breaks the total down by area, showing how long the query spends in server rendering, in the object model, and on the back end (databases). More information can be found in the reporting section. See Figure 1.

Figure 1. Query Latency report
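
The breakdown in the Query Latency report can be read as a simple question: which area accounts for most of the total time? The short Python sketch below shows that reasoning. The function name and the millisecond values are hypothetical and are not taken from an actual report; they would be typed in by hand from the figures the report displays.

```python
def dominant_stage(stage_ms: dict) -> tuple:
    """Return (stage name, share of total latency) for the slowest stage.

    stage_ms maps a stage name to its average latency in milliseconds.
    The share is a fraction between 0 and 1.
    """
    total = sum(stage_ms.values())
    if total <= 0:
        raise ValueError("total latency must be positive")
    name = max(stage_ms, key=stage_ms.get)
    return name, stage_ms[name] / total


if __name__ == "__main__":
    # Hypothetical averages read off the report, in milliseconds.
    sample = {"server rendering": 180.0, "object model": 120.0, "back end": 500.0}
    stage, share = dominant_stage(sample)
    print(f"{stage}: {share:.1%} of {sum(sample.values()):.0f} ms total")
```

If the back end dominates, the databases are the first place to look; if server rendering dominates, custom Web Parts or page design are more likely candidates.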

Query Latency Trend: Latency is not static, so seeing how performance improves or declines over time can be useful. It is also wise to relate any decline to other factors that may be affecting it, such as index size or crawling behavior, as well as other demands on key components, such as shared database servers. The Query Latency Trend report shows query latency over a given time period, broken down by percentile. The administrator can see the latency at the 95th, 90th, 80th, 70th, 50th, and 5th percentiles of all queries, which shows both how quickly typical queries are returned and how slow the slowest queries are. The crawl rate for the same period is shown alongside for comparison. This allows the administrator to determine whether sub-second results are being achieved for the majority of queries. A best practice is to keep the 90th percentile of searches below 1,000 milliseconds (1 second), and the crawler should return between 200 and 400 items per second.
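
To make the percentile figures concrete, the following Python sketch computes a percentile from a list of query latencies and checks it against the one-second target. It uses the nearest-rank method, which is an assumption made here for simplicity; the report may calculate its percentiles differently. The function names and sample values are hypothetical.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_target(latencies_ms, pct=90, target_ms=1000):
    """Return True if the pct-th percentile latency is below target_ms."""
    return percentile(latencies_ms, pct) < target_ms


if __name__ == "__main__":
    # Hypothetical query latencies in milliseconds.
    latencies = [120, 150, 180, 210, 260, 340, 450, 620, 900, 1800]
    for pct in (50, 90, 95):
        print(f"{pct}th percentile: {percentile(latencies, pct)} ms")
    print("90th percentile under 1 second:", meets_target(latencies))
```

A single slow query, such as the last sample here, can push the 95th percentile well over the target while the 90th percentile still meets it, which is why the report shows several percentiles rather than one average.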

Crawl Rate per Content Source: The crawl rates can be seen in this report and will give an indication of how fast the crawler is collecting content. Although it will probably collect information fast enough for most organizations, the crawler can have issues with particular documents or at particular times, and this data can be seen here. See Figure 2.

Figure 2. Crawl Rate per Content Source report

Crawl Rate per Type: Some areas take more resources to crawl and the Crawl Rate per Type report can show this information. This information may be unique to an organization with a lot of document modification or a heavy check-in/check-out rate. See Figure 3.

Figure 3. Crawl Rate per Type report
