Although performance is usually not a
major consideration during initial software implementations, it is a
critical aspect of any deployment. Most often, performance is in the back of
the administrator's mind, and assumptions about response times
are made without considering environmental variables and
infrastructure limitations. Administrators are often surprised when
pages load slowly, documents take seemingly forever to download, or
searches are returned only after several seconds.
Search performance also depends on
different factors than content display does, and its requirements
are often overlooked. Search has its own areas that need
attention in order to perform well, and giving them proper consideration
before deploying SharePoint Search may help avoid user
dissatisfaction at a later stage.
In most cases, the target for good search
performance is that search result pages, under peak search load,
finish rendering in less than one second. A search should take no
longer to complete than loading any regular page. Also, because
users become accustomed to using the result page itself as
an information source, and as a place from which they can
check out, edit, or share documents, search should be
seen as a core content display component.
There are two main areas of performance for a search engine:
- Query latency: Query latency is the time it takes the query
servers to receive a search request from the web server, check the
index partitions for the given query terms, gather the result data from
the search databases into a result set, and return it to the web
server. Additionally, the web server must render this result set in the
result page and display it with any special formatting, custom design,
or custom Web Parts. Although the rendering and display mechanism is
not strictly part of the query servers' performance, users will not
differentiate when their result set is returned slower than expected.
A number of factors can contribute to high query
latency. These include slow hardware or, more often, hardware with
insufficient resources, a slow network connection, large indexes, and a
high volume of queries. Monitoring these factors and pinpointing the
bottlenecks can help determine and remedy the source of poor query
times.
- Indexing speed: How fast the crawl server can gather
documents determines how fresh the content returned in search queries
is. It also determines whether security changes are reflected in the result
list. Users may assume that search reflects the current state of
the documents at any moment and expect the latest version of each
document to be searchable. Luckily, SharePoint 2010 has an excellent
incremental crawl mechanism that will keep refreshing the documents on
schedule without search downtime. However, poor performance on search
indexing could mean that initial crawls take days to weeks to complete.
Poor SharePoint performance in general can make incremental
indexing during business hours undesirable. It is always
preferable to scale the deployment for better performance and
availability rather than to disable or limit indexing frequency. Of
course, infrastructure and budget limitations must be balanced with
business needs.
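The relationship between crawl speed and initial crawl time is simple arithmetic, and a rough estimate can help set expectations before deployment. The following is a minimal sketch, not a SharePoint API: the function name, the corpus size, and the crawl rates are hypothetical values chosen for illustration, and it assumes the crawl rate stays constant for the whole crawl.

```python
SECONDS_PER_DAY = 86_400


def estimate_crawl_days(item_count: int, items_per_second: float) -> float:
    """Estimate how many days a full crawl takes at a constant crawl rate."""
    if items_per_second <= 0:
        raise ValueError("items_per_second must be positive")
    return item_count / items_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical corpus of 10 million items at three sustained crawl rates.
    for rate in (10, 50, 300):
        days = estimate_crawl_days(10_000_000, rate)
        print(f"{rate:>4} items/s -> {days:.1f} days")
```

Under these assumed figures, the same corpus takes well over a week at 10 items per second and less than half a day at 300 items per second, which is why crawl throughput deserves attention during capacity planning.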
1. Performance Reports
Measuring performance in SharePoint 2010 has
been made much easier with the addition of SharePoint health reports.
The reports are in the Administrative Report Library, which can be
found in Central Administration under Manage Service Applications => Search Service Application => Search Administration =>
Administrative Reports. Several reports help diagnose
performance issues. The key reports, which are the starting point for
investigating performance issues, are as follows:
Query Latency: Query latency is the
total time it takes for queries to be processed and returned to the web
servers. The Query Latency report shows how long queries take to be
returned. Not all queries are equal: queries for more common terms
with larger result sets generally take longer to process, assemble
into result sets, and return. The Query Latency report breaks down
how long a query spends in each area:
server rendering, the object model, and the back end
(databases). More information can be found in the reporting section.
See Figure 1.
Figure 1. Query Latency report
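Once the per-area figures have been read off the report, finding the dominant area is a small calculation. The sketch below is illustrative only: the class, field names, and sample numbers are hypothetical, the values would be entered by hand from the report, and it assumes the three areas can be treated as additive parts of the total latency.

```python
from dataclasses import dataclass


@dataclass
class LatencySample:
    """Time spent in each area for one reporting interval (milliseconds)."""
    server_rendering_ms: float
    object_model_ms: float
    backend_ms: float

    @property
    def total_ms(self) -> float:
        return self.server_rendering_ms + self.object_model_ms + self.backend_ms


def slowest_area(sample: LatencySample) -> tuple[str, float]:
    """Return the area with the highest latency and its share of the total."""
    if sample.total_ms <= 0:
        raise ValueError("sample has no recorded latency")
    areas = {
        "server rendering": sample.server_rendering_ms,
        "object model": sample.object_model_ms,
        "back end": sample.backend_ms,
    }
    name = max(areas, key=areas.get)
    return name, areas[name] / sample.total_ms


if __name__ == "__main__":
    # Hypothetical values, not taken from a real report.
    sample = LatencySample(server_rendering_ms=120, object_model_ms=80, backend_ms=600)
    area, share = slowest_area(sample)
    print(f"Slowest area: {area} ({share:.0%} of {sample.total_ms:.0f} ms)")
```

In this made-up sample the back end accounts for three quarters of the total, which would point the investigation toward the database servers rather than the web front end.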
Query Latency Trend: Latency is not
static, so seeing how performance improves or
declines over time can be useful. It is also
wise to correlate any decline in performance with other factors that
may be affecting it, such as index size or crawling behavior, as well
as other demands on key components, such as shared database servers.
The Query Latency Trend report shows query latency over a given
time period, broken down by percentile. The administrator can see the
latency at the 95th, 90th, 80th, 70th, 50th, and 5th percentiles
of all queries. This shows how quickly typical queries are returned
and how much slower the slowest queries are. The crawl
rate for the same periods is shown for comparison. This
allows the administrator to determine whether sub-second results are being
achieved for the majority of queries. A best practice is to keep the
90th percentile of query latency below 1,000 milliseconds (1 second) and
the crawl rate between 200 and 400 items per second.
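To make the percentile target concrete, the following sketch checks a list of query latencies against the 90th-percentile, 1,000-millisecond guideline. It is an illustration under stated assumptions: the function names and sample latencies are hypothetical, and the nearest-rank percentile method used here is a common convention, not necessarily the method the report itself uses.

```python
import math


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest value such that at least
    pct percent of the samples are less than or equal to it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def meets_latency_target(latencies_ms: list[float],
                         pct: float = 90,
                         target_ms: float = 1000) -> bool:
    """True if the given percentile of query latency is below the target."""
    return percentile(latencies_ms, pct) < target_ms


if __name__ == "__main__":
    # Hypothetical query latencies in milliseconds.
    latencies = [180, 220, 250, 300, 340, 410, 520, 640, 900, 2400]
    for pct in (95, 90, 80, 70, 50, 5):
        print(f"{pct:>2}th percentile: {percentile(latencies, pct)} ms")
    print("Target met:", meets_latency_target(latencies))
```

With this sample, the 90th percentile is 900 ms, so the guideline is met even though one query took 2,400 ms. That is the point of a percentile target: a single slow outlier does not by itself indicate a problem.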
Crawl Rate per Content Source: The crawl
rates can be seen in this report and will give an indication of how
fast the crawler is collecting content. Although it will probably
collect information fast enough for most organizations, the crawler can
have issues with particular documents or at particular times, and this
data can be seen here. See Figure 2.
Figure 2. Crawl Rate per Content Source report
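The per-source crawl rate is the number of items crawled divided by the time spent crawling them. As a rough sketch of that calculation, assuming crawl figures have been exported or noted by hand as (content source, items crawled, elapsed seconds) records, the following groups them by content source. The record layout, function name, and sample values are hypothetical and do not come from a SharePoint API.

```python
from collections import defaultdict


def crawl_rate_per_source(records):
    """Aggregate (source, items_crawled, elapsed_seconds) records into
    an items-per-second rate for each content source."""
    items = defaultdict(int)
    seconds = defaultdict(float)
    for source, count, elapsed in records:
        items[source] += count
        seconds[source] += elapsed
    # Skip sources with no recorded crawl time to avoid dividing by zero.
    return {s: items[s] / seconds[s] for s in items if seconds[s] > 0}


if __name__ == "__main__":
    # Hypothetical crawl records, not taken from a real crawl log.
    records = [
        ("Intranet", 36_000, 120),
        ("File Shares", 9_000, 300),
        ("Intranet", 24_000, 80),
    ]
    for source, rate in crawl_rate_per_source(records).items():
        print(f"{source}: {rate:.0f} items/s")
```

A content source whose rate is far below the others, as "File Shares" is in this made-up data, is a reasonable place to start looking for problem documents or slow hosts.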
Crawl Rate per Type: Some types of crawled items take
more resources to crawl than others, and the Crawl Rate per Type report shows
this. The pattern will be specific to each organization, for example one
with a lot of document modification or a heavy check-in/check-out rate.
See Figure 3.
Figure 3. Crawl Rate per Type report