Now that you have an understanding of table and index
structures and the overhead required to maintain your data and indexes,
you can put that knowledge into practice by coming up with an
index design for your database, defining the appropriate indexes to
support your queries. To determine which indexes should be created,
you need to know whether they will actually be used by the SQL Server
Query Optimizer. An index that is not used effectively just wastes space
and creates unnecessary overhead during updates.
The main criterion to remember is that SQL Server
does not use an index for the more efficient row locator lookup unless at
least the first column of the index is included in a valid search
argument (SARG) or join clause. Keep this point in mind when
choosing the column order for composite indexes. For example, consider
the following index on the stores table in the bigpubs2008 database:
create index nc1_stores on stores (city, state, zip)
Each of the following queries could use the index because they include the first column, city, of the index as part of the SARG:
select stor_name from stores
where city = 'Frederick'
and state = 'MD'
and zip = '21702'
select stor_name from stores
where city = 'Frederick'
and state = 'MD'
select stor_name from stores
where city = 'Frederick'
and zip = '21702'
However, the following queries do not use the index for a row locator lookup because they don’t specify the city column as a SARG:
select stor_name from stores
where state = 'MD'
and zip = '21702'
select stor_name from stores
where zip = '21702'
For the index nc1_stores to be used for a row locator lookup in the last query, you would have to reorder the columns so that zip is first—but then the index wouldn’t be useful for any queries specifying only city and/or state. Satisfying all the preceding queries in this case would require additional indexes on the stores table.
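For example, a single additional index with zip as the leading column could provide a row locator lookup for both of the queries that omit city (the index name nc2_stores and the choice of (zip, state) are just a sketch of one possible design, not a definitive recommendation):
create index nc2_stores on stores (zip, state)
With zip as the first column, both the query on state and zip and the query on zip alone include the leading index column in a SARG, so the index can support an index seek. Whether the Query Optimizer actually chooses it still depends on factors such as the selectivity of the values and the size of the table.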
Note
For the two preceding queries, if you were to display the execution plan information, you might see that the queries actually use the nc1_stores
index to retrieve the result set. However, if you look closely, you can
see the queries are not using the index in the most efficient manner;
the index is being used to perform an index scan rather than an index
seek. An index seek is what we are really after. In an index seek,
SQL Server searches for the specific SARG by walking the index tree
from the root level down to the specific row(s) with matching index key
values and then uses the row locator value stored in the index key to
directly retrieve the matching row(s) from the data page(s); the row
locator is either a specific row identifier or the clustered key value
for the row.
For an index scan,
SQL Server searches all the rows in the leaf level of the index, looking
for possible matches. If any are found, it then uses the row locator to
retrieve the data row.
Although both seeks and scans use an index, an index
scan is still more expensive in terms of I/O than an index seek, but it is
slightly less expensive than a table scan, which is why the Query Optimizer uses it.
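One way to verify which access method a query is getting is to display the textual showplan for the query before it runs (this sketch assumes you are running the statements as separate batches in SQL Server Management Studio or sqlcmd; you could also simply view the graphical execution plan):
set showplan_text on
go
select stor_name from stores
where zip = '21702'
go
set showplan_text off
go
The showplan output includes the operator the Query Optimizer chose, so you can check whether the plan contains an Index Seek, an Index Scan, or a Table Scan on the stores table.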
You might think that the easy solution for getting row
locator lookups on all possible columns is to index every column on a
table so that any search criteria specified for a query can be
helped by an index. This strategy might be somewhat appropriate in a
read-only decision support system (DSS) environment that supports ad hoc
queries, but even there, many of the indexes probably
still wouldn't be used. Just because an index is defined on a column
doesn't mean that the Query Optimizer is necessarily going to use
it if the search criteria are not selective enough. Also, creating that
many indexes on a large table could take up a significant amount of
space in the database, increasing the time required to back up and run DBCC
checks on the database. As mentioned earlier, too many indexes on a
table in an OLTP environment can generate a significant amount of
overhead during inserts, updates, and deletes and have a detrimental
impact on performance.
Tip
A common design mistake is defining too many
indexes on tables in OLTP environments. In many cases, some of
the indexes are redundant or are never even considered by the SQL Server
Query Optimizer when processing the queries used by the applications. These
indexes end up simply wasting space and adding unnecessary overhead to
data updates.
A
case in point was one client who had eight indexes defined on a table,
four of which had the same column, a unique key, as the first
column in the index. That column was included in the WHERE clauses for all queries and updates performed on the table. Only one of those four indexes was ever used.
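One way to spot indexes like these is to compare the indexes on a table against the sys.dm_db_index_usage_stats dynamic management view (a sketch, assuming the instance has been running long enough since its last restart to have gathered representative activity, because these counters are reset when SQL Server restarts):
select i.name as index_name,
       us.user_seeks, us.user_scans, us.user_lookups, us.user_updates
from sys.indexes i
left join sys.dm_db_index_usage_stats us
       on us.object_id = i.object_id
      and us.index_id = i.index_id
      and us.database_id = db_id()
where i.object_id = object_id('stores')
Indexes that show no seeks, scans, or lookups but a large number of updates are doing nothing for query performance while still paying the maintenance cost on every insert, update, and delete.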