Now that you have an understanding of table and index
structures and the overhead required to maintain your data and indexes,
you can put that knowledge into practice by coming up with an
index design for your database, defining the appropriate indexes to
support your queries. To determine which indexes should be created,
you need to know whether they will actually be used by the SQL Server
Query Optimizer. An index that is not used effectively just wastes space
and creates unnecessary overhead during updates.
The main criterion to remember is that SQL Server
does not use an index for the more efficient row locator lookup unless at
least the first column of the index is included in a valid search
argument (SARG) or join clause. Keep this point in mind when
choosing the column order for composite indexes. For example, consider
the following index on the stores table in the bigpubs2008 database:
create index nc1_stores on stores (city, state, zip)
Each of the following queries could use the index because they include the first column, city, of the index as part of the SARG:
select stor_name from stores
where city = 'Frederick'
and state = 'MD'
and zip = '21702'
select stor_name from stores
where city = 'Frederick'
and state = 'MD'
select stor_name from stores
where city = 'Frederick'
and zip = '21702'
However, the following queries do not use the index for a row locator lookup because they don’t specify the city column as a SARG:
select stor_name from stores
where state = 'MD'
and zip = '21702'
select stor_name from stores
where zip = '21702'
For the index nc1_stores to be used for a row locator lookup in the last query, you would have to reorder the columns so that zip is first—but then the index wouldn’t be useful for any queries specifying only city and/or state. Satisfying all the preceding queries in this case would require additional indexes on the stores table.
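For example, a single additional index with zip as the leading column could provide a row locator lookup for both of the queries that omit city (the index name nc2_stores and the choice of (zip, state) are just a sketch of one possible design, not a definitive recommendation):
create index nc2_stores on stores (zip, state)
With zip as the first column, both the query on state and zip and the query on zip alone include the leading index column in a SARG, so the index can support an index seek. Whether the Query Optimizer actually chooses it still depends on factors such as the selectivity of the values and the size of the table.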
Note
For the two preceding queries, if you were to display the execution plan information, you might see that the queries actually use the nc1_stores
index to retrieve the result set. However, if you look closely, you can
see the queries are not using the index in the most efficient manner;
the index is being used to perform an index scan rather than an index
seek. An index seek is what we are really after. In an index seek,
SQL Server searches for the specific SARG by walking the index tree
from the root level down to the specific row(s) with matching index key
values and then uses the row locator value stored in the index key to
directly retrieve the matching row(s) from the data page(s); the row
locator is either a specific row identifier or the clustered key value
for the row.
For an index scan,
SQL Server searches all the rows in the leaf level of the index, looking
for possible matches. If any are found, it then uses the row locator to
retrieve the data row.
Although both seeks and scans use an index, an index
scan is still more expensive in terms of I/O than an index seek, but it is
slightly less expensive than a table scan, which is why the Query Optimizer uses it.
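One way to verify which access method a query is getting is to display the textual showplan for the query before it runs (this sketch assumes you are running the statements as separate batches in SQL Server Management Studio or sqlcmd; you could also simply view the graphical execution plan):
set showplan_text on
go
select stor_name from stores
where zip = '21702'
go
set showplan_text off
go
The showplan output includes the operator the Query Optimizer chose, so you can check whether the plan contains an Index Seek, an Index Scan, or a Table Scan on the stores table.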
You might think that the easy solution for getting row
locator lookups on all possible columns is to index every column on a
table so that any search criteria specified for a query can be
helped by an index. This strategy might be somewhat appropriate in a
read-only decision support system (DSS) environment that supports ad hoc
queries, but even there, many of the indexes probably
still wouldn't be used. Just because an index is defined on a column
doesn't mean that the Query Optimizer is necessarily going to use
it if the search criteria are not selective enough. Also, creating that
many indexes on a large table could take up a significant amount of
space in the database, increasing the time required to back up and run DBCC
checks on the database. As mentioned earlier, too many indexes on a
table in an OLTP environment can generate a significant amount of
overhead during inserts, updates, and deletes and have a detrimental
impact on performance.
Tip
A common design mistake is defining too many
indexes on tables in OLTP environments. In many cases, some of
the indexes are redundant or are never even considered by the SQL Server
Query Optimizer when processing the queries used by the applications. These
indexes end up simply wasting space and adding unnecessary overhead to
data updates.
A
case in point was one client who had eight indexes defined on a table,
four of which had the same column, a unique key, as the first
column in the index. That column was included in the WHERE clauses for all queries and updates performed on the table. Only one of those four indexes was ever used.
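One way to spot indexes like these is to compare the indexes on a table against the sys.dm_db_index_usage_stats dynamic management view (a sketch, assuming the instance has been running long enough since its last restart to have gathered representative activity, because these counters are reset when SQL Server restarts):
select i.name as index_name,
       us.user_seeks, us.user_scans, us.user_lookups, us.user_updates
from sys.indexes i
left join sys.dm_db_index_usage_stats us
       on us.object_id = i.object_id
      and us.index_id = i.index_id
      and us.database_id = db_id()
where i.object_id = object_id('stores')
Indexes that show no seeks, scans, or lookups but a large number of updates are doing nothing for query performance while still paying the maintenance cost on every insert, update, and delete.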