Query Path 9 — Filter by Unordered Composite Index
One common indexing myth is that the
order of the index key columns doesn't matter; that is, SQL Server can
use an index as long as the column appears anywhere in the index. Like most
myths, it's a half-truth.
Searching an index requires the leading index key
column to be present in the search predicate. Searching for col1, col2
works great when the index includes col1 as the leading index key with
col2 following it. However, searching solely for col2 without col1 in
the predicate requires scanning all the leaf-level data if no other
suitable index is present.
Query Path 9 demonstrates the inefficiency of filtering on an index column that is not the leading index key column.
StartDate is the second key in the composite index, so the data is there. Will the query use the index?
SELECT WorkOrderID
FROM Production.WorkOrder
WHERE StartDate = '2006-01-04';
The query optimizer chooses the IX_WorkOrder_ProductID composite nonclustered index, as shown in Figure 12,
because it's narrower than the clustered index, so more rows fit on a
page. Because the filter is on the second key column, however, SQL Server
can't seek into the index; instead it is forced to scan every row of the
index and apply the filter (within the scan operation) to select the
matching rows. Essentially, it's doing the same operation as manually
scanning a telephone book for everyone with a first name of Tim.
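As a minimal sketch of the alternative, the following adds an index whose leading key is StartDate and reruns the query so the two plans can be compared. The index name IX_WorkOrder_StartDate_Demo is hypothetical, and the sketch assumes WorkOrderID is the clustered index key (so the nonclustered index covers the query). Whether the optimizer actually chooses a seek depends on the data and statistics, so check the execution plan and the logical reads rather than assuming the result.

```sql
-- Hypothetical demo index: StartDate is the leading (and only) key column.
CREATE NONCLUSTERED INDEX IX_WorkOrder_StartDate_Demo
    ON Production.WorkOrder (StartDate);

-- Report logical reads so the scan and seek versions can be compared.
SET STATISTICS IO ON;

SELECT WorkOrderID
FROM Production.WorkOrder
WHERE StartDate = '2006-01-04';

SET STATISTICS IO OFF;

-- Remove the demo index so later examples see the original indexes.
DROP INDEX IX_WorkOrder_StartDate_Demo ON Production.WorkOrder;
```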
Query Path 10 — Non-SARG-Able Expressions
SQL Server's Query Optimizer examines the conditions within the query's predicates to determine which indexes are useful. If SQL Server can use an index to optimize a condition, such as one in a WHERE clause, the condition is referred to as a search argument (SARG). However, not every condition is “SARG-able.”
The final query path walks through a series of antipatterns: WHERE
clauses with conditions that can't use index seek operations for one or
more reasons. The result is an index scan where an index seek would be
more advantageous. The following is a list of common types of
“non-SARG-able” expressions:
- Including the table search column in an expression forces SQL
Server to evaluate the outcome of the expression for every row before
it can determine if the row passes the WHERE clause criteria:
SELECT WorkOrderID
FROM Production.WorkOrder
WHERE ProductID + 2 = 759;
- The solution to this non-SARG-able issue is to rewrite the query so
that the expression is no longer dependent on the table column.
SELECT WorkOrderID
FROM Production.WorkOrder
WHERE ProductID = 759 - 2;
- Multiple inclusive criteria are typically SARG-able; however, the
optimizer may have a more difficult time creating a seekable plan with
criteria composed of OR logic.
SELECT WorkOrderID, StartDate
FROM Production.WorkOrder
WHERE ProductID = 757
OR StartDate = '2006-01-04';
- Negative search conditions (<>, !>, !<, NOT EXISTS, NOT IN, NOT LIKE) are not easily optimized. It's easy to prove that a row exists, but proving that it doesn't exist requires examining every row.
SELECT WorkOrderID, StartDate
FROM Production.WorkOrder
WHERE ProductID NOT IN (400, 800, 950);
- It is possible that exclusive criteria can be SARG-able, so it's
worth testing. Often, it's the number of rows returned that forces a
scan, not the exclusive criteria.
- Search predicates that begin with wildcards aren't SARG-able. An index can quickly locate WorkOrderID = 757, but it must scan every row to find any WorkOrderID values ending in 7:
SELECT WorkOrderID, StartDate
FROM Production.WorkOrder
WHERE WorkOrderID LIKE '%7';
- If the predicate applies a
function, such as a string function, to the column, a scan is required so
that every row can be evaluated with the function before the final
criterion is applied to the function output.
SELECT WorkOrderID, StartDate
FROM Production.WorkOrder
WHERE DATENAME(dw, StartDate) = 'Monday';
SQL Server 2008 does include some optimizations
that can avoid the scan when the predicate includes conversions
involving the date data type.
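The following is a rough sketch of how two of the patterns above might be rewritten so that the column stands alone on one side of each comparison. These rewrites are illustrations, not the only options. A seek on StartDate assumes an index whose leading key is StartDate, which may not exist in the sample database, and the optimizer may still choose a scan when many rows qualify, so compare the execution plans before adopting either form.

```sql
-- OR across two columns, rewritten as two independently seekable branches.
-- UNION (not UNION ALL) removes rows that satisfy both conditions; because
-- WorkOrderID is unique per row, the result matches the original OR query.
SELECT WorkOrderID, StartDate
FROM Production.WorkOrder
WHERE ProductID = 757
UNION
SELECT WorkOrderID, StartDate
FROM Production.WorkOrder
WHERE StartDate = '2006-01-04';

-- Function applied to the column: every row must be evaluated.
SELECT WorkOrderID, StartDate
FROM Production.WorkOrder
WHERE YEAR(StartDate) = 2006;

-- Equivalent half-open date range on the bare column, which leaves
-- the predicate usable as a search argument.
SELECT WorkOrderID, StartDate
FROM Production.WorkOrder
WHERE StartDate >= '20060101'
  AND StartDate <  '20070101';
```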
Note
The type of access (index scan
versus index seek) affects not only the performance of reading data
from a single table, but also join performance. The type
of join chosen by SQL Server depends on whether the data is ordered
(among other things). Merge joins require ordered result sets as
inputs. If the optimizer determines that a merge join is the most
efficient join method to satisfy a query, a sort operation may be
required to sort the inputs. In such a case, a memory grant is required,
and potentially tempdb space as well, to store the intermediate result sets.
This is another example of why indexing is important.
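One way to observe this, sketched here under the assumption that the sample database's Production.Product table is joined to Production.WorkOrder on ProductID, is to force a merge join with a query hint and inspect the execution plan. If an input is already ordered on ProductID by an index, no Sort operator should appear; if it is not, the plan typically shows a Sort ahead of the Merge Join. The hint is for experimentation only and is not a recommendation for production queries.

```sql
-- Force a merge join so the plan shows whether the inputs arrive
-- already ordered on ProductID or must be sorted first.
SELECT wo.WorkOrderID, p.Name
FROM Production.WorkOrder AS wo
JOIN Production.Product AS p
    ON p.ProductID = wo.ProductID
OPTION (MERGE JOIN);
```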