SQL Server 2012 : The Path of the Query (part 1) - Fetch All, Clustered Index Seek

11/6/2013 8:30:14 PM

A good way to understand how to design efficient indexes is to observe and learn from the various paths that queries take to locate data using indexes.

The following section compares and contrasts ten different query paths. Not every query path is an efficient query path.

A good test table for observing the 10 query paths in the AdventureWorks2012 database is the Production.WorkOrder table. It has 72,591 rows, 10 columns, and a single-column clustered primary key. Here's the table definition:

CREATE TABLE [Production].[WorkOrder](
[WorkOrderID] [int] IDENTITY(1,1) NOT NULL,
[ProductID] [int] NOT NULL,
[OrderQty] [int] NOT NULL,
[StockedQty] AS (isnull([OrderQty]-[ScrappedQty],(0))),
[ScrappedQty] [smallint] NOT NULL,
[StartDate] [datetime] NOT NULL,
[EndDate] [datetime] NULL,
[DueDate] [datetime] NOT NULL,
[ScrapReasonID] [smallint] NULL,
[ModifiedDate] [datetime] NOT NULL,
 CONSTRAINT [PK_WorkOrder_WorkOrderID] PRIMARY KEY CLUSTERED 
 ([WorkOrderID] ASC) 
  WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
  IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, 
  ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY];

The WorkOrder table has three indexes, each with one column as identified in the index name:

PK_WorkOrder_WorkOrderID (Clustered)
IX_WorkOrder_ProductID (Nonunique, Nonclustered)
IX_WorkOrder_ScrapReasonID (Nonunique, Nonclustered)
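
To confirm this index list on your own copy of AdventureWorks2012, the catalog views can be queried directly. The following is a minimal sketch rather than part of the original script; it assumes the standard sys.indexes, sys.index_columns, and sys.columns views and returns one row per key column:

```sql
-- List each index on Production.WorkOrder with its type, uniqueness, and key column.
SELECT i.name        AS IndexName,
       i.type_desc   AS IndexType,     -- CLUSTERED or NONCLUSTERED
       i.is_unique   AS IsUnique,
       c.name        AS KeyColumn
  FROM sys.indexes AS i
  JOIN sys.index_columns AS ic
    ON ic.object_id = i.object_id
   AND ic.index_id  = i.index_id
  JOIN sys.columns AS c
    ON c.object_id = ic.object_id
   AND c.column_id = ic.column_id
 WHERE i.object_id = OBJECT_ID(N'Production.WorkOrder')
   AND ic.is_included_column = 0       -- key columns only
 ORDER BY i.index_id, ic.key_ordinal;
```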

Performance data for each path, listed in Table 1, was captured by watching the T-SQL → SQL:StmtCompleted and Performance → Showplan XML Statistics Profile events in Profiler and examining the query execution plan.

Table 1 Query Path Performance

The key performance indicators are the query optimizer's estimated plan cost (Cost) and the number of logical reads (Reads).

For the duration column, each query path was executed multiple times with the results averaged. You should run the script on your own SQL Server instance, take your own performance measurements, and study the query execution plans.
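
If Profiler is not convenient, similar numbers can be gathered from a query window. This is an alternative sketch, not the method used for Table 1: it relies on the session-level SET STATISTICS options, which print logical reads and elapsed time to the Messages tab and return the actual execution plan as XML. Values obtained this way may differ slightly from Profiler's.

```sql
-- Report logical reads, CPU/elapsed time, and the actual plan for one query path.
SET STATISTICS IO ON;    -- logical and physical reads per table
SET STATISTICS TIME ON;  -- parse/compile and execution times in ms
SET STATISTICS XML ON;   -- actual execution plan as an extra result set

SELECT *
 FROM Production.WorkOrder;

SET STATISTICS XML OFF;
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```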

The rows-per-ms column is calculated from the number of rows returned and the average duration. Before executing each query path, the following code clears the plan cache and the buffer pool so that every run starts cold:

DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;
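
The rows-per-ms figure described above can be approximated for a single run as follows. This is a rough sketch under stated assumptions: the variable names are illustrative, the timing is taken with SYSDATETIME() rather than from Profiler, and the elapsed time includes the time needed to send the rows to the client. Averaging several runs, as the article does, gives a steadier number.

```sql
-- Approximate rows returned per millisecond for one execution of a query path.
DECLARE @start datetime2(7) = SYSDATETIME();
DECLARE @end   datetime2(7);
DECLARE @rows  int;

SELECT *
 FROM Production.WorkOrder;

-- Capture the row count and end time in one statement,
-- because any intervening statement would reset @@ROWCOUNT.
SELECT @rows = @@ROWCOUNT, @end = SYSDATETIME();

SELECT @rows AS RowsReturned,
       DATEDIFF(MICROSECOND, @start, @end) / 1000.0 AS DurationMs,
       @rows / NULLIF(DATEDIFF(MICROSECOND, @start, @end) / 1000.0, 0) AS RowsPerMs;
```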

Query Path 1 — Fetch All

The first query path sets a baseline for performance by simply requesting all the data from the base table.

SELECT *
 FROM Production.WorkOrder;

With no WHERE clause and every column selected, the query must read every row from the clustered index. A clustered index scan (shown in Figure 1) reads every row sequentially.

Figure 1 The clustered index scan sequentially reads all the rows from the clustered index.

This query runs longer than any other query path, so it might seem slow. Measured by rows returned per millisecond, however, the index scan has the highest throughput of any query path.

Query Path 2 — Clustered Index Seek

The second query path adds a WHERE clause to the first query and filters the result to a single row using a clustered key value:

SELECT *
 FROM Production.WorkOrder
 WHERE WorkOrderID = 1234;

The query optimizer has two clues that only one row meets the WHERE clause criteria: the statistics, and the fact that WorkOrderID is the primary key and so must be unique. WorkOrderID is also the clustered index key, so the query optimizer knows there's a great index available to locate a single row. The clustered index seek operation navigates the clustered index B-tree and quickly locates the desired row, as shown in Figure 2.

Figure 2 A clustered index seek navigates the B-tree index and locates the row efficiently.
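
One way to see what the seek saves is to run the same query twice, once as written and once with a scan forced, and compare the logical reads. This comparison is an illustrative sketch and is not one of the ten query paths; it assumes the FORCESCAN table hint, which is intended for experiments like this rather than production code.

```sql
-- Compare logical reads for a seek versus a forced scan on the same predicate.
SET STATISTICS IO ON;

-- The optimizer chooses a clustered index seek.
SELECT *
 FROM Production.WorkOrder
 WHERE WorkOrderID = 1234;

-- FORCESCAN makes the optimizer scan the index instead of seeking.
SELECT *
 FROM Production.WorkOrder WITH (FORCESCAN)
 WHERE WorkOrderID = 1234;

SET STATISTICS IO OFF;
```

Both statements return the same single row. The difference shows up in the logical reads reported on the Messages tab.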

Conventional wisdom holds that this is the fastest possible query path, and it is snappy when returning a single row. Measured by rows returned per millisecond, however, it's one of the slowest query paths.

A common myth is that seeks can return only single rows, and that's why seeking multiple rows would be slow compared to scans. As the next two query paths indicate, that's not true.
