Creating indexes on tables or views
provides fast access to data by organizing it in a way that supports
efficient queries. You can think of an index in SQL Server as being
much like the index of a book.
The two most common types of indexes are clustered and nonclustered. A clustered index
determines the order in which the table's rows are physically stored: the
rows in the data pages are kept sorted by the index key. This allows SQL
Server to go directly to the pages where the requested data resides.
Because the rows can be stored in only one order, you can have only one
clustered index per table. If you wanted to optimize other queries that
might not leverage the clustered index, you could create a nonclustered
index. A nonclustered index is a separate data structure that stores the index key values along with pointers to the corresponding data rows, leaving the order of the data pages unchanged.
To help illustrate the value of the index, let’s use the following script to create a Products
table. This table will have an id
column and a column for the price of the product. The setup script is as follows:
CREATE TABLE Products
(product_id INT IDENTITY(1,1) NOT NULL,
product_price DECIMAL(9,2) NOT NULL)
Note that you are not going to create a primary key on the product_id
column. Instead, the IDENTITY
property generates a new sequential value for every insert into the Products
table. The primary key is omitted because, by default, SQL Server creates a
clustered index to enforce a primary key. That clustered index would
skew the comparison between indexed and nonindexed performance.
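If you did want a primary key without giving up the heap, one option is to request a nonclustered index explicitly. The following is a minimal sketch only; the table name ProductsWithKey and the constraint name PK_ProductsWithKey are hypothetical and are not used elsewhere in this walkthrough:

```sql
-- Hypothetical variant of the Products table (names are illustrative).
-- PRIMARY KEY NONCLUSTERED enforces uniqueness with a nonclustered index,
-- so the table itself remains a heap with no clustered index.
CREATE TABLE ProductsWithKey
(product_id INT IDENTITY(1,1) NOT NULL
    CONSTRAINT PK_ProductsWithKey PRIMARY KEY NONCLUSTERED,
 product_price DECIMAL(9,2) NOT NULL);
```

The walkthrough keeps the simpler keyless Products table so that the only index involved is the one created later.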
Now that the table is created, you can use the following script to create 100,000 test values:
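DECLARE @i INT;
DECLARE @price DECIMAL(9,2);
SET @i = 0;
-- Insert 100,000 rows, one per iteration
WHILE (@i < 100000)
BEGIN
    -- RAND() returns a float from 0 up to (but not including) 1; scale it and round to two decimals
    SET @price = ROUND((RAND() * 1000), 2);
    INSERT INTO Products(product_price) VALUES (@price);
    SET @i = @i + 1;
END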
The previous script will insert 100,000 random prices (not necessarily distinct) ranging from 0 to 1,000.
Note
This script may take a few minutes to run.
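If the row-by-row loop is too slow, a set-based insert can produce comparable test data in a single statement. This is a sketch under stated assumptions, not part of the original setup: it uses sys.all_objects purely as a convenient row source and assumes the cross join yields at least 100,000 rows, which is typical but not guaranteed. Run either the loop or this statement, not both:

```sql
-- Set-based alternative to the WHILE loop (assumption: run INSTEAD of the loop).
-- sys.all_objects cross-joined with itself serves only as a row generator.
INSERT INTO Products (product_price)
SELECT TOP (100000)
       -- NEWID() is evaluated once per row, so every row gets its own value.
       -- The modulo is applied before ABS to avoid overflow on the minimum INT.
       -- Result range: 0.00 through 999.99.
       ABS(CHECKSUM(NEWID()) % 100000) / 100.0
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;
```

Because the prices are generated differently, the estimated costs you see later may differ slightly from the figures quoted in the text.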
At this point, you have not created any indexes on the Products
table. To easily determine whether SQL Server is performing a table
scan or using an index, you can click the Include Actual Execution Plan
button shown in Figure 1.
Alternatively, you can include the actual execution plan by pressing
Ctrl+M or selecting Include Actual Execution Plan from the Query menu
in SSMS.
Figure 1. Include Actual Execution Plan button
When you include the actual execution plan,
SSMS will add an extra tab called Execution Plan to the Results pane.
The query results will still be displayed on the Results tab, but you
can view the execution plan that SQL Server’s query optimizer generated
using the Execution Plan tab.
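The actual plan can also be requested from T-SQL rather than through the SSMS toolbar. As a brief sketch, SET STATISTICS XML returns the actual execution plan as an extra XML result after each statement; the COUNT(*) query here is only a placeholder:

```sql
-- Return the actual execution plan as an XML result after each statement.
SET STATISTICS XML ON;

SELECT COUNT(*) FROM Products;  -- placeholder query for illustration

-- Switch the option off again so later statements are unaffected.
SET STATISTICS XML OFF;
```

In SSMS, clicking the returned XML opens the same graphical plan you would see on the Execution Plan tab.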
Now, let’s count the products that cost between 400 and 700 (inclusive) using the following query:
SELECT COUNT(product_id) FROM Products WHERE
product_price BETWEEN 400 AND 700
When you click the Execution Plan tab in the Results pane, you will see something similar to Figure 2.
Figure 2. Execution plan showing a table scan
Two things stand out. First, SQL Server performed a table scan, which
means that it read every one of the 100,000 rows to satisfy this query.
Second, note the cost of the query: the estimated subtree cost was
0.321919.
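Estimated cost is the optimizer's own figure. To complement it with a measured one, you can have SQL Server report how many pages the query read. The following sketch uses SET STATISTICS IO, which writes its output to the Messages tab in SSMS:

```sql
-- Report page reads (logical and physical) for each statement.
SET STATISTICS IO ON;

SELECT COUNT(product_id) FROM Products WHERE
product_price BETWEEN 400 AND 700;

SET STATISTICS IO OFF;
```

Rerunning the same block after an index is added gives a direct before-and-after comparison of the logical reads reported for the Products table.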
Since you are querying based on the product_price
column, create a clustered index on the column by issuing the following T-SQL script:
CREATE CLUSTERED INDEX CI_Price ON Products(product_price)
Now, when you reissue the same query:
SELECT COUNT(product_id) FROM Products WHERE
product_price BETWEEN 400 AND 700
the execution plan will show something similar to Figure 3.
Figure 3. Execution plan showing clustered index seek
In this figure, you can see that a clustered
index seek was performed instead of a table scan. In addition, the estimated
subtree cost was only 0.11147, roughly a third of the cost without the
index. Estimated cost is not a direct measure of elapsed time, but the
drop indicates that the query now does far less work.
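To confirm which indexes exist on the table, you can query the sys.indexes catalog view. This is a small sketch; it assumes the Products table lives in your default schema:

```sql
-- List the indexes defined on the Products table.
-- A table with no clustered index shows a single row of type HEAP instead.
SELECT i.name, i.type_desc
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID('Products');
```

After the CREATE CLUSTERED INDEX statement above has run, the result should include a row for CI_Price with a type_desc of CLUSTERED.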
Note
Even though an index may be defined for a given table or view, the SQL
Server query optimizer may find it more efficient to just do a table
scan for tables with few rows instead of leveraging the index.
In our example, you created a
clustered index. This type of index determines the physical order of the
rows in the data pages, so you can have only one clustered index per
table. If you wanted to optimize other queries that might not leverage
the clustered index, you could create a nonclustered index. Recall that
a nonclustered index is a separate data structure that keeps pointers
to the actual data rows instead of reordering the data pages
themselves.
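As an illustration of that last point, the following sketch adds a nonclustered index on product_id alongside the existing clustered index. The index name IX_Products_ProductId and the lookup value 5000 are assumptions chosen for the example:

```sql
-- Hypothetical nonclustered index to support lookups by product_id.
-- The clustered index CI_Price still determines how the rows are stored.
CREATE NONCLUSTERED INDEX IX_Products_ProductId ON Products(product_id);

-- A lookup by product_id can now use the new index
-- instead of scanning the whole table.
SELECT product_id, product_price FROM Products WHERE
product_id = 5000;
```

With the actual execution plan included, you can check whether the optimizer chose a seek on the new index for this query.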