Beyond the standard clustered
and nonclustered indexes, SQL Server offers two types of indexes
referred to as specialty indexes. Filtered indexes, which were new in
SQL Server 2008, include less data; and indexed views, available since
SQL Server 2000, build out custom sets of data. Both are considered
high-end performance tuning indexes.
1. Filtered Indexes
A nonclustered index contains a record
for every record in the base table on which it is defined.
Historically, this has always been a 1-to-1 relationship. SQL Server
2008 introduced the concept of a filtered index. With a filtered index,
you can set a predicate on your CREATE INDEX
statement so that the index contains only rows that meet the criteria
you set. This option is available only for nonclustered indexes:
a clustered index is the table itself, so it wouldn't make sense to
filter a clustered index. Because a filtered index can
potentially have far fewer records than a traditional nonclustered
index, it tends to be much smaller. In addition, because the
index has fewer records, its statistics tend to be
more accurate, which can lead to better execution plans.
A good candidate for a filtered index in AdventureWorks2012 is the ScrapReasonID column in the Production.WorkOrder table. Fortunately for AdventureWorks, only 612 parts (0.8 percent) were scrapped over the life of the database. The existing IX_WorkOrder_ScrapReasonID index includes every row. The ScrapReasonID foreign key in the Production.WorkOrder
table allows nulls for work orders that were not scrapped. The index
includes all the null values, with pointers to the work order rows with null ScrapReasonIDs. The current index uses 109 pages.
The following script re-creates the index with a WHERE clause that excludes all the null values:
DROP INDEX IX_WorkOrder_ScrapReasonID ON Production.WorkOrder;
CREATE INDEX IX_WorkOrder_ScrapReasonID
ON Production.WorkOrder(ScrapReasonID)
WHERE ScrapReasonID IS NOT NULL;
The new index uses only two pages. Interestingly,
there is no noticeable performance difference between the filtered and
nonfiltered index when selecting all the work orders with a scrap
reason that's not null. This is because there aren't enough
intermediate levels to make a significant difference. For a much larger
table, the difference would be worth testing, and most likely the
filtered index would provide a benefit.
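One way to compare the page counts before and after re-creating the index is to query the catalog and partition statistics. The following is a minimal sketch, assuming the session has VIEW DATABASE STATE permission; the column aliases are illustrative, and the exact page counts reported may differ slightly from the figures quoted above depending on how pages are counted.

```sql
USE AdventureWorks2012;
GO
-- List every index on Production.WorkOrder with its filter (if any)
-- and the number of pages it currently uses.
SELECT i.name                    AS IndexName,
       i.type_desc               AS IndexType,
       i.has_filter              AS IsFiltered,
       i.filter_definition       AS FilterDefinition,
       SUM(ps.used_page_count)   AS UsedPages,
       SUM(ps.row_count)         AS IndexRows
FROM sys.indexes AS i
JOIN sys.dm_db_partition_stats AS ps
  ON ps.object_id = i.object_id
 AND ps.index_id  = i.index_id
WHERE i.object_id = OBJECT_ID(N'Production.WorkOrder')
GROUP BY i.name, i.type_desc, i.has_filter, i.filter_definition
ORDER BY i.name;
```

Running the query once with the original index and once with the filtered version shows the drop in both rows and pages side by side.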
Best Practice
When designing a covering index (see
Query Path #6) to solve a specific query, probably one near the
top of the list by CPU duration according to the indexing strategy,
consider using a filtered covering index if the query works with a
relatively small subset of data and the overall table is
large.
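As a sketch of what a filtered covering index might look like, the following builds on the Production.WorkOrder example. The index name and the choice of key and included columns are assumptions made for illustration; in practice they would be driven by the specific query being tuned.

```sql
USE AdventureWorks2012;
GO
-- Hypothetical filtered covering index: only scrapped work orders are
-- stored, and INCLUDE carries the extra columns the query selects.
CREATE NONCLUSTERED INDEX IX_WorkOrder_Scrapped_Covering
ON Production.WorkOrder (ScrapReasonID, ProductID)
INCLUDE (ScrappedQty, EndDate)
WHERE ScrapReasonID IS NOT NULL;
GO
-- A query of this shape can be answered from the index alone, because
-- its predicate matches the filter and every column is in the index.
SELECT ScrapReasonID, ProductID, ScrappedQty, EndDate
FROM Production.WorkOrder
WHERE ScrapReasonID IS NOT NULL;
```

The query's WHERE clause must be compatible with the index filter; otherwise the optimizer cannot be sure the filtered index contains every row the query needs.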
Another situation that might benefit from
filtered indexes is building a unique index on a column in which multiple rows
contain null values. A normal unique index allows only a single row to
hold a null value in the key columns. However, building a unique
index that excludes nulls in its WHERE clause enforces uniqueness on the non-null values while permitting an unlimited number of null values.
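The unique-with-nulls behavior can be demonstrated on a small scratch table. The table, column, and index names below are hypothetical and exist only for this sketch.

```sql
-- Hypothetical demo table: BadgeNumber is optional but must be unique
-- whenever it is supplied.
CREATE TABLE dbo.FilteredUniqueDemo (
    DemoID      INT IDENTITY(1,1) PRIMARY KEY,
    BadgeNumber VARCHAR(20) NULL
);
GO
-- Unique index that ignores rows where BadgeNumber is null.
CREATE UNIQUE NONCLUSTERED INDEX UX_FilteredUniqueDemo_BadgeNumber
ON dbo.FilteredUniqueDemo (BadgeNumber)
WHERE BadgeNumber IS NOT NULL;
GO
-- Succeeds: the two null rows are outside the filter, so they do not collide.
INSERT INTO dbo.FilteredUniqueDemo (BadgeNumber)
VALUES (NULL), (NULL), ('A100');
GO
-- Fails with a duplicate-key error: 'A100' is already in the index.
INSERT INTO dbo.FilteredUniqueDemo (BadgeNumber)
VALUES ('A100');
GO
DROP TABLE dbo.FilteredUniqueDemo;  -- clean up the scratch table
```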
In a sense, SQL Server has had filtered indexes
since SQL Server 2000 with indexed views. There's no reason why an
indexed view couldn't have included a WHERE
clause and included data from a filtered set of rows. But filtered
indexes are certainly easier to create than indexed views, and they
function as normal nonclustered indexes — which is an excellent segue
into the next topic, indexed views.
2. Indexed Views
When a denormalized and pre-aggregated
data solution needs to be in real time, an excellent alternative to
querying the base tables is to use indexed views. Indexed views
are “materialized” in that when the base table is updated, the index for
the view is also updated. This stores pre-aggregated or denormalized
data without using special programming methods to do so.
Instead of building tables to duplicate data and
denormalize a join, a view can be created that can join the two
original tables and include the two source primary keys and all the
columns required to select the data. Building a clustered index on the
view physically materializes every column in the select column list of
the view.
Numerous restrictions exist on indexed views, including the following:
- The ANSI_NULLS and QUOTED_IDENTIFIER settings must be enabled when the base
tables are created, when the view is created, and when any connection
attempts to modify any data in the base tables.
- The first index on the view must be a unique clustered index; therefore, the view must produce a unique set of rows without using DISTINCT.
- The tables in the view must be tables (not nested views) in the
local database and must be referenced by means of the two-part name
(schema.table).
- The view must be created with the WITH SCHEMABINDING option.
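Before creating an indexed view, it can be useful to confirm that the session and the base tables meet the SET-option restrictions listed above. The following is a rough check, assuming the three AdventureWorks2012 tables used in the example that follows; the column aliases are illustrative.

```sql
USE AdventureWorks2012;
GO
-- Current session settings: each value should be 1, except
-- NumericRoundAbort, which should be 0.
SELECT SESSIONPROPERTY('ANSI_NULLS')              AS AnsiNulls,
       SESSIONPROPERTY('QUOTED_IDENTIFIER')       AS QuotedIdentifier,
       SESSIONPROPERTY('ANSI_PADDING')            AS AnsiPadding,
       SESSIONPROPERTY('ANSI_WARNINGS')           AS AnsiWarnings,
       SESSIONPROPERTY('ARITHABORT')              AS ArithAbort,
       SESSIONPROPERTY('CONCAT_NULL_YIELDS_NULL') AS ConcatNullYieldsNull,
       SESSIONPROPERTY('NUMERIC_ROUNDABORT')      AS NumericRoundAbort;

-- Base tables: uses_ansi_nulls records the ANSI_NULLS setting that was
-- in effect when each table was created.
SELECT SCHEMA_NAME(t.schema_id) AS SchemaName,
       t.name                   AS TableName,
       t.uses_ansi_nulls        AS CreatedWithAnsiNulls
FROM sys.tables AS t
WHERE t.name IN (N'WorkOrder', N'Product', N'ScrapReason');
```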
As an example of an indexed view used to denormalize a large query, the following view selects data from the Product, WorkOrder, and ScrapReason tables to produce a view of scrapped products:
USE AdventureWorks2012
SET ANSI_Nulls ON;
SET ANSI_Padding ON;
SET ANSI_Warnings ON;
SET ArithAbort ON;
SET Concat_Null_Yields_Null ON;
SET Quoted_Identifier ON;
SET Numeric_RoundAbort OFF;
GO
CREATE VIEW dbo.vScrap
WITH SCHEMABINDING
AS
SELECT W.WorkOrderID, P.Name AS Product,
       P.ProductNumber,
       S.Name AS ScrappedReason, W.ScrappedQty
FROM Production.WorkOrder W
  JOIN Production.Product P
    ON P.ProductID = W.ProductID
  JOIN Production.ScrapReason S
    ON W.ScrapReasonID = S.ScrapReasonID;
GO
With the view in place, the index can now be created on the view, resulting in an indexed view:
CREATE UNIQUE CLUSTERED INDEX ivScrap
ON vScrap (WorkOrderID, Product, ProductNumber,
ScrappedReason, ScrappedQty) ;
Indexed views can also be listed and created in Management Studio under the Views → Indexes node.
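Indexed views can also be listed with a catalog query rather than through the Management Studio tree. This is a small sketch using the sys.views and sys.indexes catalog views; the column aliases are illustrative. A plain, non-indexed view has no rows in sys.indexes, so only indexed views appear in the result.

```sql
USE AdventureWorks2012;
GO
-- Every index that exists on a view in the current database.
SELECT SCHEMA_NAME(v.schema_id) AS SchemaName,
       v.name                   AS ViewName,
       i.name                   AS IndexName,
       i.type_desc              AS IndexType,
       i.is_unique              AS IsUnique
FROM sys.views AS v
JOIN sys.indexes AS i
  ON i.object_id = v.object_id
ORDER BY SchemaName, ViewName, i.index_id;
```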
To drop the index from an indexed view, the DROP INDEX statement must refer to the view instead of to a table:
DROP INDEX ivScrap ON dbo.vScrap;
Dropping the view automatically drops any indexes created on the view.
Indexed Views and Queries
When SQL Server's Query Optimizer
develops the execution plan for a query, it includes the indexed view's
clustered index as one of the indexes it can use for the query, even if
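the query doesn't explicitly reference the view. This automatic matching happens only with
the Enterprise Edition; in other editions, the query must reference the view
with the NOEXPAND table hint for the view's index to be used.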
This means that the indexed view's clustered
index can serve to speed up queries. When the Query Optimizer selects
the indexed view's clustered index, the query execution plan indicates
the index used. Both of the following queries use the indexed view:
SELECT WorkOrderID, P.Name AS Product,
       P.ProductNumber,
       S.Name AS ScrappedReason, ScrappedQty
FROM Production.WorkOrder W
  JOIN Production.Product P
    ON P.ProductID = W.ProductID
  JOIN Production.ScrapReason S
    ON W.ScrapReasonID = S.ScrapReasonID;
SELECT * FROM vScrap;
Although indexed views are essentially the same
as they were in SQL Server 2000, the Query Optimizer can now use
indexed views with more types of queries.
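On editions where the Query Optimizer does not match indexed views automatically, or when you want to make sure the view's index is used, the view can be queried with the NOEXPAND table hint. The following sketch assumes the vScrap view and its clustered index from the earlier example exist in the dbo schema.

```sql
USE AdventureWorks2012;
GO
-- NOEXPAND tells the optimizer to treat the view as a table with its own
-- clustered index instead of expanding it into the underlying joins.
SELECT WorkOrderID, Product, ProductNumber, ScrappedReason, ScrappedQty
FROM dbo.vScrap WITH (NOEXPAND);
```

Comparing the execution plan of this query with the plan for the same query without the hint shows whether the indexed view's clustered index is being selected.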
Updating Indexed Views
As with any denormalized copy of the
data, the difficulty is keeping the data current. Indexed views have
the same issue. As data in the underlying base tables is updated, the
indexed view must be kept in sync. This process is completely
transparent to the user and is more of a performance consideration than
a programmatic issue.
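The automatic synchronization can be observed by changing a base table and then reading the materialized data directly. This is a minimal sketch, assuming the vScrap indexed view from the earlier example exists and that the session uses the SET options required for indexed views; the replacement name is a made-up value, and the transaction is rolled back so the sample data is left unchanged.

```sql
USE AdventureWorks2012;
GO
-- Pick one scrap reason that is referenced by at least one work order.
DECLARE @ReasonID smallint =
    (SELECT MIN(ScrapReasonID)
     FROM Production.WorkOrder
     WHERE ScrapReasonID IS NOT NULL);

BEGIN TRANSACTION;

-- Change only the base table; nothing refers to the view here.
UPDATE Production.ScrapReason
SET Name = N'Renamed for demo'        -- hypothetical value
WHERE ScrapReasonID = @ReasonID;

-- NOEXPAND reads the view's clustered index, so a nonzero count shows
-- that the materialized rows were maintained by the UPDATE above.
SELECT COUNT(*) AS RowsWithNewName
FROM dbo.vScrap WITH (NOEXPAND)
WHERE ScrappedReason = N'Renamed for demo';

ROLLBACK TRANSACTION;                 -- undo the demo change
```

The extra work of maintaining the view's index happens inside the UPDATE itself, which is why heavily updated base tables can make indexed views expensive.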
3. The Columnstore Index
New to SQL Server 2012 is the
Columnstore index, which is structured and behaves differently from
traditional B-tree indexes. With B-tree indexes, column values are
stored in rows in a page. With Columnstore indexes, however, the values
of each column are stored together in segments, which allows for very high
data compression while dramatically increasing the speed of
scanning huge amounts of data.
The major benefit of Columnstore indexes is for
data warehouse-style queries, in which a large fact table is joined with
smaller dimension tables and data is scanned to produce the wanted
result. When a Columnstore index is added to a table, the
table becomes read-only until the index is dropped or disabled. This restriction suits
data warehouse tables because they are often updated infrequently
and at defined intervals.
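Because the table is read-only while the Columnstore index is active, a periodic load typically has to work around the index. One common pattern is to disable the index, load the data, and rebuild it. The sketch below assumes the idx_CS_Finance index on dbo.FactFinance that is created in the script later in this section, and that the sample database is named AdventureWorksDW2012; the load step is left as a placeholder comment.

```sql
USE AdventureWorksDW2012;   -- assumed name of the sample data warehouse
GO
-- 1. Disable the Columnstore index so the table accepts changes again.
ALTER INDEX idx_CS_Finance ON dbo.FactFinance DISABLE;
GO
-- 2. Run the scheduled load here (INSERT, UPDATE, or DELETE statements).
GO
-- 3. Rebuild the index so queries can use the Columnstore again.
ALTER INDEX idx_CS_Finance ON dbo.FactFinance REBUILD;
GO
```

Dropping and re-creating the index achieves the same result; on partitioned tables, switching in a freshly loaded partition is another option that avoids rebuilding the whole index.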
Consider the following script to create a
Columnstore index on the dbo.FactFinance table in the AdventureWorks
DataWarehouse sample database. The syntax should be familiar; it is
practically the same as creating a nonclustered index. The only
difference is that in a traditional nonclustered index, you include only the
columns that the index needs, and you need to be careful about the
order in which you define those columns. In a Columnstore
index, however, the recommendation is to include every column in the table, and the
order in which you define the columns in the index does not matter.
CREATE NONCLUSTERED COLUMNSTORE INDEX idx_CS_Finance
ON dbo.FactFinance
(
FinanceKey, DateKey, OrganizationKey, DepartmentGroupKey,
ScenarioKey, AccountKey, Amount, Date
)
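After the index is created, the segment layout can be inspected through the sys.column_store_segments catalog view. This is a sketch, assuming the idx_CS_Finance index above exists and that the sample database is named AdventureWorksDW2012; the column aliases are illustrative.

```sql
USE AdventureWorksDW2012;   -- assumed name of the sample data warehouse
GO
-- One row per column per segment: how many rows each segment holds
-- and how much space it takes on disk after compression.
SELECT i.name          AS IndexName,
       s.column_id     AS ColumnId,
       s.segment_id    AS SegmentId,
       s.row_count     AS SegmentRows,
       s.on_disk_size  AS OnDiskBytes
FROM sys.column_store_segments AS s
JOIN sys.partitions AS p
  ON p.hobt_id = s.hobt_id
JOIN sys.indexes AS i
  ON i.object_id = p.object_id
 AND i.index_id  = p.index_id
WHERE p.object_id = OBJECT_ID(N'dbo.FactFinance')
  AND i.type_desc = N'NONCLUSTERED COLUMNSTORE'
ORDER BY s.column_id, s.segment_id;
```

Summing OnDiskBytes and comparing the total with the size of the base table gives a rough sense of the compression achieved on this data.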