An unfortunate,
all-too-common indexing approach is to carpet bomb the database with
indexes in the hope that performance will (eventually) improve. Not only
will such an approach fail, but it usually ends in tears with the
accumulated performance and maintenance costs of unnecessary indexes
eventually having a paralyzing effect. Fortunately, there is a much more
effective approach, made all the easier using several index-related
Dynamic Management Views (DMVs).
In this section, we'll approach index analysis
from three angles, identifying indexes to drop, add, and defragment.
Let's begin with a look at using DMVs to identify indexes that are
candidates for removal.
Dynamic Management Views return server state
information useful in diagnosing server health and tuning performance.
The values returned by some DMVs, including the ones in this section,
are reset when the SQL Server instance restarts. Before making any
decisions that are based on DMV results, consider when the instance was
last restarted to ensure the DMV results are representative of the full
range of database access patterns, for example, daily, weekly, and
monthly reports. Taking a database offline, or using the AUTOCLOSE option, will also reset certain DMV values. Links to various DMV scripts and further details are available at http://www.sqlCrunch.com/index.
|
1. Identifying indexes to drop/disable
Indexes that are either not used or used
infrequently not only consume additional space, but they also lengthen
maintenance routines and slow performance, given the need to keep them
updated in line with the base table data. As an example, consider an
index on a FirstName column in a very large customer table and a query
such as select * from customers where FirstName = 'John'. A
lack of understanding as to why SQL Server is unlikely to use such an
index may cause a junior DBA to create it in the hope that it may
improve performance. Usually, such indexes are left in place without any
follow-up analysis as to whether or not the index is being used.
Before we look at techniques for removing
indexes, let's cover a very important DMV, sys.dm_db_index_usage_stats,
to help us in this task.
SYS.DM_DB_INDEX_USAGE_STATS
As the name implies, the
sys.dm_db_index_usage_stats DMV returns information on how indexes are
being used. For each index, counts are kept on the number of times the
index has been scanned, updated, and used for lookup or seek purposes,
since the SQL instance was last restarted.
Let's cover the important ones:
user_seeks—Each
time an index is used for seek purposes, that is, navigating through
levels of its B-tree, this counter is incremented. A high value here
usually represents an effective index.
user_scans—When a index is scanned at the leaf level (as opposed to seeking through the B-tree), this counter is incremented.
user_lookups—Each time a nonclustered index is used for a lookup into a heap or clustered index, this counter is incremented.
user_updates—Insert,
update, and delete operations on a table must maintain the appropriate
indexes. Every insert and delete operation will have a corresponding
action for each nonclustered index, with updates effecting certain
indexes, depending on the columns that have changed. Each time an index
is maintained for any of these actions, this counter is incremented.
In addition to these columns,
sys.dm_db_index_usage_stats returns many others, including the last date
and time for each of the actions covered previously. The major value of
this DMV is using it to identify indexes that are rarely used but have a
high maintenance overhead. Such indexes are candidates for removal, and
we'll walk through a script that uses this DMV shortly.
Unused indexes
Like many other DMVs,
sys.dm_db_index_usage_stats holds index usage stats only since the SQL
instance was last started. One of the implications that can be drawn
from this is that any index that does not appear in this list has not
been used since the instance was started. If the instance has been
online for long enough to cover the full range of access patterns—for
example, daily, weekly and monthly reports—and the database is not using
the AUTOCLOSE option (or has been taken offline), then an index not appearing in this DMV is unlikely to be used at all.
The script in listing 1
uses sys.dm_db_index_usage_stats along with a number of system tables
to return indexes not used since the last instance restart.
Example 1. Indexes not used since last instance restart
-- Identify unused indexes (since last restart)
SELECT
sc.name + '.' + object_name(i.object_id) as objectName
, i.name as indexName
, i.type_desc as indexType
FROM sys.indexes i
INNER JOIN sys.objects o on o.object_id = i.object_id
INNER JOIN sys.schemas sc on o.schema_id = sc.schema_id
WHERE
objectproperty(i.object_id,'IsUserTable') = 1
AND i.index_id not in (
SELECT s.index_id
FROM sys.dm_db_index_usage_stats s
WHERE
s.object_id = i.object_id
AND i.index_id = s.index_id
AND database_id = db_id()
)
ORDER BY objectName, indexName ASC
|
This script uses sys.indexes, sys.objects, and sys.schemas to return index information followed by a simple not in clause to exclude those indexes in the DMV. The end result includes indexes not used since the SQL instance last started.
We also need to identify indexes whose maintenance cost outweighs their value.
High-maintenance/low-value indexes
In our next script, shown in listing 2,
let's use the count values returned from sys.dm_db_index_usage_stats to
compare the update count to the use count, that is, their
maintenance-versus-usage ratio.
Example 2. High-maintenance/low-value indexes
-- Identify high maintenance indexes with low usage
SELECT
sc.name + '.' + object_name(i.object_id) as objectName
, i.name as indexName
, user_seeks
, user_scans
, user_lookups
, (user_seeks + user_scans + user_lookups) as indexReads
, user_updates as indexWrites
, user_updates - (user_seeks + user_scans + user_lookups) as usageDiff
FROM sys.dm_db_index_usage_stats s
INNER JOIN sys.indexes i on i.index_id = s.index_id
INNER JOIN sys.objects o on o.object_id = i.object_id
INNER JOIN sys.schemas sc on o.schema_id = sc.schema_id
WHERE
database_id = db_id()
AND objectproperty(s.object_id,'IsUserTable') = 1
AND i.object_id = s.object_id
AND i.type_desc = 'NONCLUSTERED'
ORDER BY
usageDiff DESC
|
By ordering the results descending on the
usageDiff column, this script identifies indexes with the biggest
differential between their read and write counts. In the extreme cases
where indexReads is zero, the index is being maintained for no read
benefits at all. Note that the previous script filters for nonclustered
indexes. We can remove this condition to display information on the base
table/clustered index as well, including the last accessed stats to
display when a table was last used in any capacity.
The result of this script, run against the AdventureWorks database, is shown in figure 1. Note that negative values for usageDiff represent cases where the index has been used for read purposes more than for updates.
It's also possible for some indexes, while valuable and frequently used, to be duplicated or overlapping.
Duplicate and overlapping indexes
A duplicate index, that is, an index with the
same columns, defined in the same order, presents a very easy and clear
case for removal. In a similar manner, an index that shares the same
leading columns can also be considered a duplicate of another index. For
example, Index B on lastName, firstName is really a duplicate of Index A
on lastName, firstName, dateOfBirth.
Before dropping duplicate indexes, you should
take care to ensure that no existing T-SQL or stored procedure code uses
the index(es) as part of an index hint. Dropping such indexes without
updating referencing code will obviously result in errors when the (now
nonexistent) indexes are referenced.