Analysing Indexes Part 1

Actually what started as a small document has continued to grow.  I'm working on performance tuning a well known application and as I document as I work so others can follow what I'm doing I thought I'd add it to my blog. There's nothing particularly startling in here but if you want to drill into index performance on your database this may help. As I remarked on Tony or maybe Simon's blog it's fine looking at small tables but when row counts are in hundreds of millions and secondary indexes hit 10Gb ( each ) then performance and storage become more critical - and difficult to test - it's not a quick task to drop and create a handful of indexes on a 250 million row table.

  • SQL Server 2005 maintains system views which contain information concerning indexes.
  • As these are views the information is not carried forward from a server restart
  • The information returned is cumulative so great care must be taken in evaluating the results.
  • To obtain consistent data and results we must store the information from the system views in permanent tables for later analysis.
  • The presented queries assume no partitioning

 

1.0  Operational cost on indexes ( assuming no RFI )

 

Operation

Read

Write

Select

Always

No

Insert

No

Always – every index

Update

Always

Only if row qualifies

Delete

Always

Only if row qualifies

 

 

2.0  Index Information

 

Object

Description

Notes

dbo.sysindexes

This is the SQL 2000 system table taken forward as a view.

This table may not be supported going forward.

 

sys.dm_db_index_physical_stats

This is actually a function and replaces dbcc showcontig

 

sys.indexes

2005 system table contains property information only for each index.

 

sys.dm_db_index_usage_stats

System view which records the access usage of every index in the database

We can view the number of times this index has been used, the type of access and the last access date

sys.dm_db_index_operational_stats

System function which records the operational cost of access to the index

We can view information such as the number of pages, locks, latches and waits.

sys.objects

Contains information on database objects

 

 

sys.dm_db_missing_index_details

System view which stores information on indexes the optimiser considers are missing.

 

sys.dm_db_missing_index_group_stats

System view which stores usage and access details for the missing indexes similar to sys.dm_db_index_usage_stats

We can also view the improvement that the index is computed to make, this is essentially the same type of output as shown by the Tuning Advisor.

sys.dm_db_missing_index_groups

Presumably this view will be of more use in SQL2008

 

sys.dm_db_missing_index_columns

A system function which returns the columns for a missing index

Requires the index handle as a parameter

3.0  sys.dm_db_index_usage_stats information

http://msdn2.microsoft.com/en-us/library/ms188755.aspx

 

  • There are groups of data columns which interest us
    • click on the link ( above ) for full table details
  • User access counts
  • User access dates
  • System access counts ( e.g. update statistics, index rebuilds )
  • System access dates

 

4.0  List unused tables/indexes

  • This will largely indicate empty tables as maintenance tasks will create entries within sys.dm_db_index_usage_stats where there is data in the table/index
    • Update statistics tablename
    • Dbcc dbreindex(tablename)
  • I’d normally full qualify queries with three part naming; this is omitted for this document.

 

--

-- run in database to analyse

--

Select object_name(i.object_id) as TableName,isnull(i.name,'HEAP') as IndexName, i.index_id

from sys.indexes i join sys.objects o on o.object_id = i.object_id

where objectproperty(o.object_id,'IsUserTable') = 1

and i.index_id NOT IN

(select s.index_id from sys.dm_db_index_usage_stats s

where s.object_id=i.object_id and i.index_id=s.index_id

and database_id = db_id() )

order by object_name(i.object_id),i.index_id;

 

 

 

5.0  Query to filter by user access

  • This sorts by lack of user access
  • This is a count of the times there was a seek, scan or lookup on the index/table
  • I’ve discounted updates from my criteria as an update will occur regardless of the index being used for selects
  • Essentially any index with zero as a value for seek, scan and lookup will be unused.

 

 

--

-- run in database to be analysed

--

select object_name(s.object_id) as TableName,isnull(i.name,'HEAP') as IndexName,

case i.index_id

when 0 then 'HEAP'

when 1 then 'CLUS'

else 'NC'

end,

user_seeks as Seeks, user_scans as Scans, user_lookups as Lookups, user_updates as Updates

from sys.dm_db_index_usage_stats s join sys.indexes i on i.object_id = s.object_id

and i.index_id = s.index_id

where database_id = db_id() and objectproperty(s.object_id,'IsUserTable') = 1

order by (user_seeks + user_scans + user_lookups ) asc;

 

  • Remember that these values are cumulative since start up so we must snapshot the data on a regular basis to enable proper analysis
  • We must also collect data over a suitable time period based upon the usages of your database.
  • In an environment which runs, say, month end processes, it is vital that we capture the full work life cycle.
  • This is especially important if the CEO or MD has a favourite report they run once a month and you remove the index(es) which make this run quickly.

 

6.0  Capture snapshot of index activity

 

--

-- Run this is the database to be analysed

--

Insert into dbo.Unused_indexes

SELECT getdate(),* FROM sys.dm_db_index_usage_stats

WHERE database_id = DB_ID()

and (last_user_seek < convert(char(11),getdate()-1) or last_user_seek is null)

and (last_user_scan < convert(char(11),getdate()-1) or last_user_scan is null);

 

 

  • I would schedule this to run every night just after midnight
  • I’m collecting information where an index has not been used in the preceding 24 hours for a user seek or user scan. ( see  Working with snapshots  for explanation of this )
  • The collection date is added for clarity.
  • A primary key could be defined on the columns TheDate,database_id,object_id,index_id
  • The table is a Heap

 

6.1  Table to capture index activity

 

CREATE TABLE dbo.Unused_Indexes

(

            TheDate datetime NOT NULL,

            database_id smallint NOT NULL,

            [object_id] int NOT NULL,

            index_id int NOT NULL,

            user_seeks bigint NOT NULL,

            user_scans bigint NOT NULL,

            user_lookups bigint NOT NULL,

            user_updates bigint NOT NULL,

            last_user_seek datetime NULL,

            last_user_scan datetime NULL,

            last_user_lookup datetime NULL,

            last_user_update datetime NULL,

            system_seeks bigint NOT NULL,

            system_scans bigint NOT NULL,

            system_lookups bigint NOT NULL,

            system_updates bigint NOT NULL,

            last_system_seek datetime NULL,

            last_system_scan datetime NULL,

            last_system_lookup datetime NULL,

            last_system_update datetime NULL

)

 

 

 

 

7.0  sys.dm_db_index_operational_stats

http://msdn2.microsoft.com/en-us/library/ms174281.aspx

 

  • There are groups of data columns which interest us
    • click on the link ( above ) for full table details
  • This is a very important table which contains a wealth of information
  • We can view waits, page splits, and volume of access at leaf and non-leaf levels

 

8.0  View index usage by work - Selects

  • This query shows all indexes, reads to top , writes to bottom

 

--

-- Run this is the database to be analysed

--

select object_name(s.object_id) as TableName, isnull(i.name,'HEAP') as IndexName,

case i.index_id

when 0 then 'HEAP'

when 1 then 'CLUS'

else 'NC'

end as IndexType

, reads=range_scan_count + singleton_lookup_count

, 'leaf_writes'=leaf_insert_count+leaf_update_count+ leaf_delete_count

, 'leaf_page_splits' = leaf_allocation_count

, 'nonleaf_writes'=nonleaf_insert_count + nonleaf_update_count + nonleaf_delete_count

, 'nonleaf_page_splits' = nonleaf_allocation_count

from sys.dm_db_index_operational_stats (db_id(),NULL,NULL,NULL) s join sys.indexes i

on i.object_id = s.object_id and i.index_id = s.index_id

where objectproperty(s.object_id,'IsUserTable') = 1

order by reads desc, leaf_writes, nonleaf_writes;

 

 

8.1  This query only shows those indexes which have reads

 

--

-- Run this is the database to be analysed

--

select object_name(s.object_id) as TableName, isnull(i.name,'HEAP') as IndexName,

case i.index_id

when 0 then 'HEAP'

when 1 then 'CLUS'

else 'NC'

end as IndexType

, reads=range_scan_count + singleton_lookup_count

, 'leaf_writes'=leaf_insert_count+leaf_update_count+ leaf_delete_count

, 'leaf_page_splits' = leaf_allocation_count

, 'nonleaf_writes'=nonleaf_insert_count + nonleaf_update_count + nonleaf_delete_count

, 'nonleaf_page_splits' = nonleaf_allocation_count

from sys.dm_db_index_operational_stats (db_id(),NULL,NULL,NULL) s join sys.indexes i

on i.object_id = s.object_id and i.index_id = s.index_id

where objectproperty(s.object_id,'IsUserTable') = 1

and (range_scan_count + singleton_lookup_count)>0

order by reads desc, leaf_writes, nonleaf_writes;

 

 

 

 

8.2  Indexes without reads but with high writes may be being maintained unnecessarily

 

--

-- Run this is the database to be analysed

--

select object_name(s.object_id) as TableName, isnull(i.name,'HEAP') as IndexName,

case i.index_id

when 0 then 'HEAP'

when 1 then 'CLUS'

else 'NC'

end as IndexType

, reads=range_scan_count + singleton_lookup_count

, 'leaf_writes'=leaf_insert_count+leaf_update_count+ leaf_delete_count

, 'leaf_page_splits' = leaf_allocation_count

, 'nonleaf_writes'=nonleaf_insert_count + nonleaf_update_count + nonleaf_delete_count

, 'nonleaf_page_splits' = nonleaf_allocation_count

from sys.dm_db_index_operational_stats (db_id(),NULL,NULL,NULL) s join sys.indexes i

on i.object_id = s.object_id and i.index_id = s.index_id

where objectproperty(s.object_id,'IsUserTable') = 1

and (range_scan_count + singleton_lookup_count)=0

and (leaf_update_count+ leaf_delete_count+ nonleaf_update_count + nonleaf_delete_count)>0

order by leaf_writes desc, nonleaf_writes desc;

 

9.0  Order by writes to find your most heavily updated index

  • The inserts have been separated as we cannot avoid this io
  • Indexes without writes are excluded
  • Secondary indexes appearing high on the list may benefit from being placed on a separate filegroup
  • High levels of leaf writes on a HEAP may indicate the possibility of fragmentation which cannot be removed

 

--

-- Run this is the database to be analysed

--

select object_name(s.object_id) as TableName, isnull(i.name,'HEAP') as IndexName,

case i.index_id

when 0 then 'HEAP'

when 1 then 'CLUS'

else 'NC'

end as IndexType

,'total_writes'=leaf_update_count+ leaf_delete_count+nonleaf_update_count + nonleaf_delete_count

,'total_insert_writes'=leaf_insert_count+nonleaf_insert_count

, 'leaf_writes'=leaf_update_count+ leaf_delete_count

, 'nonleaf_writes'=nonleaf_update_count + nonleaf_delete_count

, 'insert_leaf_writes'=leaf_insert_count

, 'insert_nonleaf_writes'=nonleaf_insert_count

, 'leaf_page_splits' = leaf_allocation_count

, 'nonleaf_page_splits' = nonleaf_allocation_count

, reads=range_scan_count + singleton_lookup_count

from sys.dm_db_index_operational_stats (db_id(),NULL,NULL,NULL) s join sys.indexes i

on i.object_id = s.