in

SQL Server Community Blogs

Voices of the SQL Server Community

Jason Massie's SQL blog

Take it a step further with the index DMV's

In SQL 2000 I wanted to find unused indexes but found it to be difficult. I had to capture a huge trace that fully represents the workload and “trust” that the ITW knows what it’s talking about. You could also use the scan started trace event filtered by dbid, objectid, indexid. This procedure was very tedious.

In SS2005, a quick query of the dynamic management views lets you know which indexes are not being used. You can do this in a few minutes what had previously taken days or weeks. This alone is a very powerful feature.

I am proposing that you take it a step further. We are going to use a little logic and the missing index DMV’s to combine indexes and remove indexes that are still used but redundant. The steps in this process would look like this:

1.       Remove unused indexes with the unused index script

2.       Get your list of tables to analyze.

3.       Remove redundant but used indexes.

4.       Revue missing index DMV’s for mistaken index drops.

5.       Combine indexes that where it is logical to do so.

6.       Revue missing index DMV’s for mistaken index drops.

This methodology is most effective and viable when these conditions are met:

·         The server has been online and thus collecting stats for a long time.

·         The server is not pushing a hardware bottleneck. If so, this should be done during maint window.

·         The server is enterprise edition and the tables allow online operations (i.e. no LOB data or partitions).

What are the benefits of removing unused indexes?

·         Reduced writes during updates

·         Reduced space usage

·         Reduced backup\restore space and time

·         Reduced index maintenance time

What are the benefits of removing unused indexes, used but redundant indexes and combining indexes?

·         Less memory footprint for the same amount of data

·         A different angle to index tuning

·         Indexes are more likely to be covering

·         Reduced writes during updates

·         Reduced space usage

·         Reduced backup\restore space and time

·         Reduced index maintenance time

Now, let’s get down to business…

 

Remove Unused Indexes

This is pretty straightforward and relatively safe. You want to make sure that your server has been up long enough to get good index usage. This includes infrequent operations like month end reporting, etc.

--Unused indexes

declare @dbid int 

select @dbid = db_id() 

 

select object_name(s.object_id) as ObjName

, i.name as IndName

, i.index_id 

, user_seeks + user_scans + user_lookups  as reads

, user_updates as writes 

, sum(p.rows) as rows

from sys.dm_db_index_usage_stats s join sys.indexes i on s.object_id = i.object_id  and i.index_id = s.index_id 

join sys.partitions p on s.object_id = p.object_id and p.index_id= s.index_id

where objectproperty(s.object_id,'IsUserTable') = 1 and s.index_id> 0 and s.database_id = @dbid 

group by object_name(s.object_id), i.name, i.index_id, user_seeks+ user_scans + user_lookups, user_updates

order by reads, writes desc 

 

You can most likely drop any indexes with zero or close to zero reads. The more rows, the more space you will reclaim. The more writes, the better write performance you get by removing them. If the index is not being read but writes are minimal and rows are low, there is little benefit of dropping it. Keep in mind that not having an index that you need is a lot worse than having an index you don’t need in most situations. I suggest starting off with a conservative approach.

Now that we have gotten rid of the low hanging fruit, we can get deeper. I get a list of tables by size and work my way down.

select object_name(object_id), max(rows)

from sys.partitions

group by object_name(object_id)

order by 2 desc

After running that query we see that tEvent is the largest table and should provide nice gains. This is where we start. Let’s use this schema for our example.

create table tEvent

(EventID int primary key clustered,

EventType int,

EventName varchar(100),

EventDetailID int,

CustomerID int,

CompanyID int,

DateOpen datetime,

DateClose datetime)

 

create index ind1 on tEvents(EventDetailID);

 

create index ind2 on tEvents(CustomerID);

 

create index ind3 on tEvents(EventDetailID, CustomerID, CompanyID);

 

create index ind4 on tEvents(CompanyID, DateClose, EventType);

 

create index ind5 on tEvents(CustomerID, DateClose, EventType);

Removing used but redundant indexes

If we do a sp_helpindex, we can see that ind3 should satisfy queries currently using ind1 and ind2. The keyword is “should”. It’s a fairly safe bet so we drop ind1 and ind2. Now we should monitor the missing index DMV to see if there is any negative impact. We should also see the read count substantially increase by using the unused index query filtered by tEvent. Here is the missing index monitoring query:

--Missing indexes

SELECT sys.objects.name

, (avg_total_user_cost * avg_user_impact) * (user_seeks + user_scans) as Impact

, mid.equality_columns

, mid.inequality_columns

, mid.included_columns

FROM sys.dm_db_missing_index_group_stats AS migs

INNER JOIN sys.dm_db_missing_index_groups AS mig ON migs.group_handle = mig.index_group_handle

INNER JOIN sys.dm_db_missing_index_details AS mid ON mig.index_handle = mid.index_handle

INNER JOIN sys.objects WITH (nolock) ON mid.object_id = sys.objects.object_id

WHERE     (migs.group_handleIN

(

SELECT     TOP (5000) group_handle

FROM sys.dm_db_missing_index_group_stats WITH (nolock)

ORDER BY (avg_total_user_cost * avg_user_impact) * (user_seeks + user_scans) DESC)

)

and objectproperty(sys.objects.object_id, 'isusertable')=1and name = 'tEvent'

ORDER BY 2 DESC , 3 desc

As long as the indexes do not come back up in this report, we should be ok. Now we are servicing the same queries with 2 less indexes. Those are index pages that are no longer taking up buffer pool space!

Combining indexes

The same concept can be applied to the following scenario but a little more “feel” and understanding of how the app accesses the data is needed.

create index ind4 on tEvents(CompanyID, DateClose, EventType);

 

create index ind5 on tEvents(CustomerID, DateClose, EventType);

We know a former DBA added these. It is logical for us, based on our knowledge of the app and the cardinality of the data, to try to replace these indexes with this one.

create index ind6 on tEvents(CustomerID, CompanyID,  DateClose, EventType) with (online=on, maxdop=8);

This index should satisfy all queries using both indexes. Again, we want go back to the missing index report to see if SQL thinks it needs one of those indexes.

Now work your way down your list of tables by row count. I was able to reduce the size of a 200GB database by 15% in addition to removing the unused indexes. That’s a big gain! The only problem I ran into was changing the name of an index that had a hint. There were a few indexes that I had to add back but since my approach was conservative it was nothing drastic and completely online.

Let me know if you have any other tips or questions regarding this topic.

* Cross posted from http://statisticsio.com 

Comments

No Comments

About JasonMassie

Jason is a SQL Server Consultant for the professional services organization of Terremark (Formerly Data Return LLC) where he has worked for the last 8 1/2 years. Jason is an MCITP Database Administrator\Database Developer as well as an MCDBA on 7.0 and 2000. You can read his blog at http://statisticsio.com. He lives in Irving, TX. He enjoys time with his wife and three daughters as well as making electronic music. He can be reached at http://linkedin.com/in/jasonmassie or jmassie@terremark.com
Powered by Community Server (Commercial Edition), by Telligent Systems