In SQL 2000 I wanted to find unused
indexes but found it to be difficult. I had to capture a huge trace that fully represents
the workload and “trust” that the ITW knows what it’s talking about. You could also
use the scan started trace event filtered by dbid, objectid, indexid. This procedure
was very tedious.
In SS2005, a quick query of the
dynamic management views lets you know which indexes are not being used. You can
do this in a few minutes what had previously taken days or weeks. This alone is
a very powerful feature.
I am proposing that you take it
a step further. We are going to use a little logic and the missing index DMV’s to
combine indexes and remove indexes that are still used but redundant. The steps
in this process would look like this:
1. Remove unused indexes with the unused
index script
2. Get your list of tables to analyze.
3. Remove redundant but used indexes.
4. Revue missing index DMV’s for mistaken
index drops.
5. Combine indexes that where it is
logical to do so.
6. Revue missing index DMV’s for mistaken
index drops.
This methodology is most effective
and viable when these conditions are met:
·
The server has been online and thus collecting stats for a long time.
·
The server is not pushing a hardware bottleneck. If so, this should be done during
maint window.
·
The server is enterprise edition and the tables allow online operations (i.e. no
LOB data or partitions).
What are the benefits of removing
unused indexes?
·
Reduced writes during updates
·
Reduced space usage
·
Reduced backup\restore space and time
·
Reduced index maintenance time
What are the benefits of removing
unused indexes, used but redundant indexes and combining indexes?
·
Less memory footprint for the same amount of data
·
A different angle to index tuning
·
Indexes are more likely to be covering
·
Reduced writes during updates
·
Reduced space usage
·
Reduced backup\restore space and time
·
Reduced index maintenance time
Now, let’s get down to business…
Remove Unused Indexes
This is pretty straightforward and
relatively safe. You want to make sure that your server has been up long enough
to get good index usage. This includes infrequent operations like month end reporting,
etc.
--Unused indexes
declare @dbid
int
select @dbid
= db_id()
select
object_name(s.object_id)
as ObjName
, i.name
as IndName
, i.index_id
, user_seeks + user_scans + user_lookups
as reads
, user_updates as writes
, sum(p.rows)
as rows
from
sys.dm_db_index_usage_stats s join sys.indexes i on
s.object_id
= i.object_id and
i.index_id = s.index_id
join
sys.partitions p on s.object_id =
p.object_id
and p.index_id= s.index_id
where
objectproperty(s.object_id,'IsUserTable') = 1 and s.index_id> 0 and s.database_id = @dbid
group
by object_name(s.object_id),
i.name, i.index_id, user_seeks+ user_scans +
user_lookups, user_updates
order
by reads, writes desc
You can most likely drop any indexes
with zero or close to zero reads. The more rows, the more space you will reclaim.
The more writes, the better write performance you get by removing them. If the index
is not being read but writes are minimal and rows are low, there is little benefit
of dropping it. Keep in mind that not having an index that you need is a lot worse
than having an index you don’t need in most situations. I suggest starting off with
a conservative approach.
Now that we have gotten rid of the
low hanging fruit, we can get deeper. I get a list of tables by size and work my
way down.
select
object_name(object_id), max(rows)
from
sys.partitions
group
by object_name(object_id)
order by 2
desc
After running that query we see
that tEvent is the largest table and should provide nice gains. This is where we
start. Let’s use this schema for our example.
create
table tEvent
(EventID int
primary key clustered,
EventType int,
EventName varchar(100),
EventDetailID int,
CustomerID int,
CompanyID int,
DateOpen datetime,
DateClose datetime)
create
index ind1 on tEvents(EventDetailID);
create
index ind2 on tEvents(CustomerID);
create
index ind3 on tEvents(EventDetailID, CustomerID,
CompanyID);
create
index ind4 on tEvents(CompanyID, DateClose,
EventType);
create index ind5
on tEvents(CustomerID,
DateClose, EventType);
Removing used but redundant indexes
If we do a sp_helpindex, we can
see that ind3 should satisfy queries currently using ind1 and ind2. The keyword
is “should”. It’s a fairly safe bet so we drop ind1 and ind2. Now we should monitor
the missing index DMV to see if there is any negative impact. We should also see
the read count substantially increase by using the unused index query filtered by
tEvent. Here is the missing index monitoring query:
--Missing indexes
SELECT sys.objects.name
, (avg_total_user_cost
* avg_user_impact)
* (user_seeks +
user_scans) as
Impact
, mid.equality_columns
, mid.inequality_columns
, mid.included_columns
FROM
sys.dm_db_missing_index_group_stats AS
migs
INNER
JOIN sys.dm_db_missing_index_groups
AS mig ON migs.group_handle =
mig.index_group_handle
INNER
JOIN sys.dm_db_missing_index_details
AS mid ON mig.index_handle =
mid.index_handle
INNER
JOIN sys.objects
WITH (nolock) ON mid.object_id =
sys.objects.object_id
WHERE
(migs.group_handleIN
(
SELECT
TOP
(5000) group_handle
FROM
sys.dm_db_missing_index_group_stats WITH
(nolock)
ORDER
BY (avg_total_user_cost
* avg_user_impact)
* (user_seeks +
user_scans) DESC)
)
and
objectproperty(sys.objects.object_id,
'isusertable')=1and name = 'tEvent'
ORDER BY 2
DESC , 3 desc
As long as the indexes do not come
back up in this report, we should be ok. Now we are servicing the same queries with
2 less indexes. Those are index pages that are no longer taking up buffer pool space!
Combining indexes
The same concept can be applied
to the following scenario but a little more “feel” and understanding of how the
app accesses the data is needed.
create
index ind4 on tEvents(CompanyID, DateClose,
EventType);
create index ind5
on tEvents(CustomerID,
DateClose, EventType);
We know a former DBA added these.
It is logical for us, based on our knowledge of the app and the cardinality of the
data, to try to replace these indexes with this one.
create index ind6
on tEvents(CustomerID,
CompanyID,
DateClose, EventType)
with (online=on,
maxdop=8);
This index should satisfy all queries
using both indexes. Again, we want go back to the missing index report to see if
SQL thinks it needs one of those indexes.
Now work your way down your list
of tables by row count. I was able to reduce the size of a 200GB database by 15%
in addition to removing the unused indexes. That’s a big gain! The only problem
I ran into was changing the name of an index that had a hint. There were a few indexes
that I had to add back but since my approach was conservative it was nothing drastic
and completely online.
Let me know if you have any other
tips or questions regarding this topic.
* Cross posted from http://statisticsio.com