Given the choice 8 out of 10 Optimisers prefer.........

Did you know that included columns do not partake in the uniqueness of a unique index? ( see below )

A few months ago we upgraded our major production system from SQL2000 to SQL2008, this has allowed me to apply some of the index tuning techniques I devised for SQL2005 way back when to the current environment now we're confident we have no unexpected surprises to surface.

Amongst the techniques I use is to pull information from the dmvs to find tables ( and indexes ) which are getting high numbers of scans. The main aim is to then seek out effetcive indexes to turn the scans to seeks, this will help reduce io for selects and hopefully reduce contention where shared locks may hinder data changes. I also view the number of updates as this helps me find those tables which don't suffer high volumes of changes, tables which rarely change can support more aggressive indexing to improve data extraction.

see my previous post

I should point out that this is just one way I approach tuning, reducing lookups ( bookmark lookups in old language ) and scans ( table / clustered index scans ) has to give positive benefits.

Anyway amongst the highest scans with no seeks was a very narrow table, an identity plus two integers, this table was showing approx 2,000 scans per minute on the clustered index. Essentially this table is used to link data, it doesn't get updated very often.
I took a judgement on what secondary index I thought would be required and created an index on one column, let's call it CustomerID and included the other column, we'll call this LocationID, the combination of the two columns is unique, although there are no constraints or indexes to enforce this.

After applying the index I watch how the indexes are being used, in this instance the scans on the table stopped and the new index showed seeks. Now I should point out that this technique can only be applied to a live database I am index tuning not query tuning, I have no idea where the table is used in the application, there are just under 2,000 stored procedures and this table is used in 7 of them. This isn't about poor execution, high io or contention this is 100% about how the table is being accessed.
Experience has taught me that sometimes the only way to resolve performance issues is to work in the live database, it's not something I prefer but how many times have you taken a poorly performing query from production to test/dev only to discover it runs perfectly ? I've even restored a backup to point in time in an attempt to reproduce a badly performing query, it's never worked btw.

I digress, but thought I should make it clear what I'm doing and why. Having created what appeared to be the ideal index idle curiosity prompted me to create another index but with the columns reversed, LocationID include CustomerID. To my surprise this index was quickly used too, along with the original secondary index, in all this time there has been no access to the base clustered index/table at all.

At this stage I investigated if I could make the index unique, OK so maybe it just had never ocurred to me before but columns which are included do not partake in the uniqueness, so my unique index creation failed.
Solution, drop the included columns and make indexes on both columns.

Now I have 4 indexes on the two columns,  CustomerID+LocationID,  LocationID+CustomerID,  CustomerID include LocationID,   LocationID include CustomerID   plus the original clustered index on the identity column.

Which index(s) get used ? Well the included indexes are ignored and the optimiser now uses only the two conventional indexes. What is interesting in all of this is that given any one of the four indexes on their own the optimiser uses that index and never touches the base table, where there is a conventional index which has the same column order as an included index the included index is ignored.

From my view I've removed another table which was only being scanned, however the table is small so I suspect that for a much larger table things might turn out different. Originally there were close to 100 tables out of 600 odd which were showing scans and/or lookups without seeks, I've cut this to under 80 so far, generally I have to find all the stored procedures which use a table and step though every query, many of the procedures are 1,000s of lines and very complex so a slow task. The complexity of some parts of the system make conventional query analysis almost impossible, but that's another story on how I approach that.


Published Friday, June 4, 2010 12:04 PM by GrumpyOldDBA


No Comments