Show me some Cardinality

In the word of data – Cardinality issues is one issue you will already have encountered or will do @ some stage, having statistical information up to date is vital for the Query optimizer choosing a good query plan

So @ times you may have to explore the Tuple Cardinality & Table Cardinality values

In the screen shot below the top set of values is after a Statistics update has been run, set below was the previous

UPDATE STATISTICS schemaname.tablename WITH FULLSCAN, INDEX

·          The TUPLE_CARDINALITY = TABLE_CARDINALITY for 1st  row  {Note: This is a Clustered Primary Key index}

·          2nd - 4th records are for another non-clustered index which has 2 columns

o    2nd & 3rd rows represent these 2 columns (Notice the ORDINAL_POSITION)

o    4th is the non-clustered internal CI key

·          TUPLE_CARDINALITY values had considerable different values before/after the statistics update

o    2nd row has TUPLE_CARDINALITY = 33

§   Matches the number of distinct values for leading key column for nc index

o    3rd rows has TUPLE_CARDINALITY = 1058

§   Matches the number of distinct values for both columns in the nc index

o    4th rows, TUPLE_CARDINALITY  = 30496916

§   Previously was 229923 that’s just 0.75% of the table rows  ** That was my Red flag **

§   After update it now Matches the number of distinct values for CI key

So clearly something was not as accurate as it could be and the clustered & non-clustered indexes tuples seemed to be a long way from the values I expected them to be

The following code was used to capture these values

DROP TABLE #STATS

GO

CREATE TABLE #STATS

(

 TABLE_CATALOG       VARCHAR(128)

,TABLE_SCHEMA        VARCHAR(128)

,TABLE_NAME          VARCHAR(128)

,STATISTICS_CATALOG  VARCHAR(128)

,STATISTICS_SCHEMA   VARCHAR(128)

,STATISTICS_NAME     VARCHAR(128)

,STATISTICS_TYPE     INT

,COLUMN_NAME         VARCHAR(128)

,COLUMN_GUID         VARCHAR(128)

,COLUMN_PROPID       VARCHAR(128)

,ORDINAL_POSITION    INT

,SAMPLE_PCT          INT

,LAST_UPDATE_TIME    DATETIME

,NO_OF_RANGES        INT

,COLUMN_CARDINALITY  INT

,TUPLE_CARDINALITY   INT

,TABLE_CARDINALITY   INT

,AVG_COLUMN_LENGTH   INT

)

GO

INSERT #STATS EXEC ('DBCC SHOW_STATISTICS (''<dbname>.<schemaname>.<tablename>'',''<clusteredindexname>'') WITH STAT_HEADER JOIN DENSITY_VECTOR');

INSERT #STATS EXEC ('DBCC SHOW_STATISTICS (''<dbname>.<schemaname>.<tablename>'',''<nonclusteredindexname>'') WITH STAT_HEADER JOIN DENSITY_VECTOR');

Just replace the values to suit your requirments

<dbname>.<schemaname>.<tablename>

<clusteredindexname>

<nonclusteredindexname>

I have used the DBCC SHOW_STATISTICS WITH STAT_HEADER JOIN DENSITY_VECTOR other options are avialable, however these would not have given me the information I wanted

  • WITH STAT_HEADER
  • WITH DENSITY_VECTOR
  • WITH HISTOGRAM

I would look to evaluate the accuracy of the statistics on a table as part of any performance troubleshooting, if values are not as expected this may be an area to investigate further

This technique could also be used to gather statistics metrics easily @ some regular intervals for baselining purposes

Published 14 January 2012 21:10 by NeilHambly
Filed under:

Comments

No Comments