Analysing Indexes Part 2

Here's the next instalment, expect there to at least another two or three.

 

  • I’m still only working with basic indexing
  • No partitioning
  • I’m not looking at xml indexes.

 

14.0      Collecting Raw Index usage data

  • Collects table name, index name , reads and writes where there is a value for reads or writes
  • Collect daily / hourly or as required
    • The greater the granularity the greater volume of data collected
  • Values stored in the dmv’s are cumulative so we have to store snapshots and query for weekly/daily/hourly differences to extract usage figures.

 

14.1  Table to store data – create in a monitoring database.

  • It’s best to create a dedicated database to collect data, most DBA’s have their own database
  • For the examples here we’ll assume it’s called ServerAdmin

 

 --

-- create in ServerAdmin database

--

Use ServerAdmin;

go

IF  EXISTS (SELECT * FROM sys.objects WHERE objectproperty(OBJECT_ID(N'[dbo].[tbl_IndexActivity]'),'IsUserTable') = 1)

DROP TABLE [dbo].[tbl_IndexActivity];

--

create table dbo.tbl_IndexActivity

(

TableName sysname not null,

IndexName sysname not null,

IndexType varchar(4) not null,

Reads bigint not null,

Writes bigint not null,

TheDate datetime  constraint DF_IndexActivity_TheDate default getdate() not null

);

 

14.2  Run this query in a scheduled job

  • I don’t actually create a stored procedure here as I would have to create it within the monitored database, something I might not want to do.

 

--

-- Run this is the context of the database to be analysed

-- every night for daily stats

--

insert into ServerAdmin.dbo.tbl_IndexActivity(TableName,IndexName,IndexType,Reads,Writes)

select object_name(s.object_id) as TableName, isnull(i.name,'HEAP') as IndexName,

case i.index_id

when 0 then 'HEAP'

when 1 then 'CLUS'

else 'NC'

end as IndexType

,reads=user_seeks + user_scans + user_lookups

,writes =  user_updates

from sys.dm_db_index_usage_stats s join sys.indexes i

on s.object_id = i.object_id and i.index_id = s.index_id

where objectproperty(s.object_id,'IsUserTable') = 1

and s.database_id = db_id()

order by reads desc;

go

  • This will create lots of data as it will create a row for each index which has been accessed
  • Remember that dmv’s contain cumulative data so we may find some indexes have not been used recently.
  • Once we have sets of data we can view the difference to see what’s happening
  • Although I’ve suggested this collection to run daily the interval could be any suitable value down to a couple of minutes for real time stats.
  • I’m using this basic data collection to decide which indexes I want to analyse in greater detail.

 

14.3      Most Writes for an hour period

  • Example assumes collection of data hourly
  • Modify the SARGs for ia1.thedate and ia2.thedate to return different data sets.
  • You may wish to to collect top 25 results in each category and store in another table
  • This data would make a good OLAP cube

 

--

-- Example query to extract index usage

-- Results for the hour 10:00 to 11:00 today

-- Highest Writes

--

select  ia1.tablename,ia1.indexname,ia2.reads-ia1.reads as Reads,ia2.writes-ia1.writes as Writes

from ServerAdmin.dbo.tbl_indexactivity ia1 join ServerAdmin.dbo.tbl_indexactivity ia2 on

ia1.tablename = ia2.tablename and ia1.indexname=ia2.indexname

where datepart(hh,ia1.thedate) = 10  and convert(char(8),ia1.thedate,112) = convert(char(8),getdate(),112)

and  datepart(hh,ia2.thedate) = 11 and convert(char(8),ia2.thedate,112) = convert(char(8),getdate(),112)

and ( (ia2.reads-ia1.reads) + (ia2.writes-ia1.writes) ) >0

order by (ia2.writes-ia1.writes) desc;

go

 

          Abridged results     

·          Table names and index names are fictitious

 

tablename

indexname

Reads

Writes

TransportLinks

PK_TransportLinks

153575

188427

EuropeSites

Idx_EuropeSites_Manager

42112

57345

RegionalManager

Uk_RegionalManager_Name

49716

50318

LocalAreaSales

Idx_LocalAreaSales_One

0

41580

LocalAreaSales

Idx_LocalAreaSales_Two

0

41580

 

14.4      Most Reads for an hour period

  • Assumes collection of data hourly

 

 --

-- Example query to extract index usage

-- Results for the hour 10:00 to 11:00 today

-- Highest Reads

--

select  ia1.tablename,ia1.indexname,ia2.reads-ia1.reads as Reads,ia2.writes-ia1.writes as Writes

from ServerAdmin.dbo.tbl_indexactivity ia1 join ServerAdmin.dbo.tbl_indexactivity ia2 on

ia1.tablename = ia2.tablename and ia1.indexname=ia2.indexname

where datepart(hh,ia1.thedate) = 10  and convert(char(8),ia1.thedate,112) = convert(char(8),getdate(),112)

and  datepart(hh,ia2.thedate) = 11 and convert(char(8),ia2.thedate,112) = convert(char(8),getdate(),112)

and ( (ia2.reads-ia1.reads) + (ia2.writes-ia1.writes) ) >0

order by (ia2.reads-ia1.reads) desc;

go

 

 

          Abridged results     

 

tablename

indexname

Reads

Writes

SalesItems

Uk_SalesItems_StockNumber

264492

4

SalesItems

Idx_SalesItems_Reject

257247

0

EmployeeGroup

PK_EmployeeGroup

212268

11

TransportLinks

PK_TransportLinks

153575

188427

RestrictedItems

Idx_RestrictedItems_Analysis

76818

38608

 

14.5  Most Reads and Writes for an hour period

 

 --

-- Example query to extract index usage

-- Results for the hour 10:00 to 11:00 today

-- Highest Reads and Writes

--

select  ia1.tablename,ia1.indexname,ia2.reads-ia1.reads as Reads,ia2.writes-ia1.writes as Writes

from ServerAdmin.dbo.tbl_indexactivity ia1 join ServerAdmin.dbo.tbl_indexactivity ia2 on

ia1.tablename = ia2.tablename and ia1.indexname=ia2.indexname

where datepart(hh,ia1.thedate) = 10  and convert(char(8),ia1.thedate,112) = convert(char(8),getdate(),112)

and  datepart(hh,ia2.thedate) = 11 and convert(char(8),ia2.thedate,112) = convert(char(8),getdate(),112)

and ( (ia2.reads-ia1.reads) + (ia2.writes-ia1.writes) ) >0

order by ((ia2.reads-ia1.reads)+(ia2.writes-ia1.writes)) desc;

go

 

 

          Abridged results     

 

tablename

indexname

Reads

Writes

TransportLinks

PK_TransportLinks

153575

188427

SalesItems

Uk_SalesItems_StockNumber

264492

4

SalesItems

Idx_SalesItems_Reject

257247

0

EmployeeGroup

PK_EmployeeGroup

212268

11

RestrictedItems

Idx_RestrictedItems_Analysis

76818

38608

NearestPub

Uk_NearestPub

49716

50318

 

14.6  Most Reads and Writes showing %age proportion for an hour period

 

 --

-- Example query to extract index usage

-- Results for the hour 10:00 to 11:00 today

-- Highest Reads and Writes with proportion shown

--

select  ia1.tablename,ia1.indexname,ia2.reads-ia1.reads as Reads,ia2.writes-ia1.writes as Writes

, convert(int, (ia2.reads-ia1.reads)*1.0/((ia2.reads-ia1.reads)+(ia2.writes-ia1.writes))*100.0) as 'Read%'

, convert(int, (ia2.writes-ia1.writes)*1.0/((ia2.reads-ia1.reads)+(ia2.writes-ia1.writes))*100.0) as 'Write%'

from ServerAdmin.dbo.tbl_indexactivity ia1 join ServerAdmin.dbo.tbl_indexactivity ia2 on

ia1.tablename = ia2.tablename and ia1.indexname=ia2.indexname

where datepart(hh,ia1.thedate) = 10  and convert(char(8),ia1.thedate,112) = convert(char(8),getdate(),112)

and  datepart(hh,ia2.thedate) = 11 and convert(char(8),ia2.thedate,112) = convert(char(8),getdate(),112)

and ( (ia2.reads-ia1.reads) + (ia2.writes-ia1.writes) ) >100

order by ((ia2.reads-ia1.reads)+(ia2.writes-ia1.writes)) desc;

go

 

 

          Abridged results     

 

tablename

indexname

Reads

Writes

Read%

Write%

TransportLinks

PK_TransportLinks

153575

188427

44

55

SalesItems

Uk_SalesItems_StockNumber

264492

4

99

0

SalesItems

Idx_SalesItems_Reject

257247

0

100

0

EmployeeGroup

PK_EmployeeGroup

212268

11

99

0

RestrictedItems

Idx_RestrictedItems_Analysis

76818

38608

66

33

NearestPub

Uk_NearestPub

49716

50318

49

50

 

  • So this allows us to view index usage at the granularity of data collection.
  • For the application I’m supporting data collection grows at 1,700 rows per hour, that’s about 1.3 million rows per month
  • This data will allow us to collect more detailed information from the dmv’s to fully analyse index usage
  • I’d probably suggest selecting the top 100 indexes for each category for a 7 day period and using this to extract more detailed statistics. ( This is in Part 3 )

 

15.0  So Just how many indexes do I have ?

 

--

-- How many indexes ?

-- run in context of database to where they are to be counted

--

select type_desc,count(*) from sys.indexes

where objectproperty(OBJECT_ID,'IsUserTable') = 1

group by type_desc;

go

 

 

type_desc

(No column name)

CLUSTERED

2936

HEAP

17

NONCLUSTERED

4067

 

15.1  How many empty tables do I have ?      

  • Many third party applications have unused sections which may result in many empty tables
  • Having a definitive list of unused tables will help provide a filter to use in subsequent analysis
  • As far as I can see this method is accurate, however, row count is not so easily found in SQL 2005, it’s available within system functions which cannot be accessed normally.

 

15.2      Create this table in the ServerAdmin database

 

 USE [ServerAdmin];

GO

--

IF  EXISTS (SELECT * FROM sys.objects WHERE objectproperty(OBJECT_ID(N'[dbo].[tbl_EmptyTables]'),'IsUserTable') = 1)

DROP TABLE [dbo].[tbl_EmptyTables];

--

create table dbo.tbl_EmptyTables

(

TableName sysname not null,

[Object_id] bigint not null,

IndexName sysname not null,

Row_Count bigint not null

);

go

 

15.3  Run this script in the context of the database to be analysed

  • Extracts by clustered index and by HEAP
  • Easiest to use dbo.sysindexes although not ideal.

 

 --

-- run in database to be analysed

--

insert into ServerAdmin.dbo.tbl_EmptyTables(TableName,[Object_id],IndexName,Row_Count)

select object_name(id),id,name,rows from dbo.sysindexes

where rows = 0 and indid = 1

and objectproperty(ID,'IsUserTable') = 1;

--

insert into ServerAdmin.dbo.tbl_EmptyTables(TableName,[Object_id],IndexName,Row_Count)

select object_name(id),id,isnull(name,'HEAP'),rows from dbo.sysindexes

where rows = 0 and indid = 0

and objectproperty(ID,'IsUserTable') = 1;

go

 

 

15.4  Add this unique index to check there are no duplicate entries

 

 use ServerAdmin;

go

--

create unique index uk_tbl_EmptyTables_Object_ID on dbo.tbl_EmptyTables([object_id]);

go

 

  • This produces a result set of over 2,000 for the application I am supporting

 

16.0  How many statistics?

  • Now the viewing of statistics is somewhat different in SQL 2005
  • We don’t want to see stats on indexes
  • You can also keep a watch on how many “auto stats” your database has with this query
  • I generally consider high numbers of auto stats being indicative of inadequate indexing.
  • I generally will periodically delete all system stats from a database
  • If there were stats and you create an index the stats remain
  • The stats may have come from who knows when and may not be relevant any more
  • As always be careful, your database may take a temporary performance hit if it’s running on auto stats
  • The subject of statistics will be discussed later in this series

 

16.1  List all statistics

 

--

-- show statistics in the database

--

select object_name(s.object_id),* from sys.stats s

where objectproperty(s.OBJECT_ID,'IsUserTable') = 1

and s.auto_created | s.user_created = 1;

go

 

ü       The symbol in the AND statement ( | ) is SHIFT + Backslash ( left of Z on my keyboard )

 

Abridged results

 

(No column name)

Object

id

name

Stats

id

Auto

created

User

created

No

recompute

Employees

119887

sts_Employees1

2

0

1

0

Employees

119887

sts_Employees2

3

0

1

0

Sales

1187550

_WA_Sys_00000002_08B54D69

2

1

0

0

RegionSouth

1656045

sts_RegionSouth_Pole

3

0

1

0

RegionNorth

1656045

_WA_Sys_Rows_75D7831F

4

1

0

0

 

  • System generated statistics always start _WA_Sys_
  • There are also hypothetical indexes which you should never see, these start   hind_%

 

 

16.2  How many statistics do I have ?

 

--

-- How many statistics ?

-- run in contect of database to whose stats are to be counted

--

select

case convert(char(1),auto_created)+convert(char(1),user_created)

when '10' then 'Auto Created'

when '01' then 'User Created'

else 'Both'

end as StatsType,

count(*) as HowMany from sys.stats s

where objectproperty(s.OBJECT_ID,'IsUserTable') = 1

and s.auto_created | s.user_created = 1

group by convert(char(1),auto_created)+convert(char(1),user_created);

go

 

 

StatsType

HowMany

User Created

7786

Auto Created

121

 

15.3  View to display Statistics Information

  • It used to be quite simple to view statistics, as against indexes in SQL 2000
  • Not so easy in SQL 2005
  • This view which should be created in the user database that you wish to display basic information on statistics

 

--

-- view to provide information on statistics only

-- filters out index stats and system objects

-- per database create in database

--

IF  EXISTS (SELECT * FROM sys.views WHERE object_id = OBJECT_ID(N'[dbo].[dm_db_statistics]'))

DROP VIEW [dbo].[dm_db_statistics];

go

create view dbo.dm_db_statistics

as

select object_name(s.[object_id]) as TableName,c.name as ColumnName,s.name as StatName,s.auto_created,s.user_created,s.no_recompute,s.[object_id],

s.stats_id,sc.stats_column_id,sc.column_id,stats_date(s.[object_id], s.stats_id) as LastUpdated

from sys.stats s join sys.stats_columns sc on sc.[object_id] = s.[object_id] and sc.stats_id = s.stats_id

join sys.columns c on c.[object_id] = sc.[object_id] and c.column_id = sc.column_id

where objectproperty(s.OBJECT_ID,'IsUserTable') = 1

and s.auto_created | s.user_created = 1;

go

 

 

Sample results ( please excuse formatting )

 

Table Name

Column Name

Stat Name

Auto created

User created

No recompute

Object id

Stats id

Stats column id

Column id

Last Updated

IndexStats

LastUpdate

_WA_Sys_ IndexStats _164452B1

1

0

0

373576369

2

1

10

2007-07-16 16:34:21.780

WeeklyStats

counter_name

_WA_Sys_counter_name_173876EA

1

0

0

389576426

2

1

3

2007-07-16 16:34:21.827

Product2007

header

_WA_Sys_header_24927208

1

0

0

613577224

2

1

1

2007-07-16 16:34:21.890

Trace20070707

TextData

_WA_Sys_00000003_2D27B809

1

0

0

757577737

2

1

3

2007-07-10 14:15:07.863

 

 

  • When there are many indexes and statistics the results set can be large and make analysis difficult
  • The application I support has over 15,000 indexes and statistics, many of these relate to unpopulated tables.
  • Should the system procedure  sp_createstats  have been run against the database then statistics will have been created on empty tables.
  • This can be a useful procedure in as much as you can command it to create statistics on columns of multi column or compound indexes
  • However, great as this may sound, if you have a large number of covered indexes or precise compound indexes you might not actually benefit from these extra statistics – and these will have to be maintained – so use this proc with care.

 

http://msdn2.microsoft.com/en-us/library/ms186834.aspx

 

( to be continued )

Published 22 July 2007 22:38 by GrumpyOldDBA

Comments

# re: Analysing Indexes Part 2

01 August 2007 13:03 by masri999

Hello,

I have visited several sites and several scripts . Most of the performance tuning scripts stops at collecting the information.

This is of little use. How to read the output value and what is the threshold limit of each value and what action need to be taken to correct the issue are missing . Usual answer for this 'it depndends'

Eventhough some of the values depend on server configuration (disk,cpu etc)  , It is better to post the information for this server hardware configuration , these are the values recorded and  this value is above the threshold like disk seek time etc . To overcome this need to take following steps with alternatives .

If you can post threshold values for windows performance counter for a given server configuration and load, it will be wonderful

Thanks

Srinivas

# re: Analysing Indexes Part 2

01 August 2007 20:45 by GrumpyOldDBA

I think you're missing the point - this is part two of a series of articles leading through some of the processes I use to analyse index performance. Primarily it's based around third party applications - the sort of thing the average DBA has to support , so the DBA has had no influence on the design or architecture of the application database. It is a task to take such a database apart, which will always be a critical production system, and make it run better. In a controlled environment one has to provide hard evidence to back changes to a database, especially if it happens to be a big application - the series of posts will cover this and how to decide what you should do.

As to your assertion there should be hard and fast values - sadly life's not like that, that's how I make a living.

# sys.dm_db_index_physical_stats

22 October 2007 22:30 by Grumpy Old DBA

It seems my comments in my SQLBits presentation about dropping sys.dm_db_index_physical_stats into a

# analyzing usage of indexes?! | keyongtech

Pingback from  analyzing usage of indexes?! | keyongtech

# Articles about _wa_sys_ volume 6 « Article Directory

Pingback from  Articles about _wa_sys_ volume 6 «  Article Directory