Creating Baselines for P & O

The most important part of performance tuning is to be able to establish a baseline. Without a baseline and on-going statistics and trending information you just cannot make informed judgements concerning performance degredation or improvement or on future growth.

I have three standard sets of data I collect as a minimum, here is the first.

There's plenty of articles around concerning the use of fn_virtualfilestats so I'm going to concentrate on how I use this function and the data it generates.

Here's a typical call and result set:-

select * from ::fn_virtualfilestats (11,-1)

DbId   FileId TimeStamp   NumberReads          NumberWrites         BytesRead            BytesWritten         IoStallMS            
------ ------ ----------- -------------------- -------------------- -------------------- -------------------- -------------------- 
11     1      179531484   58                   798                  2375680              7946240              389
11     2      179531484   8                    10160                337408               11038720             0

I collect data every hour using a scheduled job and store it in a table
Here's a typical subset of data

time              dbid fileid reads                writes               iostall              
----------------- ---- ------ -------------------- -------------------- -------------------- 
Nov  2 2006  3:00 11   1      407245               71259                1166859
Nov  2 2006  3:00 11   2      10874                327285               34415
Nov  2 2006  4:00 11   1      407253               71259                1166984
Nov  2 2006  4:00 11   2      10874                327285               34415
Nov  2 2006  5:00 11   1      407263               71259                1167046
Nov  2 2006  5:00 11   2      10874                327285               34415
Nov  2 2006  6:00 11   1      407263               71259                1167046
Nov  2 2006  6:00 11   2      10874                327285               34415
Nov  2 2006  7:00 11   1      407357               71259                1167358
Nov  2 2006  7:00 11   2      10874                327285               34415
Nov  2 2006  8:00 11   1      407477               71259                1167700
Nov  2 2006  8:00 11   2      10874                327285               34415
Nov  2 2006  9:00 11   1      407591               71259                1167965
Nov  2 2006  9:00 11   2      10874                327285               34415
Nov  2 2006 10:00 11   1      407614               71259                1168090

Trouble is that the numbers are cumlative and thus I have to do subtraction to obtain the figures for the hour.
Well in principle this sounds easy but in reality if you're collecting data from say, 14 databases every hour, the query to extract that difference and the hour in which it occurred is "complex"
So I take the easy way out and use calculated columns with functions doing the maths, this gives me reuslts similar to this.

time              dbid fileid reads                writes               iostall              hourreads            hourwrites           
----------------- ---- ------ -------------------- -------------------- -------------------- -------------------- -------------------- 
Oct 13 2006  2:28 11   1      101620751            27122598             846102670            NULL                 NULL
Oct 13 2006  2:28 11   2      3042305              70473357             432497               NULL                 NULL
Oct 13 2006  4:00 11   1      101699390            27143997             846966601            43345                11543
Oct 13 2006  4:00 11   2      3042559              70551758             432513               183                  52881
Oct 13 2006  5:00 11   1      101727442            27155013             847248234            28052                11016
Oct 13 2006  5:00 11   2      3042618              70588701             432513               59                   36943
Oct 13 2006  6:00 11   1      101760009            27164425             847612578            32567                9412
Oct 13 2006  6:00 11   2      3042668              70618819             432529               50                   30118
Oct 13 2006  7:00 11   1      101779677            27172467             847779295            19668                8042
Oct 13 2006  7:00 11   2      3042717              70660817             432529               49                   41998
Oct 13 2006  9:00 11   1      101824133            27187730             848222714            12645                7634
Oct 13 2006  9:00 11   2      3042801              70725796             432529               41                   31570

I have a job which publishes this data out to a web page every day and rolls up the previous weeks data into averages to give an overall view of i/o.

Points:-

  • I sample every hour, generally this is fine, for heavy diagnostic work I might actually sample more often, say 10 mins.
  • This type of data collection can help to pinpoint databases and times under load, if you have multiple files then you'll be able to see if the load on the files is balanced.
  • Conversely it can help finding the quiet times to assist in locating maintenence windows in 24 x 7 systems.
  • Once you have a few months of data the analysis can become more interesting.
  • Functions in tables generally don't give good performance, however it does make things more simple.
  • An index on column TheDate will aid query performance
  • Ideally the data would be better extracted to a cube.
  • Notwithstanding the performance hit you can do some really cool things with functions as calculated columns, including joins and/or lookups to other tables.
  • I wouldn't recommend this for high performance oltp databases but for reporting and such it is really very handy. I've been using this technique since sql 2000 was released.

To populate the table I have a job set to run hourly which would have a step such as this:-

--
insert into ServerAdmin.dbo.Filestats(DbID,FileID,StampTime,Reads,Writes,BytesRead,BytesWrite,ioStall)
select * from ::fn_virtualfilestats (7,-1)
--
insert into ServerAdmin.dbo.Filestats(DbID,FileID,StampTime,Reads,Writes,BytesRead,BytesWrite,ioStall)
select * from ::fn_virtualfilestats (10,-1)
--
insert into ServerAdmin.dbo.Filestats(DbID,FileID,StampTime,Reads,Writes,BytesRead,BytesWrite,ioStall)
select * from ::fn_virtualfilestats (11,-1)
--
insert into ServerAdmin.dbo.Filestats(DbID,FileID,StampTime,Reads,Writes,BytesRead,BytesWrite,ioStall)
select * from ::fn_virtualfilestats (2,-1)
--

I keep this simple, no generic or dynamic coding, note that I always collect stats for Tempdb.

Here's part of a typical report and the code for the tables and functions, the functions must be created prior to the table creation.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Table:-

if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[FileStats]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[FileStats]
GO

CREATE TABLE [dbo].[FileStats] (
	[NumKey] [int] IDENTITY (1, 1) NOT NULL ,
	[TheDate] [datetime] NULL ,
	[DbID] [tinyint] NULL ,
	[FileID] [tinyint] NULL ,
	[StampTime] [bigint] NULL ,
	[Reads] [bigint] NULL ,
	[Writes] [bigint] NULL ,
	[BytesRead] [bigint] NULL ,
	[BytesWrite] [bigint] NULL ,
	[ioStall] [bigint] NULL ,
	[DbName] AS (db_name([DbID])) ,
	[HourWrites] AS ([dbo].[fn_FileStats_Write]([TheDate], [dbid], [fileid], [Writes])) ,
	[HourReads] AS ([dbo].[fn_FileStats_Read]([TheDate], [dbid], [fileid], [Reads])) ,
	[HourIo] AS ([dbo].[fn_FileStats_IOStall]([TheDate], [dbid], [fileid], [iostall])) ,
	[AvIoStall] AS ([dbo].[fn_FileStats_AvIoStall2]([TheDate], [dbid], [fileid], [Reads], [Writes], [ioStall])) 
) ON [PRIMARY]
GO

ALTER TABLE [dbo].[FileStats] ADD 
	CONSTRAINT [DF__FileStats__TheDate] DEFAULT (getdate()) FOR [TheDate],
	CONSTRAINT [uk_FileStats_Numkey] UNIQUE  NONCLUSTERED 
	(
		[NumKey]
	)  ON [PRIMARY] 
GO
--
--
--

Functions:- -- -- CREATE function [dbo].[fn_FileStats_Write] (@p1 smalldatetime,@p2 tinyint,@p3 tinyint,@p4 bigint) -- ============================================================= -- Function: fn_FileStats_Write -- Written by: Colin Leversuch-Roberts -- www.kelemconsulting.co.uk -- (c) 6th Jan 2004 -- -- Purpose: Populates a computed column in table FileStats -- -- System: DBA maintenance -- -- Input Paramters: @p1 smalldatetime, @p2 tinyint, @p3 tinyint, @p4 bigint -- row timestamp, database id, file id, number of writes -- -- Returns : bigint -- -- Usage: Computed column in table { select dbo.fn_FileStats_Write(@p1,@p2,@p3,@p4) } -- -- Notes: This function calculates the number of writes recorded against the file id -- for the last hour. The writes, reads and i/o stall figures are cumlative hence -- the calulations required to work out the figures -- -- VERSION HISTORY -- Version No Date Description -- 1 6 jan - 2004 Initial Release -- ============================================================= returns bigint as begin declare @Return bigint,@Writes bigint select @Writes=Writes from dbo.filestats with (nolock) where convert(char(15),theDate,113)+'00'=convert(char(15),dateadd(hh,-1,@p1),113)+'00' and dbid=@p2 and fileid=@p3 IF @p4=@Writes set @return=0 ELSE set @Return=abs(@p4-@Writes) --endif return(@Return) end -- -- -- CREATE function [dbo].[fn_FileStats_Read] (@p1 smalldatetime,@p2 tinyint,@p3 tinyint,@p4 bigint) -- ============================================================= -- Function: fn_FileStats_Read -- Written by: Colin Leversuch-Roberts -- www.kelemconsulting.co.uk -- (c) 6th Jan 2004 -- -- Purpose: Populates a computed column in table FileStats -- -- System: DBA maintenance -- -- Input Paramters: @p1 smalldatetime, @p2 tinyint, @p3 tinyint, @p4 bigint -- row timestamp, database id, file id, number of writes -- -- Returns : bigint -- -- Usage: Computed column in table { select dbo.fn_FileStats_Read(@p1,@p2,@p3,@p4) } -- -- Notes: This function calculates the number of reads recorded against the file id -- for the last hour. The writes, reads and i/o stall figures are cumlative hence -- the calulations required to work out the figures -- -- VERSION HISTORY -- Version No Date Description -- 1 6 jan - 2004 Initial Release -- ============================================================= returns bigint as begin declare @Return bigint,@Reads bigint select @Reads=Reads from dbo.filestats with (nolock) where convert(char(15),theDate,113)+'00'=convert(char(15),dateadd(hh,-1,@p1),113)+'00' and dbid=@p2 and fileid=@p3 -- have to cope with divide by zero problem -- rats! makes code less neat!! IF @p4=@Reads set @return=0 ELSE set @Return=abs(@p4-@Reads) --endif return(@Return) end -- -- -- CREATE function [dbo].[fn_FileStats_IOStall] (@p1 smalldatetime,@p2 tinyint,@p3 tinyint,@p4 bigint) -- ============================================================= -- Function: fn_FileStats_IOStall -- Written by: Colin Leversuch-Roberts -- www.kelemconsulting.co.uk -- (c) 6th Jan 2004 -- -- Purpose: Populates a computed column in table FileStats -- -- System: DBA maintenance -- -- Input Paramters: @p1 smalldatetime, @p2 tinyint, @p3 tinyint, @p4 bigint -- row timestamp, database id, file id, number of writes -- -- Returns : bigint -- -- Usage: Computed column in table { select dbo.fn_FileStats_IOStall(@p1,@p2,@p3,@p4) } -- -- Notes: This function calculates the iostall in ms recorded against the file id -- for the last hour. The writes, reads and i/o stall figures are cumlative hence -- the calulations required to work out the figures -- -- VERSION HISTORY -- Version No Date Description -- 1 6 jan - 2004 Initial Release -- ============================================================= returns bigint as begin declare @Return bigint,@IOStall bigint select @IOStall=IOStall from dbo.filestats with (nolock) where convert(char(15),theDate,113)+'00'=convert(char(15),dateadd(hh,-1,@p1),113)+'00' and dbid=@p2 and fileid=@p3 -- have to cope with divide by zero problem -- rats! makes code less neat!! IF @p4=@IOStall set @return=0 ELSE set @Return=abs(@p4-@IOStall) --endif return(@Return) end -- -- -- CREATE function dbo.fn_FileStats_AvIoStall2 (@p1 datetime,@p2 tinyint, @p3 tinyint,@p4 bigint,@p5 bigint,@p6 bigint) -- ============================================================= -- Function: fn_FileStats_AvIoStall2 -- Written by: (c) Colin Leversuch-Roberts -- www.kelemconsulting.co.uk -- -- Purpose: to produce an average time in ms for the file io -- -- System: DBA maintenance -- -- Input Paramters: @p1 datetime, @p2 tinyint, @p3 tinyint,@p4 bigint, @p5 bigint, @p6 bigint -- datetime , database id, file id, reads, writes, io stall -- -- Returns : decimal with 4 decimal points -- -- Usage: select dbo.fn_FileStats_AvIoStall2(@p1,@p2,@p3,@p4,@p5,@p6) -- -- Notes: function to calculate the average i/o stall per i/o for the last hour -- Can't do a direct function on the calculated columns as you -- can't use computed columns in a computed column, so do this way. -- -- VERSION HISTORY -- Version No Date Description -- 1 25-aug-2005 Initial Release -- -- ============================================================= returns decimal(9,4) as begin declare @reads bigint,@writes bigint,@iostall bigint,@return decimal(9,4) -- select @reads = dbo.fn_FileStats_Read(@p1,@p2,@p3,@p4) select @writes = dbo.fn_FileStats_Write(@p1,@p2,@p3,@p5) select @iostall = dbo.fn_FileStats_IOStall(@p1,@p2,@p3,@p6) -- IF @reads=0 and @writes=0 set @return = 0 ELSE set @return = abs(((@iostall*1.0)/(@reads+@writes))) --endif -- return(@return) end -- -- PS .. still struggling with creating blog entries , please accept my apologies for poor formatting!!

Published 24 November 2006 12:32 by GrumpyOldDBA

Comments

No Comments