The most important part of performance tuning is to be able to establish a baseline. Without a baseline and on-going statistics and trending information you just cannot make informed judgements concerning performance degredation or improvement or on future growth.
The three parts presented here allow for a pretty good collection of data for trending etc. I know the subject isn't particularly exciting and probably doesn't appeal to many!! I have no reason to believe that I won't carry this on to SQL Server 2005, true there's lots of management views, but you need hard hsitorical data for analysis if your server falls over.
I hope I've explained it in such a manner it can be understood, catch me at a user group meeting if you need any further clarification.
I have three standard sets of data I collect as a minimum, here is the third.
This is by far the largest post and contains the most amount of code. Again I’m not claiming the concepts behind this post are unique, far from it, but I’ve never seen anyone put all of it together as a complete solution.
So what am I talking about now? I collect perfmon stats and other counters in a sql server table which I also rollup. The very wide report shown below is an indication of what I collect.
For those not in the know master..sysperfinfo stores all the sql server perfmon counter data which is constantly updated, it’s not all in the easiest format to understand but nevertheless it’s all there. All I do is extract the information every hour, calculate average values for the hour and store them in a table. Every day I rollup the hourly data into daily averages which I store in separate tables.
Keeping a history of all this data is important for trending and prediction.
Disclaimer :- The set of perfmon counters I chose were a personal choice, they are not “the best” , “the optimal” or anything like that. As I mentioned in part 2 of this series I have usually had the benefit of SQL Diagnostic Manager in the background.
In addition to the perfmon stats I also collect data from sp_monitor system procedure. It’s worth noting there is a bug in this procedure that means a value goes out of range .. I forget exactly which counter it is but if you get such an error all that is required is to recreate the procedure changing all integer variable to bigints. Remember sp_monitor is a system procedure so you must use sp_ms_marksystemobject to return it to a system procedure.
I've had to split the report to allow me to snapshot it:-

The code presented isn’t “generic” , for performance reasons, so you do need to make some modifications to the code for it to work, please note all the comments regarding these changes. Once again I use a DBA database, this time ServerAdmin, and I use scheduled jobs to store the data.
The raw data is stored every hour, I have two different summary tables, one which holds the average values for the entire day, the other which holds the average values for the key hours , say 08:00 to 17:00.
I have a job which creates the report shown above every night, I’m not publishing how I produce the reports, although I’ll try to remember to include the query, as that would require yet another long blog. I hope to show my “reporting solution” one evening at a user group meeting.
( As a footnote I’ve been producing reports from SQL Server in this manner since 6.5 and this particular report for the past 3 or 4 years )
This is my documentation for this process, I’m hoping the formatting will convert correctly from word, I apologise for not sitting down and writing this in proper html but it’s a large document!
1. Overview
- The SQL Server Performance Monitor counters are stored internally within SQL Server in the table Sysperfinfo which is located within the master database. The purpose of this table is briefly documented within Books On Line.
- The SQL Server Performance Counters are variously documented by Microsoft, a good source of reference is “SQL Server 2000 Performance Tuning Technical Reference” – Microsoft Press - ISBN 0-7356-1270-6
- The available sql counters are shown below, a number of counters are generated for each database
|
Auto-Param Attempts/sec AWE lookup maps/sec AWE stolen maps/sec
AWE unmap calls/sec AWE unmap pages/sec AWE write maps/sec
Backup/Restore Throughput/sec Batch Requests/sec Bulk Copy Rows/sec
Bulk Copy Throughput/sec Cache Use Counts/sec Checkpoint pages/sec
DBCC Logical Scan Bytes/sec Extent Deallocations/sec Extents Allocated/sec
Failed Auto-Params/sec Forwarded Records/sec Free list empty/sec
Free list requests/sec Free list stalls/sec FreeSpace Page Fetches/sec
FreeSpace Scans/sec Full Scans/sec Index Searches/sec
Latch Waits/sec Lazy writes/sec Lock Requests/sec
Lock Timeouts/sec Lock Waits/sec Log Bytes Flushed/sec
Log Cache Reads/sec Log Flush Waits/sec Log Flushes/sec
Logins/sec Logouts/sec Mixed page allocations/sec
Number of Deadlocks/sec Page Deallocations/sec Page lookups/sec
Page reads/sec Page Splits/sec Page writes/sec
Pages Allocated/sec Probe Scans/sec Range Scans/sec
Readahead pages/sec Safe Auto-Params/sec Scan Point Revalidations/sec
Shrink Data Movement Bytes/sec Skipped Ghosted Records/sec SQL Compilations/sec
SQL Re-Compilations/sec Table Lock Escalations/sec Transactions/sec
Unsafe Auto-Params/sec Workfiles Created/sec Worktables Created/sec |
- A set of System Statistical Functions are also available which record data.
|
Function |
Description ( from Books On Line ) |
|
@@CONNECTIONS |
Returns the number of connections, or attempted connections, since SQL Server was last started. |
|
@@CPU_BUSY |
Returns the time in milliseconds (based on the resolution of the system timer) that the CPU has spent working since SQL Server was last started. |
|
@@IO_BUSY |
Returns the time in milliseconds (based on the resolution of the system timer) that SQL Server has spent performing input and output operations since it was last started. |
|
@@PACK_RECEIVED
@@PACK_SENT |
Returns the number of input packets read from the network by SQL Server since last started.
Returns the number of output packets written to the network by SQL Server since last started. |
|
@@PACKET_ERRORS |
Returns the number of network packet errors that have occurred on SQL Server connections since SQL Server was last started. |
|
@@TOTAL_READ
@@TOTAL_WRITE |
Returns the number of disk reads (not cache reads) by SQL Server since last started.
Returns the number of disk writes by SQL Server since last started. |
|
@@TOTAL_ERRORS |
Returns the number of disk read/write errors encountered by SQL Server since last started. |
- Technically the descriptions given in BOL are correct although in practice the descriptions are incomplete.
- There is a system stored procedure sp_monitor which works in conjunction with a system table spt_monitor , both can be found within the master database, the values returned by the statistical functions correctly return the values since the last execution of the stored procedure OR the start of the SQL Server service which ever was the most recent.
- Knowing this fact we can extract the statistics for our own use by running sp_monitor at regular intervals.
- Sp_monitor is a documented system procedure, examination of the procedure shows that it uses the table spt_monitor ( which is not documented )
- A typical result set from this table is shown below
|
lastrun |
cpu_busy |
io_busy |
idle |
pack_received |
pack_sent |
connections |
pack_errors |
total_read |
total_write |
total_errors |
|
2006-01-25 09:01:03.703 |
22852929 |
735107 |
137159340 |
113903076 |
121546762 |
931224 |
532 |
9254818 |
64512368 |
0 |
2. Overview of collection Process
· To facilitate the collection process the functionality is handled by SQL Server in a user database.
· Performance data is extracted with stored procedures
· Performance data is stored in tables
· Collection is achieved using the Agent Job Scheduler
A snapshot of the statistical data is taken at regular intervals ( every hour ), the average value(s) for that period is then calculated and written to a table.
The disadvantage of this approach is that it does not provide any granularity in as much as any extremes will be masked.
The advantage is that the number of data sets is manageable and the collection has a minimal impact upon the server, if at all.
Once a day the daily data is rolled up into an average for the day, this allows for simple day to day comparison.
Thus the data collected is adequate for viewing patterns and trends but less adequate for detailed analysis.

Overview of hourly statistic gathering job.
( The database name ServerDBA is used as an indicator to show object placement )

3. Counters collected
|
Counter / Column |
Value |
Description |
|
TheDate |
Datetime – timestamp of table row entry |
Date time. The values for the counters represent the average for the hour ending shown in this column |
|
CacheHitRatio |
%age value (average) for the time period |
This is the total ratio for what is normally called the procedure cache.
Actually the cache holds plans for more than just procedures. ( see table 6 ) ( vii ) |
|
cpu_busy |
Milliseconds total for the time period
( i ) |
This is a measure of cpu use and indicates the ms of use in the hour period |
|
io_busy |
Milliseconds total for the time period
( i ) |
This is a measure of i/o use and indicates the ms of use in the hour period |
|
Logins |
Number of connected users (average) for the time period |
The number of persisted connection to the server. Each connection typically takes resource of around 48k memory. Connection memory cannot come from extended memory and thus unnecessary persisted connections can decrease the available memory for other processes. |
|
Connections |
Login attempts per second (average) for the time period |
The rate of connections and connection attempts. Where connections to sql server are using pooling if this figure is high then pooling may not be configured correctly |
|
PacketsReceived |
Packets received per second (average) for the time period |
Measure of network activity |
|
PacketsSent |
Packets sent per second (average) for the time period |
Measure of network activity |
|
PacketErrors |
Total of packet errors for the time period |
Should be nil |
|
Reads |
Reads per second (average) for the time period |
Count of page reads executed by SQL Server. Indication of server activity. |
|
Writes |
Writes per second (average) for the time period |
Count of page writes executed by SQL Server. Indication of server activity. |
|
RWErrors |
Total Number of read or write errors for the time period |
Should be nil |
|
TotalTransactions |
Transactions per second (average) for the entire server for the time period |
Measure of server activity |
|
UserDB1_Transactions |
Transactions per second (average) for the database for the time period |
Measure of database activity |
|
UserDB2_Transactions |
Transactions per second (average) for the database for the time period |
Measure of database activity |
|
UserDB2_Transactions |
Transactions per second (average) for the database for the time period |
Measure of database activity |
|
UserDB2_Transactions |
Transactions per second (average) for the database for the time period |
Measure of database activity |
|
TempdbTransactions |
Transactions per second (average) for the database for the time period |
Measure of tempdb activity. Tempdb is the working area for SQL Server. High levels of transactions in tempdb relative to the overall server figure may be of interest ( ii ) |
|
FullScans |
Number of unrestricted full scans per second (average) for the time period |
Number of unrestricted full scans per second. These can be either base-table or full-index scans. ( iii )
This figure should be low |
|
IndexSearches |
Number of index lookups per second (average) for the time period |
This counter shows the number of index searches SQL Server is performing. SQL Server uses index searches to start range scans, to retrieve a specific record by using one index, and to reposition within an index. |
|
PageSplits |
Number of page splits per second (average) when an index page overflows for the time period |
This figure should be low.
( iv ) |
|
LockRequests |
The total number of lock requests per second (average) for the time period |
Locks have to exist within lower memory, excessive lock requests may result in memory issues, likewise other factors may reduce the available memory for locks
If locks become an area of concern greater granularity can be obtained by measuring lock types ( v ) |
|
LockTimeouts |
The number of times per second (average) that a lock could not be obtained when requested for the time period |
Excessive lock timeouts may indicate application problems, memory pressures and/or excessive number of requests ( v ) |
|
Deadlocks |
The total number of deadlocks that occurred during the monitored time period |
This counter should be zero |
|
LatchWaits |
The number of waits per second (average) on latch requests. |
Latches are lightweight “locks” used to aid performance. This figure should be low. A high figure would indicate resource issues, probably memory. |
|
LockWaits |
The number of times per second (average) that a thread had to wait for a lock during the monitored time period |
This value should be low ( v ) if this counter is high it may indicate performance issues. |
|
Compilations |
The number of times per second (average) that sql compilations occur for the monitored time period |
Used as a measure of work |
|
WorkFiles |
The number of workfiles, such as created internally during query execution, per second (average) for the monitored time period. |
Used as a measure of work
( vi ) |
|
WorkTables |
The number of worktables, such as created internally during sorts, unions, groupings, per second (average) for the monitored time period. |
|