General monitoring for SQL Server Analysis Services using Performance Monitor

A recent customer engagement required a setup of a monitoring solution for SSAS, due to the time restrictions placed upon this, native Windows Performance Monitor (Perfmon) and SQL Server Profiler Monitoring Tools was used as using a third party tool would have meant the customer providing an additional monitoring server that was not available.

I wanted to outline the performance monitoring counters that was used to monitor the system on which SSAS was running. Due to the slow query performance that was occurring during certain scenarios, perfmon was used to establish if any pressure was being placed on the Disk, CPU or Memory subsystem when concurrent connections access the same query, and Profiler to pinpoint how the query was being managed within SSAS, profiler I will leave for another blog.

 

This guide is not designed to provide a definitive list of what should be used when monitoring SSAS, different situations may require the addition or removal of counters as presented by the situation. However I hope that it serves as a good basis for starting your monitoring of SSAS. I would also like to acknowledge Chris Webb’s awesome chapters from “Expert Cube Development” that also helped shape my monitoring strategy:

 

http://cwebbbi.spaces.live.com/blog/cns!7B84B0F2C239489A!6657.entry

 

Simulating Connections

 

To simulate the additional connections to the SSAS server whilst monitoring, I used ascmd to simulate multiple connections to the typical and worse performing queries that were identified by the customer. A similar sript can be downloaded from codeplex at http://www.codeplex.com/SQLSrvAnalysisSrvcs.     File name: ASCMD_StressTestingScripts.zip. Performance Monitor

 

Within performance monitor,  a counter log was created that contained the list of counters below. The important point to note when running the counter log is that the RUN AS property within the counter log properties should be changed to an account that has rights to the SSAS instance when monitoring MSAS counters. Failure to do so means that the counter log runs under the system account, no errors or warning are given while running the counter log, and it is not until you need to view the MSAS counters that they will not be displayed if run under the default account that has no right to SSAS. If your connection simulation takes hours, this could prove quite frustrating if not done beforehand J

 

The counters used…… 
Object Counter Instance Justification
System Processor Queue legnth N/A Indicates how many threads are waiting for execution against the processor. If this counter is consistently higher than around 5 when processor utilization approaches 100%, then this is a good indication that there is more work (active threads) available (ready for execution) than the machine's processors are able to handle.
System Context Switches/sec N/A Measures how frequently the processor has to switch from user- to kernel-mode to handle a request from a thread running in user mode. The heavier the workload running on your machine, the higher this counter will generally be, but over long term the value of this counter should remain fairly constant. If this counter suddenly starts increasing however, it may be an indicating of a malfunctioning device, especially if the Processor\Interrupts/sec\(_Total) counter on your machine shows a similar unexplained increase
Process % Processor Time sqlservr Definately should be used if Processor\% Processor Time\(_Total) is maxing at 100% to assess the effect of the SQL Server process on the processor
Process % Processor Time msmdsrv Definately should be used if Processor\% Processor Time\(_Total) is maxing at 100% to assess the effect of the SQL Server process on the processor
Process Working Set sqlservr If the Memory\Available bytes counter is decreaing this counter can be run to indicate if the process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault.
Process Working Set msmdsrv If the Memory\Available bytes counter is decreaing this counter can be run to indicate if the process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault.
Processor % Processor Time _Total and individual cores measures the total utilization of your processor by all running processes. If multi-proc then be mindful only an average is provided
Processor % Privileged Time _Total To see how the OS is handling basic IO requests. If kernel mode utilization is high, your machine is likely underpowered as it's too busy handling basic OS housekeeping functions to be able to effectively run other applications.
Processor % User Time _Total To see how the applications is interacting from a processor perspective, a high percentage utilisation determine that the server is dealing with too many apps and may require increasing thje hardware or scaling out
Processor Interrupts/sec _Total  The average rate, in incidents per second, at which the processor received and serviced hardware interrupts. Shoulr be consistant over time but a sudden unexplained increase could indicate a device malfunction which can be confirmed using the System\Context Switches/sec counter
Memory Pages/sec N/A Indicates the rate at which pages are read from or written to disk to resolve hard page faults. This counter is a primary indicator of the kinds of faults that cause system-wide delays, this is the primary counter to watch for indication of possible insufficient RAM to meet your server's needs. A good idea here is to configure a perfmon alert that triggers when the number of pages per second exceeds 50 per paging disk on your system. May also want to see the configuration of the page file on the Server
Memory Available Mbytes N/A is the amount of physical memory, in bytes, available to processes running on the computer. if this counter is greater than 10% of the actual RAM in your machine then you probably have more than enough RAM. monitor it regularly to see if any downward trend develops, and set an alert to trigger if it drops below 2% of the installed RAM.
Physical Disk Disk Transfers/sec for each physical disk If it goes above 10 disk I/Os per second then you've got poor response time for your disk.
Physical Disk Idle Time _total If Disk Transfers/sec is above  25 disk I/Os per second use this counter. which measures the percent time that your hard disk is idle during the measurement interval, and if you see this counter fall below 20% then you've likely got read/write requests queuing up for your disk which is unable to service these requests in a timely fashion.
Physical Disk Disk queue legnth For the OLAP and SQL physical disk A value that is consistently less than 2 means that the disk system is handling the IO requests against the physical disk
Network Interface Bytes Total/sec For the NIC Should be monitored over a period of time to see if there is anb increase/decrease in network utilisation
Network Interface Current Bandwidth For the NIC is an estimate of the current bandwidth of the network interface in bits per second (BPS).
MSAS 2005: Memory Memory Limit High KB N/A Shows (as a percentage) the high memory limit configured for SSAS in C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini
MSAS 2005: Memory Memory Limit Low KB N/A Shows (as a percentage) the low memory limit configured for SSAS in C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini
MSAS 2005: Memory Memory Usage KB N/A Displays the memory usage of the server process.
MSAS 2005: Memory File Store KB N/A Displays the amount of memory that is reserved for the Cache. Note if total memory limit in the msmdsrv.ini is set to 0, no memory is reserved for the cache
MSAS 2005: Storage Engine Query Queries from Cache Direct / sec N/A Displays the rate of queries answered from the cache directly
MSAS 2005: Storage Engine Query Queries from Cache Filtered / Sec N/A Displays the Rate of queries answered by filtering existing cache entry.
MSAS 2005: Storage Engine Query Queries from File / Sec N/A Displays the Rate of queries answered from files.
MSAS 2005: Storage Engine Query Average time /query N/A Displays the average time of a query
MSAS 2005: Connection Current connections N/A Displays the number of connections against the SSAS instance
MSAS 2005: Connection Requests / sec N/A Displays the rate of query requests per second
MSAS 2005: Locks Current Lock Waits N/A Displays thhe number of connections waiting on a lock
MSAS 2005: Threads Query Pool job queue Length N/A The number of queries in the job queue
MSAS 2005:Proc Aggregations Temp file bytes written/sec N/A Shows the number of bytes of data processed in a temporary file
MSAS 2005:Proc Aggregations Temp file rows written/sec N/A Shows the number of bytes of data processed in a temporary file
 
Published Thursday, May 27, 2010 11:15 AM by Testas

Comments

No Comments