January 2008 - Posts

Things I learned today

I'm involved with some tuning as usually tends to be the case and amongst the work I've been doing a couple of interesting points arose.

Firstly we upgraded the memory in the production cluster, what was unusual for me was that we added more memory than we had databases, I've never worked a production server that way before so it was with some interest to monitor memory usage, what has developed so far is that memory usage has climbed to about 1.6 times the data size of the production database. I'd always wondered about being able to sit a database in cache; we've obviously seen a dramatic drop in io and it appears there's also a drop in cpu - there's more time for monitoring to continue, it's a trended server so we have over a year of performance data collected so later on the comaprisions will be interesting.

Secondly, I've been looking at "worst" procs, this is a SQL 2000 server and the top worst report by io was using between 700k and 1 million io. The query itself is an 18 table join unioned to an 18 table join, when I extracted the plan almost every table was on the receiving end of a table/clustered index scan and there were no seeks at all. After eliminating parameter sniffing as a possible issue I started on the highest io table and added some indexes in an effort to remove the table scan ( 171k io ), however I didn't seem to be making much progress so i figured to port the database and query to SQL 2005, run the query a few times and make use of the index dmv's to give me a few clues.

I noticed after initial compilation the proc returned the result set ( 41 rows ) instantaneously, however the 2005 box is a bit more powerful and not getting the same usage as the 2000 dev/test box I was using so after a few calls I had a look at suggested indexes and to my surprise there was only one small index reported as missing.

Next stop was to examine the query plan and output the stats io to compare table io; to my surprise all my scans were now seeks and fair enough I had only one scan and this was the table shown to have a missing index. I couldn't be bothered to add all the io together but approx best guess probably puts it at around 100k or so ( there's some more changes to be made but I was looking for a quick win ).

Well we know the 2005 optimiser has been rewritten but it has to be said this was an absolute total shock, I'll be dropping in a SQL 2008 build and comparing here too, just for interest's sake.

Oh and before I get any comments, yes I'd updated stats, indexes and such on both databases and it was Enterprise 2000 vs Std 2005.

Perfmon Counters thresholds

Just in case you missed the link to The Premier's blog http://sqlblogcasts.com/blogs/thepremiers/archive/2008/01/23/monitoring-sql-server-using-performance-monitor.aspx  they've just posted details of a perfmon trace viewer. But what is more important is the attached documentation on perfmon counters, you can find as an attachment to the post, " Performance Counter Guidance.zip "

If you download this you'll find four documents which cover SQL Server, SSRS, SSAS and Windows Server, as well as listing the actual counters to monitor are explanations of likely values and causes. Important here is the coverage for SSRS and SSAS, I've been looking into the overall performance tuning and trending of SSRS and there's not a lot of info on tuning that I can see. It's worth pointing out that the samples which come with SQL 2005 install CD  include a log analysis application - or to be more precise a set of tables, a SSIS package to extract data from SSRS and some reports to interpret the data, a performance dashboard for SSRS. I've installed this and it does give excellent information and is a good starting point for starting to produce management reports on SSRS usage and performance.

I've not had cause to tune SQL 2005 Analysis Services, yet, but if you're using SSAS then I suggest you have a read though the document. There's not been too much for SQL Server 2005 on the microsoft blogs of late so it's a nice suprise to find such a gem. Thanks guys.

Tuning SQL Server configurations

One place to sometimes view configuration for SQL Server is the full disclosure documents for TPC benchmarks, yes I know I should get out more, however, it does sometimes make for interesting reading on exactly how some fantastic throughput is achieved in a test environment.

I'm still researching x64 configs, especially for windows 2003 rather than SQL Server and I just happened to find this KB article. Now I'm not suggesting we all rush out and implement any of this, but, it does make for interesting reading and maybe indicates there's more to tweaking SQL performance than meets the eye.

Were any of these implemented for the TPC benchmark I viewed, yes -T834 , and yes it was a x64 SQL Ent with 65GB ram with 8 cores. Makes you think!

As part of the disclosure there is an output of sp_configure - changes I could see ( the output was unformatted ! ) 

  • min memory per query (KB) = 512kb
  • awe enabled = 1 ( ! )
  • lightweight pooling = 1
  • max server memory (MB) = 63488
  • min server memory (MB) = 61440
  • max worker threads = 1024
  • priority boost = 1 ( ! )
  • recovery interval (min) = 32767
  • network packet size (B) = 32767

http://support.microsoft.com/kb/920093

http://msdn2.microsoft.com/en-us/library/aa366720.aspx

 

Configuring Windows 2003 ( x64 ) for SQL Server

There is an almost constant stream of posts on forums asking about configuration of SQL Server 2005 memory, especially for x64 editions. Alongside these are also a number of posts concerning performance or perceived performance issues with x64 SQL Server.

 

Now I don’t claim to have all the answers but what does concern me is the danger of the adoption of misinformation. I’m using these as pure illustrations, not saying that anything written or posted is wrong or otherwise, and I include myself here but, just because Tony or Simon post something doesn’t automatically make it gospel and on forums it’s far worse with many incorrect posts from people with strings of letters and fancy job titles after their name; it’s tempting to believe these posts have to be right,  my worry is that the seed of conjecture and misinformation becomes truth, somewhat like an urban myth.

 

So what’s all this about then – well x64 config really, I recently read a post where a respected SQL person suggested you could allocate more memory to sql server on a  x64 box because the o/s had a smaller footprint, e.g. maybe 62gb out of 64gb, so we’re talking serious hardware here.

 

For the last few months I’ve been attempting to understand in greater depth those aspects of the o/s and how they relate to SQL Server on x64 platform and exactly what does what including o/s patches which affect SQL Server.

 

I’m now of the opinion that there should be a dialog between the SQL teams and the Windows Server teams and that aspects of the o/s which may affect SQL Server should be brought to the fore. Yes I know we should all have the latest patches applied but I find many organisations unwilling to apply o/s patches for all manner of reasons.

 

So let’s consider memory first; x64 with 32gb ram, we can allocate 2 to 4 gb to the o/s and give the rest to SQL Server, or this is what we believe? Well not really.

 Let’s say this is a 4 way dual core box, windows will allocate 576 work threads, each of which requires 2mb of memory. In the illustration I viewed the calculations go thus:-
  • Let's leave 2GB for the OS and other applications: 2GB
  • And let's allocate 2GB for the MPA / Thread Stacks / Linked Servers etc: 2GB
  • And finally, let's reserve 3GB for all the other applications on the server (AV, backup etc): 3GB
  • So now our max server memory setting is: 32-2-2-3 = 25GB
  • Remember - this is a baseline calculation

The average Enterprise SQL Server will likely have fibre channel HBAs, be SAN attached, probably have teamed / load balanced HBAs and NICs, will have a suite of hardware monitoring programs, will have an Antivirus, will be running a monitoring program such as Tivoli or MOM and will probably have other agents and services running as part of your standard build. You may well have IIS, reporting services, SSIS, old dts, full text. Do you really believe these can all run within 2gb – including the operating system?

 

Btw. Multi page allocations exist outside SQL Server memory, also known as memtoleave in x32, if you’ve never run into this I recommend “SQL Internals” by Ken Henderson – a marvelous read.

 

Did I mention System Cache, under advanced settings are a number of options, typically system cache will be just under 1gb if you select it or 512mb if you don’t ( x32 ); except on x64 it can be 1TB.

 

We know we should enable lock pages in memory, this stops help reduce sql server memory being paged out, except you can’t do this for x64 STD edition.

 Well my research states that essentially awe locks pages in memory, pages enabled with awe may not be paged out – so to ensure your sql server memory remains for SQL Server enable awe in standard edition.

However, it is also stated that awe is not enabled in x64, even if you select it, and my basic tests seem to confirm this.

 Here’s a direct quote:-SQL Server 2005 does respond to memory pressure both when lock pages in memory is enabled and disabled. However there is a caveat that you need to be aware of - when OS hits memory pressure, depending on the state of the machine and applications,  it could be really slow - attempting to free up the resources. This means that SQL Server might not get enough CPU cycles to respond to memory pressure quickly and hence causing  other applications and possibly OS to return OOM errors. In case when lock pages in memory is not enabled it is possible that OS will page out SQL Server. This situation might result in inappropriate SQL Server performance.And another quote:-“The AWE functions exist on all editions of Windows and are usable regardless of how much physical memory a system has. However, AWE is most useful on systems with more than 2 GB of physical memory because it's the only way for a 32-bit process to directly use more than 2 GB of memory. Another use is for security purposes: because AWE memory is never paged out, the data in AWE memory could never have a copy in the paging file that someone could examine by rebooting into an alternate operating system.Finally, there are some restrictions on memory allocated and mapped by the AWE functions:·         Pages can't be shared between processes.·         The same physical page can't be mapped to more than one virtual address in the same process.·         On older versions of Windows, page protection is limited to read/write. In Windows Server 2003 Service Pack 1 and later, no access and read-only are supported.”

How can you check your memory? Well not with task manager, it seems, available memory may include memory allocated within o/s pools so this figure doesn’t actually mean this memory is not being used. The perfmon counter available Mb should be the counter watched, this should never drop below 200Mb ( the actual phrase used was "a couple of hundred Mb" )

 

Just in case you’ve got yourself a new 4 way quad core with HT and are rejoicing at 32 procs, SQL Server will have allocated even more worker threads and used even more memory. Just as an aside to this it seems windows can only support 64 processor threads, I don’t fully follow this part but it seems 64 procs was considered more than adequate but with quad core and HT you can reach this very quickly and give yourself major issues – but that’s another matter altogether.

 

So I think it’s fair to say that as DBAs we should allocate more memory for “other” processes, especially with x64.

 

I did also discover that enabling the 3GB switch in x32 reduces available memory and will likely increase paging as the working set is reduced, talking of which there is an o/s bug which can reset memory working sets for applications ( SQL Server ) when you use RDP to connect to a server, which I imagine every DBA probably does, nice, and did I mention the bug which causes problems with file copies? Oh and there’s the o/s schedulers bug too. ( all listed below )

 

Allocation of Processor Resources, I'm sure everyone knows this one but it should be set for Background Services.This setting alters the balance of cpu prioritisation between foreground and background.When set for Programs the foreground applications get processor priority, it is better to allow equal priority for all applications and their processes.

 

Pagefile sizing etc. The tuning of the page file is an interesting topic, I will dedicate an entire blog post to this later, let’s just say that placing your page file on a disk ( partitioned or not ) that shares with the o/s is probably a very bad idea.

 

You should monitor process handles as a matter of course, handles take resource and a rule of thumb apparently is that over 4,000, for a single process, is a big danger sign. I’ll also look into this further.

 

I’ve obviously skipped and jumped about over a number of areas and not covered any in depth, I’m hoping to perhaps provoke some dialog on x64 architecture because it has some challenges and I’d like to have some definitive answers as I’m a strong supporter of the platform.

 

I’ll just end with a quick plug for www.grumpyolddba.co.uk which will also have this and other information.

 

Listed below, links and references to sources used;

 

Microsoft® Windows® Internals, Fourth Edition: Microsoft Windows Server™ 2003, Windows XP, and Windows 2000

 

Tuning Windows Server 2003 on IBM System x Servers.

 The sizes of the working sets of all the processes in a console session may be trimmed when you use Terminal Services to log on to or log off from a computer that is running Windows Server 2003max worker threads OptionHow to reduce paging of buffer pool memory in the 64-bit version of SQL Server 2005You may experience a decrease in overall system performance when you are copying files that are larger than approximately 500 MB in Windows Server 2003 SP1 or in Windows Server 2003 SP2About Cache ManagerDoes SQL Server always respond to memory pressure?The Guru's Guide to SQL Server Architecture and InternalsSQL Server and the Working SetDo I have to assign the Lock Pages in Memory privilege for Local System

Memory Management 101

SQL Server 2000 or SQL Server 2005 may temporarily stop responding on a Windows Server 2003 Service Pack 1-based computer