Build Your Own Cluster

In case you've all missed this you can now build a real cluster with just three machines

http://blogs.msdn.com/clustering/archive/2009/05/14/9613924.aspx

Windows 2008 Storage Server is now available for msdn and technet download, as the man says now's there's no excuse to not have your own cluster.

Posted by GrumpyOldDBA with no comments

Storage – The Final Frontier

This is the presentation for the UK SSUG that I gave the other night. The link below is a pdf of all the slides and the script / notes that I wrote to go with each slide.  I’ve not edited this for style or anything – this is as I worked out what I’d say so some of it may not make complete grammatical sense.

Please feel free to download.

I hope to add some posts to clarify some of the sections, I’ve been working on one about IOPS for a while.

Storage the final frontier pdf

Posted by GrumpyOldDBA with no comments
Filed under: , ,

Parallel Worlds or slipped reality?

I’ve been planning a post on the subject of parallelism for some considerable time but the inability to be able to present reproducible tests for illustration had been a major concern. Things change however, I picked up on an interesting post  http://sqlblog.com/blogs/aaron_bertrand/archive/2009/03/21/six-reasons-you-should-be-nervous-about-parallelism.aspx

I hope you’ll grant me a quick rant here < grin >, I spend much of my time involved in the performance tuning of databases and something I frequently encounter are questionable settings applied to servers and databases based upon a blog post or forum post. The internet is a wonderful source of information but it is also a wonderful source of misinformation and misguided best intentions. It was my view that there was a danger this post could be4 taken as a reason to disable parallelism but it was within the run of comments that surprised me.

Over the last few years I have been involved with extensive testing of migrations of sql 2000 to sql 2005 / 2008 for several clients, this involved parallelism as these boxes ranged from 8 cores / processors to 32 cores, In fact I’ve always used multiple cpu servers since the Pentium Pro. I had commented that my testing had seen performance degradation of up to 600% where  parallelism had been disabled, this had produced a criticism that I had not defined if this was an OLTP or DSS system.

OK, so the obvious reaction is “What the *!!!* ! “ What on earth does a classroom definition of databases have to do with anything, performance degradation is performance degradation.  There was also mention of the TPC benchmarks, again another classroom exercise that has little to do with real world. I give an example, here in the UK every car has to have fuel consumption figures published, my wife was disappointed to discover her new car could only manage 18 miles to the gallon despite the handbook suggesting at least 26. My own car at best only manages 60% of the manufacturer’s suggestion. Why is this? Because the tests do not reflect real world driving, yes they give give a way to compare car to car, just like the TPC benchmarks allow us to compare a server configuration – but real world – not a chance. ( And yes I have read many of the full disclosure documents for TPC benchmarks, I also read storage equivalents. )

So what type of applications have I worked with, well without being too specific Accounting, CRM, ERP, Treasury Systems, Loan Originations, online loan applications, price comparison, call centre applications, loan broking, Billing systems and a few Data Warehouse and OLAP solutions. Do I find the same “ issues “ with them all – basically yes ( I’m not insinuating all my clients’ systems were problematic btw. ) I can’t say I’ve ever thought “Oh OLTP – can’t do parallelism here”

As you might guess I hate sweeping generalisations and comments like “ I always xxxx on a yyy system “ just indicates a closed mind and tunnel vision to me. I like to think I approach every situation with an open mind to all things SQL Server, I’ve encountered enough interesting aspects of tuning to fill a book, sadly of course you can’t take client databases and data to make examples so the book will never get written.

Now to the serious side of parallelism settings, it’s a server wide setting so it affects every database on a server when you make a change, so altering this can have wide ranging effects. Although I’ve never disabled parallelism and then used maxdop = n hints to enable it for specific queries, I understand that the optimiser doesn’t actually take a lot of notice and will split the query across all available cores, so on a 16 core box it will split a query with a maxdop = 2  across 16, not what you’d actually want. As I say I’ve never tried this and currently i don’t have any 16 core or greater boxes to use for such a test.

One of the points about Servers is that Moore’s Law no longer applies, my 6 year old lap top has a 2.8 GHz cpu, actually faster than many of the multi core servers I use today. True the modern cpu has more cache, bus speed is faster, so is memory, but bottom line we are not getting faster core cpu speed – the future is parallel processing and I understand Microsoft are indeed working on the improvements to in memory parallel processing of queries within SQL Server.

I tend to work with reasonable size databases, I guess that it’s no surprise databases are getting bigger, I’m used to tables with tens and hundreds of millions of rows and I expect to reach billion(s) very soon. Partitioning or no partitioning,  even a partitioned billion row table is likely to have hundreds of millions of rows per partition, you need parallel plans to query these types of tables if you want decent performance.

And while I’m having a rant – multiple concurrent connections – yeah – I’ve been there too on moderate thousand user systems, ever actually monitored concurrent running processes on your SQL Server? Well I have – down to 1 second granularity – it can be truly amazing how few queries actually run at the same time. Hyperthreading – how many DBA’s actually know what the problem is/was with HT – I can’t say that in interviews I’ve found many that do ( I’ll not count thread starvation which is a real issue with multiple cores and parallelism or how many workers sql server allocates and how much memory they require on an x64 system, and we have 6 core cpus , now that’s a concern )

CXPacket waits – well any parallel query will tend to show CXPacket as the last wait, but it’s understanding if you have blocking threads, again not many DBAs I’ve interviewed even have a brief working knowledge of waits, not that I claim to have all the answers there!

So what I’m really saying is that I’m worried this post will be picked up as a reason to disable parallelism, because many readers won’t fully understand and some of the comments are quite seriously worrying. A very interesting discussion none the less. Probably earned my Grumpy status today!

How to determine that hardware DEP is available and configured on your computer

Well here we are again with another post I’d been saving up until I resolved the images issue. DEP and how it affects memory on your SQL Server!!!

Microsoft KB: 912923    MICROSOFT KB: 875352

=======================================================================================

Notes on these KB:

  • I found there to be some confusion concerning DEP, when it’s enabled you do not need the /PAE switch to enable awe memory ( x32 ); you don’t need this switch with x64 at all;
  • What this means is that in certain circumstances the memory can all be taken by SQL Server, my tests showed it would leave 128MB for the o/s, so it is very important to always set the max memory setting for SQL Server to avoid an unresponsive server.
  • I’ve added some screen shots to this KB to clarify, other than that it is of course exactly as published.

“To use these processor features, the processor must be running in Physical Address Extension (PAE) mode. However, Windows will automatically enable PAE mode to support DEP. Users do not have to separately enable PAE by using the /PAE boot switch.
Note Because 64-bit kernels are Address Windowing Extensions (AWE) aware, there is not a separate PAE kernel in 64-bit versions of Windows.” - From KB 875352

====================================================================================

v INTRODUCTION

v MORE INFORMATION

v Requirements for using hardware-enforced DEP

v How to confirm that hardware DEP is working in Windows

v Method 1: Use the Wmic command-line tool

v Method 2: Use the graphical user interface

v Notes

INTRODUCTION

  • Data Execution Prevention (DEP) is a set of hardware and software technologies that perform additional checks on memory to help protect against malicious code exploits.
  • Hardware-enforced DEP marks all memory locations in a process as non-executable unless the location explicitly contains executable code. One kind of malicious code attacks tries to insert and run code from non-executable memory locations. DEP helps prevent these attacks by intercepting them and raising an exception.
  • This article describes the requirements for using hardware-enforced DEP. This article also describes how to confirm that hardware DEP is working in Windows.

MORE INFORMATION

Requirements for using hardware-enforced DEP

To use hardware-enforced DEP, you must meet all the following conditions:

1. The computer's processor must support hardware-enforced DEP.

Many recent processors support hardware-enforced DEP. Both Advanced Micro Devices (AMD) and Intel Corporation have defined and shipped Windows-compatible architectures that are compatible with DEP. This processor support may be known as NX (no-execute) or XD (execute disable) technology. To determine whether your computer's processor supports hardware-enforced DEP, contact the manufacturer of your computer.

2. Hardware-enforced DEP must be enabled in the BIOS.

On some computers, you can disable processor support for hardware-enforced DEP in the BIOS. You must not disable this support. Depending on your computer manufacturer, the option to disable this support may be labeled "Data Execution Prevention," "XD," "Execute Disable," or "NX."

3. The computer must have Windows XP with Service Pack 2 or Windows Server 2003 with Service Pack 1 installed.

Note Both 32-bit versions and 64-bit versions of Windows support hardware-enforced DEP. Windows XP Media Center Edition 2005 and Microsoft Windows XP Tablet PC Edition 2005 include all the features and components of Windows XP SP2.

4. Hardware-enforced DEP must be enabled for programs on the computer.

In 64-bit versions of Windows, hardware-enforced DEP is always enabled for 64-bit native programs. However, depending on your configuration, hardware-enforced DEP may be disabled for 32-bit programs.

For more information about DEP, about DEP configuration, and for a list of Windows Operating Systems that support hardware DEP, see the following article which follows on from this KB 875352 ( below )

“  (KB 875352  A detailed description of the Data Execution Prevention (DEP) feature in Windows XP Service Pack 2, Windows XP Tablet PC Edition 2005, and Windows Server 2003 “

How to confirm that hardware DEP is working in Windows

  • To confirm that hardware DEP is working in Windows, use one of the following methods.

Method 1: Use the Wmic command-line tool

  • You can use the Wmic command-line tool to examine the DEP settings. To determine whether hardware-enforced DEP is available, follow these steps:

1. Click Start, click Run, type cmd in the Open box, and then click OK.

2. At the command prompt, type the following command, and then press ENTER:

wmic OS Get DataExecutionPrevention_Available

If the output is "TRUE," hardware-enforced DEP is available.

To determine the current DEP support policy, follow these steps.

1. Click Start, click Run, type cmd in the Open box, and then click OK.

2. At the command prompt, type the following command, and then press ENTER:

wmic OS Get DataExecutionPrevention_SupportPolicy

** You have to logged on as an administrator to run this otherwise you get this message.

 

v The value returned will be 0, 1, 2 or 3. This value corresponds to one of the DEP support policies that are described in the following table.

 

DataExecutionPrevention_SupportPolicy property value

Policy Level

Description

2

OptIn (default configuration)

Only Windows system components and services have DEP applied

3

OptOut

DEP is enabled for all processes. Administrators can manually create a list of specific applications which do not have DEP applied

1

AlwaysOn

DEP is enabled for all processes

0

AlwaysOff

DEP is not enabled for any processes

  • Note To verify that Windows is running with hardware DEP enabled, examine the DataExecutionPrevention_Drivers property of the Win32_OperatingSystem class.
  • In some system configurations, hardware DEP may be disabled by using the /nopae or /execute switches in the Boot.ini file. To examine this property, type the following command at a command prompt:

wmic OS Get DataExecutionPrevention_Drivers

  • Take great care if you intend to examine boot.ini
  • This file is normally hidden so you will need to alter folder options to be able to see it.

Method 2: Use the graphical user interface

v To use the graphical user interface to determine whether DEP is available, follow these steps:

1. Click Start, click Run, type wbemtest in the Open box, and then click OK.

2. In the Windows Management Instrumentation Tester dialog box, click Connect.

3. In the box at the top of the Connect dialog box, type root\cimv2, and then click Connect.

4. Click Enum Instances.

5. In the Class Info dialog box, type Win32_OperatingSystem in the Enter superclass name box, and then click OK.

6. In the Query Result dialog box, double-click the top item.

Note This item starts with "Win32_OperatingSystem.Name=Microsoft..."

7. In the Object editor dialog box, locate the DataExecutionPrevention_Available property in the Properties area.

8. Double-click DataExecutionPrevention_Available.

9. In the Property Editor dialog box, note the value in the Value box.

If the value is TRUE, hardware DEP is available.

  • Screenshot of the GUI

Notes

  • To determine the mode in which DEP is running, examine the DataExecutionPrevention_SupportPolicy property of the Win32_OperatingSystem class. The table at the end of Method 1 describes each support policy value.
  • To verify that hardware DEP is enabled in Windows, examine the DataExecutionPrevention_Drivers property of the Win32_OperatingSystem class. In some system configurations, hardware DEP may be disabled by using the /nopae or /execute switches in the Boot.ini file.

The third-party products that this article discusses are manufactured by companies that are independent of Microsoft. Microsoft makes no warranty, implied or otherwise, regarding the performance or reliability of these products.


See KB /912923 for affected systems.

See KB 875352 for affected systems.

A detailed description of the Data Execution Prevention (DEP) feature in

Windows XP Service Pack 2, Windows XP Tablet PC Edition 2005, and Windows Server 2003

KB: 875352

  • · SUMMARY
  • · INTRODUCTION
  • · MORE INFORMATION
  • · Hardware-enforced DEP
  • · Software-enforced DEP
  • · Benefits
  • · System-wide configuration of DEP
  • · Per-program DEP configuration
  • · REFERENCES

SUMMARY

Data Execution Prevention (DEP) is a set of hardware and software technologies t...

Data Execution Prevention (DEP) is a set of hardware and software technologies that perform additional checks on memory to help prevent malicious code from running on a system. In Microsoft Windows XP Service Pack 2 (SP2) and Microsoft Windows XP Tablet PC Edition 2005, DEP is enforced by hardware and by software.
The primary benefit of DEP is to help prevent code execution from data pages. Typically, code is not executed from the default heap and the stack. Hardware-enforced DEP detects code that is running from these locations and raises an exception when execution occurs. Software-enforced DEP can help prevent malicious code from taking advantage of exception-handling mechanisms in Windows.

INTRODUCTION

This article describes the DEP feature in Windows XP SP2 and in Microsoft Window...

This article describes the DEP feature in Windows XP SP2 and in Microsoft Windows Server 2003 with Service Pack 1 (SP1) and discusses the following topics:

· Hardware-enforced DEP

· Software-enforced DEP

· Benefits

· System-wide configuration of DEP

· Per-program DEP configuration

MORE INFORMATION

Hardware-enforced DEPHardware-enforced DEP marks all memory locations in a proce...

Hardware-enforced DEP

Hardware-enforced DEP marks all memory locations in a process as non-executable unless the location explicitly contains executable code. A class of attacks exists that tries to insert and run code from non-executable memory locations. DEP helps prevent these attacks by intercepting them and raising an exception.
Hardware-enforced DEP relies on processor hardware to mark memory with an attribute that indicates that code should not be executed from that memory. DEP functions on a per-virtual memory page basis, and DEP typically changes a bit in the page table entry (PTE) to mark the memory page.
Processor architecture determines how DEP is implemented in hardware and how DEP marks the virtual memory page. However, processors that support hardware-enforced DEP can raise an exception when code is executed from a page that is marked with the appropriate attribute set.
Advanced Micro Devices (AMD) and Intel have defined and shipped Windows-compatible architectures that are compatible with DEP.
Beginning with Windows XP SP2, the 32-bit version of Windows uses one of the following:

· The no-execute page-protection (NX) processor feature as defined by AMD.

· The Execute Disable Bit (XD) feature as defined by Intel.

To use these processor features, the processor must be running in Physical Address Extension (PAE) mode. However, Windows will automatically enable PAE mode to support DEP. Users do not have to separately enable PAE by using the /PAE boot switch.
Note Because 64-bit kernels are Address Windowing Extensions (AWE) aware, there is not a separate PAE kernel in 64-bit versions of Windows.
For more information about PAE and AWE in Windows Server 2003, click the following article number to view the article in the Microsoft Knowledge Base:

283037 (http://support.microsoft.com/kb/283037/ ) Large memory support is available in Windows Server 2003 and in Windows 2000

Software-enforced DEP

An additional set of Data Execution Prevention security checks have been added to Windows XP SP2. These checks, known as software-enforced DEP, are designed to block malicious code that takes advantage of exception-handling mechanisms in Windows. Software-enforced DEP runs on any processor that can run Windows XP SP2. By default, software-enforced DEP helps protect only limited system binaries, regardless of the hardware-enforced DEP capabilities of the processor.

Benefits

The primary benefit of DEP is that it helps prevent code execution from data pages, such as the default heap pages, various stack pages, and memory pool pages. Typically, code is not executed from the default heap and the stack. Hardware-enforced DEP detects code that is running from these locations and raises an exception when execution occurs. If the exception is unhandled, the process will be stopped. Execution of code from protected memory in kernel mode causes a Stop error.
DEP can help block a class of security intrusions. Specifically, DEP can help block a malicious program in which a virus or other type of attack has injected a process with additional code and then tries to run the injected code. On a system with DEP, execution of the injected code causes an exception. Software-enforced DEP can help block programs that take advantage of exception-handling mechanisms in Windows.

System-wide configuration of DEP

DEP configuration for the system is controlled through switches in the Boot.ini file. If you are logged on as an administrator, you can now easily configure DEP settings by using the System dialog box in Control Panel.
Windows supports four system-wide configurations for both hardware-enforced and software-enforced DEP.

Configuration

Description

OptIn

This setting is the default configuration. On systems with processors that can implement hardware-enforced DEP, DEP is enabled by default for limited system binaries and programs that "opt-in." With this option, only Windows system binaries are covered by DEP by default.

OptOut

DEP is enabled by default for all processes. You can manually create a list of specific programs that do not have DEP applied by using the System dialog box in Control Panel. Information technology (IT) professionals can use the Application Compatibility Toolkit to "opt-out" one or more programs from DEP protection. System compatibility fixes, or shims, for DEP do take effect.

AlwaysOn

This setting provides full DEP coverage for the whole system. All processes always run with DEP applied. The exceptions list to exempt specific programs from DEP protection is not available. System compatibility fixes for DEP do not take effect. Programs that have been opted-out by using the Application Compatibility Toolkit run with DEP applied.

AlwaysOff

This setting does not provide any DEP coverage for any part of the system, regardless of hardware DEP support. The processor does not run in PAE mode unless the /PAE option is present in the Boot.ini file.

Hardware-enforced and software-enforced DEP are configured in the same manner. If the system-wide DEP policy is set to OptIn, the same Windows core binaries and programs will be protected by both hardware-enforced and software-enforced DEP. If the system cannot use hardware-enforced DEP, the Windows core binaries and programs will be protected only by software-enforced DEP.
Similarly, if the system-wide DEP policy is set to OptOut, programs that have been exempted from DEP protection will be exempted from both hardware-enforced and software-enforced DEP.
The Boot.ini file settings are as follows:

/noexecute=policy_level

Note policy_level is defined as AlwaysOn, AlwaysOff, OptIn, or OptOut.
Existing /noexecute settings in the Boot.ini file are not changed when Windows XP SP2 is installed. These settings are also not changed if a Windows operating system image is moved across computers with or without hardware-enforced DEP support.
During installation of Windows XP SP2 and Windows Server 2003 SP1 or later versions, the OptIn policy level is enabled by default unless a different policy level is specified in an unattended installation.

If the /noexecute=policy_level setting is not present in the Boot.ini file for a version of Windows that supports DEP, the behavior is the same as if the /noexecute=OptIn setting was included.
If you are logged on as an administrator, you can manually configure DEP to switch between the OptIn and OptOut policies by using the Data Execution Prevention tab in System Properties. The following procedure describes how to manually configure DEP on the computer:

1. Click Start, click Run, type sysdm.cpl, and then click OK.

2. On the Advanced tab, under Performance, click Settings.

3. On the Data Execution Prevention tab, use one of the following procedures:

    • Click Turn on DEP for essential Windows programs and services only to select the OptIn policy.
    • Click Turn on DEP for all programs and services except those I select to select the OptOut policy, and then click Add to add the programs that you do not want to use the DEP feature.

4. Click OK two times.

IT professionals can control system-wide DEP configuration by using a variety of methods. The Boot.ini file can be modified directly with scripting mechanisms or with the Bootcfg.exe tool that is included in Windows XP SP2.
To configure DEP to switch to the AlwaysOn policy by using the Boot.ini file, follow these steps:

1. Click Start, right-click My Computer, and then click Properties.

2. Click the Advanced tab, and then click Settings under the Startup and Recovery field.

3. In the System startup field, click Edit. The Boot.ini file opens in Notepad.

4. In Notepad, click Find on the Edit menu.

5. In the Find what box, type /noexecute, and then click Find Next.

6. In the Find dialog box, click Cancel.

7. Replace policy_level with AlwaysOn.
WARNING Make sure that you enter the text accurately. The Boot.ini file switch should now read:

/noexecute=AlwaysOn

8. In Notepad, click Save on the File menu.

9. Click OK two times.

10. Restart the computer.

For unattended installations of Windows XP SP2 or later versions, you can use the Unattend.txt file to pre-populate a specific DEP configuration. You can use the OSLoadOptionsVar entry in the [Data] section of the Unattend.txt file to specify a system-wide DEP configuration.

Per-program DEP configuration

For the purposes of program compatibility, you can selectively disable DEP for individual 32-bit programs when DEP is set to the OptOut policy level. To do this, use the Data Execution Prevention tab in System Properties to selectively disable DEP for a program. For IT professionals, a new program compatibility fix that is named DisableNX is included with Windows XP SP2. The DisableNX compatibility fix disables Data Execution Prevention for the program that the fix is applied to.
The DisableNX compatibility fix can be applied to a program by using the Application Compatibility Toolkit. For more information about Windows application compatibility, see Windows Application Compatibility on the following Microsoft

899298 (http://support.microsoft.com/kb/899298/ ) The "Understanding Data Execution Prevention" help topic incorrectly states the default setting for DEP in Windows Server 2003 Service Pack 1

x64 Memory Problems

(  I’m hopeful that Livewriter will enable me to publish a blog post which includes screen shots as this one has a large number! )

Please READ everything and the documentation with the download before you install this

I’ve blogged about memory issues before, and I’ve posted quite a few times on forums possible solutions to memory issues with SQL Server.    Configuring Windows 2003 x64 for SQL Server

There seems to be far more misinformation concerning x64 memory than there ever was with x32 but as I personally think 32 bit is now a thing of the past I won’t concern myself with x32 memory.

So this is only about x64.

That said, I’ve been making use of a utility called cacheset.exe both to set the Large System Cache and to monitor its use on Production SQL Servers for quite some time now, I have always been aware of what used to be the Sysinternals utilites from way back. This utility does work on both x32 and x64 systems.

cacheset.exe

Cacheset .exe showing current sizes. You can use this utility to set the working size but a reboot will lose the settings.

( You can also use this on the command line as a startup batch command. )

Large System Cache is a part of the advanced settings for Windows, remember that originally Windows Server was most likely seen as a file server rather than a database server and the requirements for the two are somewhat different.

The Memory Usage option alters the way the large system cache is allocated.

For SQL Server it should be Programs as shown below.

 perfoptions2

This is all well and good but in x64 land the system cache size defaults to the size of all available memory in the server, this is not good and if you cannot enable lock pages in memory then there is a risk of other activities including file caching paging out the memory used by SQL Server.

Even setting lock pages in memory ( Enterprise Edition ) isn’t really a complete solution and I found in certain circumstances the following bug had an impact on the file cache size.

http://www.grumpyolddba.co.uk/infrastructure/FileCopyBugW2003.htm

You should apply this registry fix to all your production SQL Servers as a matter of course anyway.

 

If you’re using Task Manager, I prefer Process Explorer, the one section you really need to watch to expose memory issues is the Commit Charge, if this exceeds physical memory then you will have problems. In the screen shot above you can see I have 64GB of memory, the graph indicates a use of 57.8GB which leaves about 6GB free

Commit Charge is about 60GB , it’s the total of real and virtual memory being used. If you live search you’ll find the Exchange guys have been battling with server memory for some years!

The following table indicates how to calculate the virtual cache size:

Architecture

Virtual cache size

IA-32

128 megabytes (MB) + (Physical memory - 16 MB) / 4 MB x 64 MB

IA-64

128 MB + (Physical memory - 16 MB) / 8 MB x 64 MB

The following table indicates the maximum and the minimum limits for the virtual cache size calculation:

Architecture

Minimum virtual cache size

Maximum virtual cache size

IA-32 (LargeCacheSize value = 0)

Not available

512 MB

IA-32 (LargeCacheSize value = 1)

Not available

960 MB

IA-64

128 MB

1 terabyte

 

 

Here’s something you’ll have to expect with x64, SSIS has just taken over 3GB of memory; in the case where I pulled this screen shot there were two SSIS packages running on a SQL Server with 8GB of ram ( and yes I will say that 8GB of ram for a serious x64 server is well under endowed !! ) Every 64bit application can also take all available memory, in this case the SSIS packages failed after some time, but at the expense of making the SQL Server almost comatose. Be very careful with what you install on your x64 SQL Servers. This screen shot comes from Process Explorer – I’m working on a blog post about Process Explorer, pictures in the blog have always been problematic and I’m working hard to find solutions to help me transfer my posts without losing the formatting and pictures, with varying success it has to be said!!

As we all know when you deploy SQL 2008 / SQL 2005 STD Edition you can’t enable lock pages in memory which means that SQL Server can be paged out. Well those nice Microsoft chaps have come up with a solution to the x64 System Cache, it involves a service and a registry change – there’s actually lots of options but I’ll just cover the basic setup here.

I've said before that there never seems any co-ordination of output from Microsoft, yes I know it's a big company and will suffer as all big companies seem to suffer from poor communications, but in february the Windows Debug team published this somewhat interesting post.

http://blogs.msdn.com/ntdebugging/archive/2009/02/06/microsoft-windows-dynamic-cache-service.aspx

I've been running this service for a month now ( on a x64 Server ) without any signs of any problems: What I thought I'd do was mention the original post and put out screen shots illustrating the install and so-on, I realise not everyone is confident following written instructions and a picture speaks a thousand words!

( The blog will take you to a download for this component.)

What follows is from the blog post – italics are mine

==============================================================================

The Microsoft Windows Dynamic Cache Service will manage the working set size of the Windows System File Cache. For 64 bit systems this service helps to address the problem of excessive cached read I/O that could eventually consume all of physical memory. This problem is described in this blog post:

http://blogs.msdn.com/ntdebugging/archive/2007/11/27/too-much-cache.aspx

This service will only run on Windows Server 2008 or earlier versions of Windows. Do not attempt to run this service on a version of Windows after Windows Server 2008 as it will most likely cause performance problems.

Installation

1) Copy DynCache.exe to %SystemRoot%\System32.

2) From a command prompt, run:

sc create DynCache binpath= %SystemRoot%\System32\DynCache.exe start= auto type= own DisplayName= "Dynamic Cache Service"

  • you should get this message ( below ) and the service should be shown in services

  • ( don't forget to make sure the service is actually running )

3) Import the DynCache.reg registry file. This registry file contains default settings that you will probably want to modify. ( double click on DynCache.reg )

  • you should receive this message ( below ) after clicking YES

  • searching the registry with regedit for DynCache will show as follows ( below ):

 

  • Here's the view ( above ) before installing the service and making a change to the registry, as you can see the entire 64GB of installed memory is available for system cache.

  • Here I've set the max system cache to 1GB and this is reflected in the setting shown in Cacheset ( above )
  • The last step is to add the dependency upon SQL Server, add the new key as shown below

 

Make sure you read the document ReadMe.docx that comes with the download carefully before you do anything to your server(s)

Reasons to move to SQL 2008 MERGE pt 2

In my previous post with the MERGE statement it was assumed that the pairs of data would be unique and the second set would not already exist. I can’t say that I’d actually put together a really good example and I’m not saying that this is the most brilliant of examples but it possibly makes more sense than the last!

The brief description here is that here we’re adding unique people to our database and adding a lookup to their occupation. What we want to do is pass a batch of people and their job title to a procedure, we’d like the personal details to pick up the foreign key for the Job Title, but we’d also like to add any job titles we don’t already have – all at the same time!

As I think I’d mentioned previously the MERGE statement cannot deal with duplicate entries and will throw an error which ends the batch. This piece of code gets around this and will cope with duplicate Job Titles, Existing Job Titles and New Job Titles.

I’m assuming you’d not want duplicate PersonNames.

  • The code below will create and populate two tables with some data, JobTitle and People, as before the columns that we’re using to merge on must be unique.
  • Once the tables have been populated we fill a table variable with some values and then merge it into the JobTitles first picking up the foreign key and then into the Person Table.
  • @max is set to the maximum number of duplicates and then we loop though the merge using distinct values adding any new job titles.
  • The output from the MERGE is passed into another table variable which is then joined with the original table to be merged into the Person table. ( To capture the ID of the already existing JobTitles I’m just updating the created date )

I’ve left the output as selects so you can see clearly what is happening.

My personal view is that SQL 2008 BOL is pretty rubbish, I wasn’t too pleased with 2005 BOL and I think a lot of the careful work in SQL 2000 BOL has been lost – try looking up Federated Servers – just like Vista too many clicks leading to other clicks which don’t actually lead you anywhere!  At least the MERGE statement is in one section, but I don’t find the explanations or the syntax very easy to follow.

Very powerful stuff, like the pivot command – but a nightmare to follow!

--
-- create base tables and data
--
IF OBJECT_ID('dbo.JobTitle', 'U') IS NOT NULL DROP TABLE dbo.JobTitle;
create Table dbo.JobTitle
( JobID int identity(1,1) not null,
JobName varchar(50) not null,
CreateDate datetime not null
);
Alter table dbo.JobTitle add constraint PK_JobTitle Primary Key Clustered (JobID);
alter table dbo.JobTitle add constraint UK_JobTitle_Name unique (JobName);
--
insert into dbo.JobTitle(JobName,CreateDate)
values
('Welder',GETDATE()-1),
('Fabricator',GETDATE()-2),
('Roof Carpenter',GETDATE()-3),
('Roof Tiler',GETDATE()-4),
('Plumber',GETDATE()-2),
('Electrician',GETDATE()-4),
('Dog Walker',GETDATE()-3),
('Mouse Handler',GETDATE()-2);
--
IF OBJECT_ID('dbo.People', 'U') IS NOT NULL DROP TABLE dbo.People;
create table dbo.People
( PersonID int identity(1,1) not null,
  PersonName varchar(50) not null,
  JobID int not null,
  CreateDate datetime not null
);
Alter table dbo.People add constraint PK_People Primary Key Clustered (PersonID);
alter table dbo.People add constraint UK_People_Name unique (PersonName);
--
insert into dbo.People(PersonName,JobID,CreateDate)
values
('John Smith',1,GETDATE()-1),
('Andy Williams',2,GETDATE()-2),
('Buddy Holly',3,GETDATE()-3),
('Ozzie Osbourne',4,GETDATE()-4),
('Sharleen Spiteri',5,GETDATE()-2),
('Christoph Doom Schneider',6,GETDATE()-4),
('Paul Landers',7,GETDATE()-3),
('Oliver Riedel',8,GETDATE()-2);
--
--
DECLARE @Table table( SeqKey INT identity(1,1) NOT NULL PRIMARY KEY,
  JobName varchar(50) not null,
  PersonName varchar(50) not null);
  declare @max tinyint;
--
INSERT INTO @Table(JobName,PersonName) 
VALUES
('Welder','Jonas Renkse'),
('Welder 2nd Class','Anders Nystrom'),
('Grave Digger','Fred Norrman'),
('Assistant Chef','Mattias Norrman'),
('Welder','Daniel Liljekvist'),
('Mouse Walker','Anne-Marie Helder'),
('Mouse Handler','Bryan Josh'),
('Welder','Joe Satriani'),
('Drummer','Nick Mason'),
('Drummer','Heather Findlay');
--
-- simulate the proc here
-- 
select @max = max(maxdups) from 
( select distinct JobName, COUNT(*) as maxdups from @Table group by JobName having COUNT(*) >1) as z;
declare @results table (JobID int not null,JobName varchar(50) not null);
while @max>=1
begin
MERGE INTO dbo.JobTitle AS wi  
      USING ( select distinct Jobname, COUNT(*) as HowMany from @Table group by JobName having COUNT(*) = @max) AS tbl  
            ON wi.JobName = tbl.JobName
      WHEN NOT MATCHED THEN -- if not there insert
         INSERT ( JobName,CreateDate)
            VALUES ( tbl.JobName,getdate())
         --OUTPUT   inserted.JobID,inserted.JobName into @results(JobID,JobName);
         when matched then
          UPDATE SET CreateDate = GETDATE()
        
         OUTPUT   inserted.JobID,inserted.JobName into @results(JobID,JobName);
    set @max-=1;
end
--
select * from @Table AS tbl join @results as RES on tbl.JobName = RES.JobName
--
MERGE INTO dbo.People AS wi  
      USING ( select tbl.PersonName,res.JobID from @Table AS tbl join @results as RES on tbl.JobName = RES.JobName ) as peop
            on wi.PersonName = peop.PersonName
WHEN NOT MATCHED THEN -- if not there insert
      INSERT ( PersonName,JobID,CreateDate)
            VALUES ( peop.PersonName,peop.JobID,getdate())
          OUTPUT   $action,inserted.*;
.csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; }

Reasons to move to SQL 2008 - MERGE

One of the great things for me currently is that my client is x64 SQL 2008, which means I can use all those features that make SQL Server 2008 such a great product.

Anyway I have a very poor example of resolving a batch insert problem – you should realise that there are a number of restrictions and rules concerning MERGE but I’ll just pass these by for this example.

I have a number of records passed into a stored procedure as a table ( another great feature of sql 2008 ), these first have to check for existence in a sub table ( we’ll call it ) if one of the passed values exists in this table we must ignore the row, otherwise we have to insert the row and return the key for the inserted value to pass into a second table with the remainder of the data.

So what we’re doing doing is an insert into two tables based around an if exists for the first table. For a single row it’s not difficult to code but ifn you want to handle say 50 rows then we’re into temporary tables, loops – well it gets messy ( I’d never use triggers just in case someone thinks of using those!! )

Here’s my very poor working example, used with a proc you can create a table type and pass the table into the proc; but you can’t do this in  a straight query – well I got some very strange errors, but maybe it was the example I wrote!

( not quite sorted my publisher so lost the formatting – fixed next time )

--
-- create base tables and data
--
IF OBJECT_ID('dbo.Widgets', 'U') IS NOT NULL DROP TABLE dbo.Widgets;
create Table dbo.Widgets
( WidgetID int identity(1,1) not null,
WidgetName varchar(50) not null,
CreateDate datetime not null
);
Alter table dbo.Widgets add constraint PK_Widgets Primary Key Clustered (WidgetID);
alter table dbo.Widgets add constraint UK_Widgets_Name unique (WidgetName);
--
insert into dbo.Widgets(WidgetName,CreateDate)
values
('BlueMeerkat1',GETDATE()-1),
('BlueMeerkat2',GETDATE()-2),
('BlueMeerkat3',GETDATE()-3),
('BlueMeerkat4',GETDATE()-4),
('BlueMeerkat5',GETDATE()-2),
('BlueMeerkat6',GETDATE()-4),
('BlueMeerkat7',GETDATE()-3),
('BlueMeerkat8',GETDATE()-2);
--
IF OBJECT_ID('dbo.Wotsits', 'U') IS NOT NULL DROP TABLE dbo.Wotsits;
create table dbo.Wotsits
( WotsitID int identity(1,1) not null,
  WotsitName varchar(50) not null,
  WidgetID int not null,
  CreateDate datetime not null
);
Alter table dbo.Wotsits add constraint PK_Wotsits Primary Key Clustered (WotsitID);
alter table dbo.Wotsits add constraint UK_Wotsits_Name unique (WotsitName);
--
insert into dbo.Wotsits(WotsitName,WidgetID,CreateDate)
values
('YellowWotsit1',1,GETDATE()-1),
('YellowWotsit2',2,GETDATE()-2),
('OrangeWotsit1',3,GETDATE()-3),
('OrangeWotsit2',4,GETDATE()-4),
('CheeseWotsit1',5,GETDATE()-2),
('CheeseWotsit2',6,GETDATE()-4),
('OnionWotsit1',7,GETDATE()-3),
('OnionWotsit2',8,GETDATE()-2);
--
--
DECLARE @Table table( SeqKey INT identity(1,1) NOT NULL PRIMARY KEY,
  WidgetName varchar(50) not null,
  WotsitName varchar(50) not null);
--
INSERT INTO @Table(WidgetName,WotsitName)
VALUES
('BlueMeerkat3','LeekWotsit1'),
('BlueMeerkat9','LeekWotsit2'),
('BlueMeerkat3','LeekWotsit3'),
('BlueMeerkat4','LeekWotsit4'),
('BlueMeerkat10','LeekWotsit5'),
('PinkMeerkat6','LeekWotsit6'),
('PinkMeerkat4','LeekWotsit7'),
('PinkMeerkat5','LeekWotsit8');
--
-- simulate the proc here
--
declare @results table (WidgetID int not null,WidgetName varchar(50) not null);
--
MERGE INTO dbo.Widgets AS wi 
      USING @Table AS tbl 
            ON wi.WidgetName = tbl.WidgetName
WHEN NOT MATCHED THEN -- if not there insert
      INSERT ( WidgetName,CreateDate)
            VALUES ( tbl.WidgetName,getdate())
OUTPUT  -- return the value and id of the rows we did insert
            inserted.WidgetID,inserted.WidgetName into @results(WidgetID,WidgetName);
--
MERGE INTO dbo.Wotsits AS wo
USING @Table AS tbl join @results as RES on tbl.WidgetName = RES.WidgetName
  ON wo.WotsitName = tbl.WotsitName
WHEN NOT MATCHED THEN -- if not there insert the row
  INSERT (WotsitName, WidgetID, CreateDate)
  VALUES ( tbl.WotsitName, RES.WidgetID, getdate() )
OUTPUT -- output what we inserted in plain text
$action, inserted.WotsitID,inserted.WotsitName;
-- endproc

Technorati Tags:


Posted by GrumpyOldDBA with no comments

Does a GUID make a good clustered key?

I've covered the use of unique identifiers ( guids ) in my presentations and in my articles on anlysing indexes; however I probably wasn't quite as succinct as I could have been.

So do I like unique identifiers as clustered keys? Absolutely not unless there is no other choice.

Why? Well other than all guid implemetations I have seen do not use sequential guids it's a top step to making sure your your base table always fragments badly.
But I really dislike them when we get to consider large tables, where we have millions if not hundreds of millions of rows.
A guid is 16 bytes wide and the width of a clustered index is added to every row of a secondary index.
I cover the subject in depth in part 4 ( sorry it takes a while to load) http://www.grumpyolddba.co.uk/sql2005/sql2005.htm.

Anyway - Consider a worst case scenario where we have an secondary index index on a nullable tinyint where the clustered key/index is a guid.
How big is your index per row?
Well a  tinyint is 1 byte
Each row of the index will actually be 16 bytes + 1 byte overhead + 1 byte for the tinyint + 3 bytes for being a nullable column
Actual size 21 bytes per row for a 1 byte data type.
Using an int for the clustered key will remove 12 bytes per row for the index
removing the null will remove a further 3 bytes.
But just using an int in place of a guid will save you 115 MB on every secondary index on a 10 million row table.
On a billion row table this equates to 11 GB and if you factor in the null this would save in total 14 GB per index, as most tables I  encounter have many indexes you could be really making a real meal of it.

Now you may say who has tables that big, well a few years ago I was facing 400 million row tables and my current project is likely to hit a billion rows or more in some tables. I don't have any guids as you might expect.

Now the other downside of guids is that your foreign keys also become guids so a simple index on a key + one FK will come in at 33 bytes per row. Now indexes take up buffer cache and as databases beome larger then storing bloated indexes in cache is best avoided. ( You could of course use compressed keys to save space, and I have a large post coming on that as part of the "Can I index a bit" )

I'm having to exist on mobile internet as my broadband is being switched so it's made posting slightly more tricky

NTFS Fragmentation may be more harmful than you think

I’ve posted previously about the adverse affect of ntfs fragmentation and I’ve also mentioned that it is sometimes suggested that fragmentation is not a concern on a SAN.

I’ve yet to finish testing this on a SAN, after all you don’t often get a free play with a SAN for this type of testing, but I did find that extreme fragmentation of the mdf files on a SAN did degrade performance .. sadly I didn’t have the time or resource available to complete my tests.

However, I see that  CSS SQL Server Engineers ( Microsoft Customer Service and Support (CSS) SQL Support ) have put out quite an extensive post which covers the subject of ntfs fragmentation and SQL Server.

http://blogs.msdn.com/psssql/archive/2009/03/04/sparse-file-errors-1450-or-665-due-to-file-fragmentation-fixes-and-workarounds.aspx

What is interesting is that a KB which only lists vista has been marked as applying to W2K3 and Windows 2008  http://support.microsoft.com/kb/967351

I suggest having a good read of this if you don’t normally pick up posts from this blog.

Thin Provisioning and ntfs

I'm currently working on a SQL Server storage project and as part of that I'm touching a number of areas that may be of interest to the DBA.

The IT industry is full of buzz words, new fads, fashionable terms and so on; and Thin Provisioning was something I was aware of but not fully informed, not that I am now mind. Anyway sad to admit amongst the publications I read is Storage Magazine  http://www.storagemagazine.co.uk/  I like to keep abreast of the latest trends which is why I also buy MSDN magazine although I can't claim to understand a great deal of it, but at least I might get an insight into what's hot or not for developers with regard to anything which might touch SQL Server.

Anyway amongst the reading material was a comments about Thin Provisioning not really working with ntfs; now thin provisioning allows you to allocate storage you don't have but trick the applications into thinking you have. The theory is that shoudl you maybe need 1 TB for a database but currently it's only using 100GB you provision the 1TB but onlygive 100GB. The application thinks it has 1TB and as storage use increases the storage pool automatically increases the actual storage. Now doesn't that sound very much like what our banks were doing with their lending?

Anyway you might like to read a general article here http://searchstorage.techtarget.com/news/article/0,289142,sid5_gci1188117,00.html  and I also found a really good windows article by a Microsoft MVP, Dilip Naik, http://msftmvp.com/default.aspx   his article on thin provisioning is here  http://msftmvp.com/Documents/ThinNTFSv2.pdf

Now as many people know I'm fairly cynical about SAN storage and the apparent lack of performance, however I'm not going to make any comment about this subject matter - I suggest you read carefully and then be sure to ask the right questions if your storage provider has this facility. Some storage vendors have  different terms for it as you might guess!

Here's an article which describes Thin Provisioning in more detail http://searchstorage.techtarget.com/news/column/0,294698,sid5_gci1134713,00.html?mboxSession=1235686994531-668391&mboxSession=1235686994531-668391#

The links to searchstorage are somewhat messy with lots of ads and popups but if you can get through the rubbish there are some good articles. I should mention that Dilip has an excellent white paper on using SSDs too and Joe Chang  http://sqlblog.com/blogs/joe_chang/default.aspx     is also promising some research with SSDs and SQL Server. I'm planning to buy another 4 SSDs but I'm not sure I'll ahve enough for a meaningful test.

 

Posted by GrumpyOldDBA with no comments
Filed under: , ,

Can I Index a bit ? ( Part 1 )

The final part wil follow and a fully illustrated pdf will be on my website - I'll post the location when it's complete.

 

Can I index a bit?

VBUG  Presentation 27th January 2009 Bracknell.

 

Here's a brief summary of subject matter and the points raised.

 

Q.   Is indexing a black art: will zen help: will the queries work better with a full moon?

A.   Yes!

 

The presentation intends to show practical examples of how things are rather than how things should be.

The reality is that few databases match classroom best practices, this presentation is  about what actually works ( sometimes ) based upon real world  experience,  as to will these examples always be true, well,  to coin an oft used phrase  " It just depends "

 

NB. I struggle to sometimes make large blog posts which include pictures and code samples, I will illustrate and code a pdf version of this on my website www.grumpyolddba.co.uk

 

Bad Table:

  • This was the introduction where I showed a table with a wide clustered primary key and all other columns allowing nulls. I asked what, as a DBA, I might not like about this table, now read on.

 

Wide clustered indexes:

  • Bad because the width of the clustered index is added to every row of every secondary index.
  • If you need a wide Primary Key then make it non clustered and cluster on an integer if possible that makes up your PK.
  • For that reason GUIDs make bad clustered keys as they are 16 bytes wide compared to an int at 4 bytes.
  • If you must use a GUID then choose sequential guids as this will reduce table fragmentation.
  • e.g. You may calculate that a secondary index defined on an integer column would fit approximately 8,000/4 rows per page, 2,000 rows per page.
  • o However, if your clustered key consists a guid then each entry for each row for that index is actually 16+4+1 bytes, this actually gives approx 381 rows per page
  • o For an integer clustered key the capacity would be approx 890 rows per page. In real terms this is 11.8Mb difference for an integer index on a 1 million row table, and that's for each secondary index.
  • o This also equates to buffer storage for secondary indexes and page reads required for seeks and scans.

 

The Truth about clustered indexes:

  • Traditionally the clustered index has been marked for range scans, typically dates and so on.
  • What is often overlooked is that the clustered index is the entire table.
  • Showing a select of a range of integer values from an integer clustered primary key showed that despite the result set only being an integer the entire row for each value was placed in buffer cache.
  • o ( shown by the number of page reads )
  • The solution was to create a non clustered index which duplicated the clustered index, the same query now used the secondary index and in the example shown reduced page reads from 175 to 5.
  • The point being made here is that if the clustered primary key is being used in a query, say a six table join, every clustered index seek or scan will have returned the entire row regardless of how many columns were required to satisfy the query, this can be a big hit in terms of io and usage of the buffer cache.

 

A clustered index seek in a query plan is good:

  • Well actually "it just depends".
  • Depending upon the where clause a seek may actually be a scan.
  • I returned 1,999,999 integer values from the clustered key from a 2 million row table, this is shown quite clearly as a seek.
  • o ( using a where clause of < 2000000 )
  • I then returned all 2 million rows
  • o ( using a where clause of <= 2000000 )
  • The point here was that when looking at query plans don't just assume a seek is automatically good, you must also consider how many rows the seek is returning.

 

What should the Primary/Clustered key be:

  • I said that I was leaning towards the use of a sequential numeric key, a surrogate key if needed.
  • I tended to only see the clustered key as a means to defragging the table and of having limited use in data retrieval queries.
  • o Well how many single column( key) queries do you do on a clustered PK?
  • o How often do you actually want the entire row of every table in say a 5 table join.
  • Did I think the Primary key should be part of the real data?
  • Yes this is valid, but not always the clustered key.

 

The curse of the null:

  • This one is easy.
  • I ran a query in the background against a 400 million row table.
  • A nullable column makes every row in an index 3 bytes bigger than a non null column.
  • o This is 3 bytes per column, index 4 nullable columns you get 12 bytes per row extra
  • The 400 million table results showed that a single integer index was 1.14GB larger for the nullable column.
  • That's a big difference in storage, io, buffer usage, backups
  • With an integer clustered key the record size per row for an integer secondary index are 12 bytes for a nullable column and 9 bytes for a non null.
  • o In page allocation this is 896 rows per page vs 671 rows per page
  • Finally a table which has all nullable columns accepts an entry of nil data ( except the PK ) not a particularly good design concept.

 

Talking of nulls:

  • How do you deal with multiple nulls where the data must be unique?
  • For UK readers we have a National Insurance number allocated to us.
  • This number is unique and is a key part of data collected by employers, so a HR database which stores NI numbers must make sure the stored numbers ( which actually are alphanumeric ) are always unique.
  • But life isn't always like that and new employees might not immediately know their NI number so you store "Unknown" or null. Trouble is that you may only have one null or unknown in your column.
  • Prior to SQL 2008 you could solve this by using an indexed view defined on the NI number.

 

 

create view dbo.vw_People_NiNumber

with schemabinding

as

select NiNumber  from dbo.People where NINumber is not null;

--

-- now index the view

--

create unique clustered index idx_vw_People_NiNumber on dbo.vw_People_NiNumber(NiNumber);

--

-- show it has materialised

--

exec dbo.sp_spaceused  'dbo.vw_People_NiNumber';

 

 

  • This will allow you to have multiple nulls but the stored data will be unique
  • SQL 2008 provides filtered indexes to resolve this, below.

 

-- now in sql 2008

--

create unique index uk_People_NiNumber on dbo.People(NiNumber) where NiNumber is not null;

 

 

How do you create an index that spans more than one table?

  • Now there's a handy thought if I ever heard one, however, in SQL Server an index can only include columns from a single table.
  • Well that's actually not strictly true, you can do this with an indexed view.

Indexed Views to replace joins:

  • Indexed views can be useful, typically they may be used to store aggregated data.
  • As a tuning aid I will consider using an indexed view to replace tables in a join.
  • To explain, the view is used ONLY to satisfy part or all of a joined query, I don't call the view directly or wish to do so.
  • The optimiser is clever enough to recognise if an indexed view "covers" part of a query and will use it replacing the original tables.
  • This is a three table join

 

-- query to show additional data for People

--

select p.*,sp.name,cr.name

from dbo.People p join dbo.StateProvice sp on p.StateProvinceID = sp.StateProvinceID

join dbo.CountryRegion cr on cr.CountryRegionCode = sp.CountryRegionCode;

--

 

 

  • The view replaces two of the tables and the join ( StateProvince and CountryRegion )
  • When the query runs again the optimiser will use the indexed view instead of the two tables
  • ( this is better viewed on http://www.grumpyolddba.co.uk/ )

 

 

-- create view

--

create view dbo.vw_RegionStuff

with schemabinding

as

select sp.StateProvinceID, cr.CountryRegionCode,    sp.name as Province ,cr.name as Region

from dbo.StateProvice sp

join dbo.CountryRegion cr on cr.CountryRegionCode = sp.CountryRegionCode;

go

--

-- create index

--

create  unique clustered index  cix_RegionStuff on vw_RegionStuff(StateProvinceID,CountryRegionCode,Province,Region);

go

 

 

 

Posted by GrumpyOldDBA with 1 comment(s)
Filed under: , ,

Tracking contention on the SAN - testing Times

 
One hotly disputed topic on SAN performance is the matter of contention.This may materialise in the fabric due to over subscription or excessive fan in. A fibre channel fabric is a network which connects servers and storage through switches, just like an Ethernet network connects us to the internet and to each other in the office.

Ultimately there will be less ports on the actual storage then there are servers and this is where problems may arise.

 
 basic image of a core edge networkpicture                       

 

 

 

 

 

 

 

 

 

 

 

 

 

 

·         This is a highly simplified drawing illustrating the principles of a core edge network. There are 10 servers each with let us say 2 x 4GB HBAs. Network speeds always take the lowest common denominator so the two connections into the storage will also be at 4GB.  ( This is an over simplification )

·         Essentially there’s more potential bandwidth than supported by the storage;  best example I can think of might be the security scans at an airport, 20 check in desks but only 2 body scanners. ·         An area of contention can be the actual disks or luns, there are different ways of configuring SAN storage, luns may be mapped to physical sets of disks or mapped virtually to an unspecified number of spindles. ( We won’t go into this here ). If you share disks/luns there is always the possibility of contention. You can try this at home if you’ve created more than one partition on your laptop or PC harddrive. Just run a job or task, then run repeat but whilst copying a large file(s) to the other partition. ·         It is possible that the backplane bandwidth may not be sufficient, most storage has a number of ports, these might be allocated as 8GB, 4GB and 1GB and may relates to types of disks or certain trays, 8 x 8GB ports suggests 64GB bandwidth, however this may not be the case – check with your friendly vendor. ·         Finally the HBAs and switches may suffer due to buffering, SQL Server tends to push out lots of small io, this is different from a fileserver and there may be latency within the HBA and/or switches. ·         So how can you tell? Well I’ve been testing by running a job at regular intervals, it’s up to you what you choose but ideally it should be able to provide consistent run times and results, should be repeatable and ideally portable. It should, in my view, be a sql server test using sql code.  ·         Here’s what you might see as results from an hourly test, Red designates failure due to an overrun of time.·         The results are hypothetical, but one might deduce that 18:00 shows users going home, the red areas may indicate this when backups occur, the 13:00 run is better, lunchtime maybe?·         If your application is 7 x 24 then this type of result is bad news. ·         Bear in mind this is from a test so the results are not affected by lunch breaks or going home, but most likely the SAN is.·         Some typical contention I have encountered over the years has come from Exchange, this manifested as increased io latency with no change in application load, yet another subject area!  
  
Date – start time Duration( h:m:s )
12/01/2009 08:05 00:34:35
12/01/2009 07:05 00:36:32
12/01/2009 06:05 00:33:42
12/01/2009 05:05 00:44:58
12/01/2009 04:05 00:37:19
12/01/2009 03:05 00:43:32
12/01/2009 02:05 00:54:59
12/01/2009 01:05 00:55:00
12/01/2009 00:05 00:54:59
11/01/2009 23:05 00:54:59
11/01/2009 22:05 00:54:59
11/01/2009 21:05 00:54:59
11/01/2009 20:05 00:46:09
11/01/2009 19:05 00:34:32
11/01/2009 18:05 00:23:43
11/01/2009 17:05 00:27:58
11/01/2009 16:05 00:28:39
11/01/2009 15:05 00:34:25
11/01/2009 14:05 00:32:45
11/01/2009 13:05 00:29:47
11/01/2009 12:05 00:33:36
11/01/2009 11:05 00:29:41
11/01/2009 10:05 00:33:51
11/01/2009 09:05 00:36:48
11/01/2009 08:05 00:31:22
            
Posted by GrumpyOldDBA with 1 comment(s)
Filed under: , ,

SQL Server 2008 Information

I was searching for a particular document for SQL 2008 and thought I'd list the links to what I found, there's some especially good white papers in Technet, I can recommend the T-SQL enhancements and the indexed view white papers especially:

SQL Server 2008 White Papers         http://www.microsoft.com/sqlserver/2008/en/us/white-papers.aspx

Technet SQL 2008 White Papers      http://technet.microsoft.com/en-us/library/bb418496.aspx

A list of blogs and other sites, sadly no Grumpy Old DBA   http://msdn.microsoft.com/en-us/sqlserver/bb671054.aspx

 

 

Posted by GrumpyOldDBA with no comments
Filed under: ,

Testing Times - mdf fragmentation

Now I've never been convinced that file level fragmentation is irrelevant on a SAN, at least two sources have assured me that this is the case, but then they also assured me of a number of other points most of which it appears I have proved to be basically incorrect. < grin >

Now I've followed with great interest a series of posts by Linchi Shea  http://sqlblog.com/blogs/linchi_shea/ concerning SAN fragmentation, if you don't subscribe to this blog then you're really missing a great source of technical knowledge, without this blog I would have struggled ( well I still struggled but that's another story ) with trying to get the storage teams to consider HBA queue depth for a start.

So I decided to come up with the insane fragmentation test: Assume that there is no DBA and you're creating a number of databases on your server; here's what I did;

I created three databases and then ran my test 1,  which creates and populates a 1 million row table of approx 8.5GB, in each database simultaneously. The population of the databases caused auto growth and true to form windows + sql server managed to totally fragment all three databases despite ample free space on the LUN.

On a serious note this should be taken as a warning as to what can happen if you don't manage database growth. Here's the output showing my fragmented databases

 Fragmentation insanityThe three databases are called Stresful, rubbish1 and rubbish2.

 

 

 

 

 

 

 

 

 

 

 

 

 

  • I then emptied the datafiles of data and then ran my scripts which populated the three databases with 1 million rows of data each, I did this three times.
  • I dropped all the databases, created non fragmented databases and repeated the test three times.
  • The three runs against the fragmented databases took on average 106 mins.
  • The three runs on non fragmented databases took on average 16 mins

It's not for me to name the particular SAN that I used for this test but it is a serious enterprise bit of kit and is claimed to be able to scale with every condition known to man and maybe a few more!  To the best of my knowledge the HBAs are 4GB, there's more than 1 of course because this test was run on a cluster.

  • Logically as this was all data inserts I knew the performance would be bad and I intend to repeat the test with a smaller number of fragments and to test updates as well as inserts, see my other posts.
  • We don't always have just one database on our servers and we don't always have a sperate lun for each mdf file, much as some of my tests showed that creating multiple files ( not filegroups ) for your database could degrade performance the movements of the disk heads I would suggest are the factor here.

Linchi only did a test on single files, maybe running multiple tests would produce a different result.

Maybe the particular SAN I'm testing against is different, sadly like probably most DBAs I don't have acess to multiple hardware platforms, it's only due to a data centre migration I've been able to run these tests, I'm attempting to show that the migration does not bring any degredation of performance compared to our current data centre.

I should give a quick few words of thanks to Tony Rogerson http://sqlblogcasts.com/blogs/tonyrogerson/  who has fended a number of my questions and who also run SAN benchmarks  http://sqlblogcasts.com/blogs/tonyrogerson/archive/2006/09/22/1089.aspx 

I hope this post formats correctly - always a bit of a gamble when adding images!

 

 

Posted by GrumpyOldDBA with no comments
Filed under: ,

CompareStorage@GrumpyOldDBA

As a slightly seasonal touch of lightness I though I’d just do a comparison of DAS and SAN storage

 

Item

DAS SAN

Basic Storage Cost

Good

Hideously expensive

Ease of Use

Simple

How many storage engineers and system administrators does it take to change a lun ?

Performance Monitoring

Easy

Almost Impossible

Ability to modify configuration

Easy

Most “ automatic/dynamic “ so largely not.

Actual Performance

Predictable

Who knows

Expandability

Pretty simple

Expensive e.g. how much for a tray of disks ????????

Redundancy

Dual controller cards, dual cabling, multiple power supplies, hot spare disks – pretty good.

Allegedly very good.

  

At a recent user group meeting I posed the question about DAS vs SAN performance and were DBAs generally very satisfied with SAN storage. The overall impression I got was that generally us DBAs don’t feel we get value from the tin and that the storage guys are not very keen to talk to us.

 

A disk is a disk is a disk so I’m often slightly perplexed that when I test I get such differing results, it’s pretty obvious to me that there is contention on most sans, and here I mean the whole Storage Area Network not just the actual storage.

 

I’m still waiting for a rational explanation as to why a handful of internal SAS disks can consistently outperform a couple of very expensive SANs over a wide range of different tests including application code.

( The exception here being ioMeter which it seems you can configure to prove one set of storage runs faster than another when tested with ioMeter when every other test proves ( ? ) otherwise – I’m still attempting to get my head around this one ) 

As ultimately the systems I tend to support all have a requirement for performance I find it increasingly difficult to justify the use of a SAN with SQL Server.

 

As a byline I’ve just replaced the hard drive in one of my laptops with a SSD , http://www.offtek.co.uk/item_detail.php?id=575331&maincat=1&subcat=99

 

It’s not the fastest SSD in the world but it does appear to make the whole laptop so much quicker, certainly startup of XP is pretty quick, as laptop disks are so slow this is quite a good way to breathe new life into your laptop.

I’m still thinking about SSD for my server !!

 

Posted by GrumpyOldDBA with 1 comment(s)
Filed under: ,
More Posts Next page »