SQLBits X : NUMA Internals Session
22 January 12 09:13 PM | GavinPayneUK | with no comments

For various (good and positive) reasons I never seem to get the time to write blog articles these days.  However, with the authoring of my next SQLBits session now on my to-do list, I thought I’d take a moment to explain the background behind it and what it’s likely to contain.

What is NUMA?

NUMA stands for Non-Uniform Memory Access and is a server hardware architecture which any modern server with more than one CPU will use.  In two sentences: to make servers scale better, each CPU in a server is now only directly connected to specific pieces of the server’s physical memory.  If a CPU needs to access memory it isn’t directly connected to, it has to go via another CPU’s memory controller; this adds latency to the memory request, so in an ideal world modern software should do its best to avoid ever needing to read “foreign memory”.  There’s also something called Soft-NUMA, which can be misleadingly positioned as being similar to Hard-NUMA; I’ll talk about that more in my session.

What’s this got to do with SQL Server?

Today, SQL Server is regularly deployed on servers with several CPUs and large amounts of memory, and as a consequence both SQL Server and Windows are NUMA aware.  They each do their part in trying to ensure these expensive “foreign” memory requests are kept to a minimum: Windows at the CPU instruction level, SQL Server at the SQLOS scheduler level.

Curiousness develops

You can’t influence how SQL Server works in the NUMA world, it all happens deep within the internals, but you can see NUMA “in action”.  A lot of information about how SQL Server is internally load balancing its workload and caches between the server’s NUMA nodes is exposed in the DMVs.  And if you’re anything like me, when you start looking into these DMVs you’ll be curious about what’s actually happening internally.  How do parallel queries run across multiple schedulers when there is a “buffer pool” per NUMA node?  Why does all the memory seem to be assigned to one NUMA node?  And so on.
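
If you want to start poking around yourself before March, the sketch below shows the kind of DMV queries I mean.  It’s a minimal example which assumes SQL Server 2008 or later; the exact columns in sys.dm_os_nodes and sys.dm_os_buffer_descriptors vary a little between versions.

--Schedulers and workers per NUMA node (the DAC has its own hidden node)
select node_id,
       memory_node_id,
       node_state_desc,
       online_scheduler_count,
       active_worker_count,
       avg_load_balance
from sys.dm_os_nodes
where node_state_desc <> 'ONLINE DAC';

--Rough spread of buffer pool pages across the NUMA nodes
select numa_node,
       count(*) as buffer_pages,
       count(*) * 8 / 1024 as buffer_mb
from sys.dm_os_buffer_descriptors
group by numa_node
order by numa_node;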

Friday session

It was these questions which made me curious enough to research the topic for myself earlier this year, and I hope that those who come to my session on the Friday in March will leave equally as impressed as I was by how SQL Server “floats” across your server’s motherboard.

Through the Virtual Microscope–SQL Server and Hyper-V
11 November 11 06:53 PM | GavinPayneUK | 6 comment(s)

(Version 1.1!)

In my recent SQLBits conference session, Through the Virtual Looking Glass, available to watch here, I spoke about monitoring SQL Server in a virtualised environment.  We looked at good and bad contention, where the resource pressures which can adversely affect SQL Server might come from, and most importantly how we can monitor the environment to detect them.

Since then, I’ve been in discussions with some of the Microsoft product team who are focussed on optimising Windows 8 Hyper-V and SQL Server 2012, specifically Guy Bowerman and Serdar Sutay.  We’ve been talking about some of the challenges SQL Server professionals have today working in virtualised environments and how we could resolve them through future product functionality.  An example of a recent change they’ve engineered into the product is support for hot-add memory in the Standard edition of SQL Server 2012 when running in a virtual environment; for physical servers it remains Enterprise Edition only.  This will allow lower-end users of SQL Server to still benefit from the Dynamic Memory feature of Windows Server 2008 R2 Service Pack 1.

In both my presentation and conversations with Microsoft I’ve been discussing the importance of knowing as much as we can about the physical host server, its utilisation, the hypervisor’s configuration as well as what’s happening inside the virtual server – the SQL Server component.  To help, Serdar kindly sent me some information about Hyper-V information exposed inside the virtual machine’s registry.

As a result, I’ve written a Powershell script for DBAs which gives a bit more of an insight into the Hyper-V environment SQL Server is running within.  The script, which can be found at the bottom of this blog article, gives the following information:

  • The hostname of the virtual server the script is running on – for reference purposes.
  • The hostname of the physical host server the script is running on – this is useful to be able to tell if the virtual server has been Live Migrated to another physical host server without the DBA knowing.  Why?  Well, this may be an issue if the new physical host server is of a much lower spec than the one it was on previously; if SQL Server jobs start running more slowly than they did yesterday then the problem might have nothing to do with SQL Server at all.  Instead, the new host server might be slower or more highly contended.
  • A check to see whether or not it’s possible for Hyper-V’s Dynamic Memory feature to be used on the current host server.  While we may not be able to tell as easily whether Dynamic Memory is actually configured for use, knowing whether or not it could be will at least help with memory-related performance troubleshooting.
  • Finally, the last two checks give information about the physical CPUs in the host server.  This information may not be directly relevant to the performance of the virtual server, as CPU resource may be restricted or contended, but it is a good way to help identify whether you’re using the same physical server as you were yesterday.  If your virtual server has been moved, at least you can tell whether you should be getting the same CPU performance as you were previously.

It’s a fairly basic script which I’m sure could fail badly due to its lack of error handling, however it worked as expected from a Powershell session.

The script is below, apologies if the formatting is slightly odd:

#Get Hyper-V environment data exposed to the guest in the registry
$Path = 'HKLM:\SOFTWARE\Microsoft\Virtual Machine\Guest\Parameters'
$Key = Get-Item -LiteralPath $Path

$PhysicalHostname = $Key.GetValue('PhysicalHostName')
$VirtualHostname  = $Key.GetValue('VirtualMachineName')
$OsMajor          = [int]$Key.GetValue('HostingSystemOsMajor')
$OsMinor          = [int]$Key.GetValue('HostingSystemOsMinor')
$ServPack         = [int]$Key.GetValue('HostingSystemSpMajor')

#Dynamic Memory needs the host to be Windows Server 2008 R2 SP1 (6.1 SP1) or later
if (($OsMajor -gt 6) -or (($OsMajor -eq 6) -and ($OsMinor -ge 1) -and ($ServPack -ge 1)))
{ $DynamicMemory = 'Supported' }
else
{ $DynamicMemory = 'Not supported' }

#Get physical CPU information from WMI (uses the first CPU if there are several sockets)
$CPU     = @(Get-WmiObject Win32_Processor)[0]
$CPUName = $CPU.Name
$CPUMHz  = $CPU.MaxClockSpeed

Write-Host 'The hostname of this virtual machine is               :' $VirtualHostname
Write-Host 'The hostname of the physical host server is           :' $PhysicalHostname
Write-Host ''

Write-Host 'Hyper-V Dynamic Memory support on this host server is :' $DynamicMemory
Write-Host ''

Write-Host 'The CPU type in the physical host server is           :' $CPUName
Write-Host 'The CPU speed in the physical host server is          :' $CPUMHz 'MHz'

Through the Virtual Looking Glass
06 October 11 09:24 AM | GavinPayneUK | with no comments

Many thanks to all those who helped produce and manage the SQLBits 9 conference in Liverpool this weekend and to everyone who attended.  Coeo as a team enjoyed the long weekend’s events and I’m especially grateful to all those who attended my session “Through the Virtual Looking Glass – Monitoring Virtualised SQL Servers”.  Or virtualized, depending on your choice of spelling.

The purpose of this session was to explain why the approach you take to monitoring a virtualised instance of SQL Server can be massively different to how you would monitor a traditional physical server running SQL Server, along with some supporting theory about how hypervisors work.

The slides from my session can be found here, with the dynamic memory checking T-SQL script I used below.

In summary, there are three areas to consider:

  • CPU – a big change if you’re interested in how much physical CPU your server is using, rather than logical CPU utilisation.

  • Memory – a big change if you’re using a hypervisor which supports Dynamic Memory or your vendor’s equivalent.  “Your physical memory may go down as well as up”.

  • Storage – no significant changes, although contention points such as HBAs, NICs and storage controllers should be reviewed.

Monitoring SQL Server’s Memory in a Dynamic Memory world

During my session I demo’d how Hyper-V reacts to SQL Server wanting more memory and how SQL Server in turn reacts to that memory becoming available. Explaining how that works is best left to the SQLCat team who have an excellent whitepaper here.

In the meantime, below is the query I used in my memory demo.  It was written for Denali CTP3; the original buffer pool related column names can be found in sys.dm_os_sys_info for SQL Server 2008:

select
    (m.total_physical_memory_kb / 1024) as 'Physical Memory (MB)',
    convert(decimal(5,1),
        convert(decimal(10,1), m.available_physical_memory_kb / 1024)
        / convert(decimal(10,1), m.total_physical_memory_kb / 1024)
        * 100) as 'Available Memory as % of Physical Memory'
from sys.dm_os_sys_memory m


select
    convert(decimal(5,1),
        convert(decimal(10,1), i.committed_target_kb / 1024)
        / convert(decimal(10,1), m.total_physical_memory_kb / 1024)
        * 100) as 'Committed Target as % of Physical Memory',
    convert(decimal(5,1),
        convert(decimal(10,1), i.committed_kb / 1024)
        / convert(decimal(10,1), m.total_physical_memory_kb / 1024)
        * 100) as 'Total Committed as % of Physical Memory'
from sys.dm_os_sys_memory m, sys.dm_os_sys_info i

Making SQLNexus work smarter with Enterprise Edition
19 September 11 09:31 AM | GavinPayneUK | with no comments

For those not familiar with it, SQLNexus is a free graphical analysis tool from Microsoft which analyses SQL Server traces captured using the Perfstats set of scripts.  They’re all free and available from Codeplex (link) and, most importantly, they use diagnostic components built into the SQL Server installation.  As a SQL Server consultant, Perfstats and SQLNexus are invaluable to me when I want to find the most expensive queries executed on a system, by execution count, IO or CPU time, and look at the execution plans which were being used.

The SQLNexus application uses a SQL Server database into which it initially imports the trace files and stores aggregated data about every statement captured in the trace files.  The application then uses a reporting control to graphically show summaries of the imported data. 

Page Compression

Anyway, the purpose of my post is something I found last week which positively surprised me: SQLNexus not only detects which version of SQL Server it’s using for its data store, but also which edition.  I know this because, by accident, I found 65 tables in a SQLNexus database which have page compression enabled.  Page compression, in case some are unaware, is only available in the Enterprise, Developer and Evaluation editions of SQL Server 2008 and R2.  The benefit of the feature comes from paying an initial CPU cost to compress the data in return for physically smaller amounts of data to write to, and then read from, physical disk.  When you consider how large some SQLNexus databases get, some of mine are 8GB+, the IO saving page compression brings is a big performance boost when analysing the data.
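
If you’re curious whether your own SQLNexus database has been given the same treatment, a query along these lines will show it.  It’s a sketch which assumes the database is called SQLNexus; change the USE statement to match your own import database.

use SQLNexus;

--List user tables and indexes where page compression is enabled
select object_name(p.object_id) as table_name,
       p.index_id,
       p.data_compression_desc,
       p.rows
from sys.partitions p
join sys.objects o on o.object_id = p.object_id
where o.type = 'U'
  and p.data_compression_desc = 'PAGE'
order by table_name, p.index_id;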

In the database where I found it enabled the space saving wasn’t amazing, approximately 20%, but I only had 8,498 rows.  In previous databases I’ve had SQLNexus create I’ve seen over a million rows, so knowing there’s some compression optimisation happening in the background is handy.

Mending the master database
19 August 11 02:11 PM | GavinPayneUK | with no comments

Having a corrupt master database in SQL Server could be said to be one of the worst issues a database server instance could ever have.

What makes protecting against master database corruption more difficult is that none of the regular high-availability features available in SQL Server, such as database mirroring, replication or log shipping, can be used with it.  The only protection available is to back the database up regularly.

Backing up the master database

How you back that database up depends on your team’s skillset.  While SQL Server professionals would probably prefer to use native T-SQL commands and flat files, if your infrastructure team only know about Backup Exec and would be responsible for restores at 2am then the decision might be made for you.

Below are links to the two backup methods mentioned above:

  • Ola’s T-SQL housekeeping and backup scripts here
  • Symantec’s Backup Exec SQL Server agent here
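
If you go the native T-SQL route, backing up master needs nothing special.  Below is a minimal sketch; the backup path is only an example, so swap in your own share or local folder.

--Full backup of the master database (the path is an example only)
backup database master
to disk = '\\backupserver\sql\master_full.bak'
with init, checksum;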

Restoring the master database using T-SQL

If your SQL Server master database is consistent but has invalid configuration settings then restoring it is fairly straightforward.

Using T-SQL and a flat file backup of the database requires SQL Server to be re-started in single-user mode and the database restored; instructions can be found here.
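
As a rough outline of the restore itself (the linked instructions are the authority): once the instance has been restarted in single-user mode and you’ve connected with sqlcmd or a query window, the restore is a single statement, using the same example backup path as above.  SQL Server stops itself as soon as the restore of master completes.

--Restore master while the instance is running in single-user mode (-m)
restore database master
from disk = '\\backupserver\sql\master_full.bak'
with replace;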

If your SQL Server master database is corrupt to the extent that SQL Server won’t start in single-user mode, then you have to rebuild the master database using the original setup tool.  A PSS blog article on how to do that can be found here.  If you’re rebuilding master on a SQL Server 2005 cluster, expect to type one of the longest command line commands known to man; more details here.

Restoring the master database using Backup Exec

If you’re using a tool like Backup Exec then, according to Symantec’s web support, restoring the master database becomes more straightforward.  It has a clever file-system copying method for making a backup of master, meaning if you ever want to restore it, whether because of corruption or invalid settings, all you apparently have to do is stop SQL Server and rename its copies of the master mdf and ldf files to replace those you no longer want.  You then follow some post-rename steps once SQL Server is running again.

An article about this clever feature can be found here.

Recovery not backup

As with all business critical systems, how you back your systems up isn’t what matters; it’s how you test those backups and know how to use them in the event of a recovery which matters.  So don’t believe any of what I’ve written until you’ve tested it for yourself!

Clustered Floppy Disks?
02 August 11 10:19 PM | GavinPayneUK | with no comments

Many years ago we used to use floppy disks, hundreds of the things, each normally holding 1.44MB, and they were all we had to copy data around on.

Fast-forward to 2011 and it’s been a long time since I saw a PC with a 3.5” floppy drive fitted, let alone had someone send me data on one; probably about 11 years is my guess.  So what a surprise it was when, building a cluster with Windows Server 2008 R2 today, I got the following error message:

The wrong diskette is in the drive
Insert %2 (Volume Serial Number: %3) into drive %1.

I got this error in the “List potential cluster disks” section of the Cluster Validation Report (CVR) when I was preparing Windows Failover Clustering.

Some background: both servers in the soon-to-be cluster had several LUNs attached to them using the SAN vendor’s MPIO and Fibre Channel HBA drivers.  Each LUN was presented as a disk to Windows, from where Windows had initialized it as a Simple disk with a GPT partition table.  Each disk had a single partition created and formatted as NTFS.  All very normal.

The issue was escalated to Microsoft and the SAN vendor where the focus of investigation was on the SAN and LUN configuration.

The initial consideration was the “Host operating system” value assigned to each of the hosts within the SAN controller’s configuration; was this set to Windows Server 2008/with Clustering?  Interestingly, while recommended, apparently the actual value makes no difference to the storage’s underlying functionality, but it was still “highly recommended”.

Ultimately, the issue was resolved by clearing the Persistent Reservations for the LUNs assigned to the cluster nodes at the SAN controller level.  I borrow from Symantec the following definition for those not familiar with the term:

“SCSI-3 Persistent Reservations (SCSI-3 PR) are required for I/O fencing and resolve the issues of using SCSI reservations in a clustered SAN environment. SCSI-3 PR enables access for multiple nodes to a device and simultaneously blocks access for other nodes.”  The full link is available here

Essentially, PRs are like write locks on a file: they allow, at a storage communications level, a specific host to reserve full control over a storage item, while its peers can for the time being only achieve a lower level of access to the same storage item.

Clearing the Persistent Reservations allowed the previously failing CVR to succeed, from where I could then build the cluster knowing Microsoft would happily support it.

So in the end, the actual error had nothing to do with floppy disks.  Instead, as a Google search will show, the Win32 error being raised dates back to a time when a wrong diskette really was the most likely cause of the failure being trapped.

Time for new licensing models?
24 July 11 10:41 PM | GavinPayneUK | with no comments

For several years now we’ve become accustomed, in medium and large environments, to using per-Processor licensing models.  The main driver for this is that in the Internet era counting, let alone identifying, our end users has become almost impossible, so assigning each of them a per-User licence is no longer practical.

In the SQL Server world the per-Processor model gave us an “all you can eat” option: as long as the physical CPUs were adequately licensed you could have as many CPU cores, as much server memory and as many end users as you liked, as well as a varying number of instances of SQL Server depending on your edition.

All you can eat

Per-processor licences are expensive, there’s no way of suggesting they’re not, but these days the amount of server resource we can use with a single processor licence is massive compared to what it was just two or three years ago; I regularly work with servers with 24 or 32 logical CPUs and 64GB of memory, and these are considered not quite but almost entry-level servers.  With Intel and AMD’s CPUs now coming with 6, 8 or 10 cores, plus Hyper-Threading, that’s quite a few logical CPUs.

Microsoft are different to some of their competitors in that they only require us to licence the physical sockets in use, whereas other database vendors are also interested in the number of cores in use, often pushing the licensing price up a lot.

VMware’s new licensing model

VMware, the virtualisation software company, probably has its software deployed on servers as large as, or nearly as large as, those SQL Server runs on, and almost certainly in greater numbers.  As a result their software previously did have CPU core limits, although to be fair the server hardware of the time meant these limits weren’t often reached.

In recent weeks VMware have released version 5 of their premier virtualisation platform, vSphere.  Along with a range of new features for medium and large enterprises, they’ve also released a new licensing model.  Now not only is CPU resource a licence-constrained commodity, but so is memory.

VMware have introduced the concept of vRAM, a total amount of memory which can be assigned to virtual machines at any point in time. Now as you buy more licences not only does the number of CPU cores you can use increase but so does the amount of memory you can use.

The future of licensing?

For now, VMware’s licensing model change is being seen as “bad” because it’s different, and it will mean some users potentially having to pay more to use the server hardware they’ve only just installed vSphere 4.1 on.  There was a similar response when SQL Server 2008 R2 Enterprise Edition dropped support for an unlimited number of virtualised instances.  Was that the first time we saw Microsoft’s reaction to “all you can eat” licensing on modern server hardware?

The reality is that, from the vendor’s perspective, something has to be done to maintain their revenues as server capabilities increase at the rate they have done in recent years.  Microsoft have so far been able to afford an “all you can eat” per-Processor model, albeit with a change in the definition of unlimited, but it will be interesting to see how the next generation of on-premise licensing models adapt to the multi-core, multi-GB memory servers we’re regularly deploying today.

For more information on the recent changes to the VMware licensing model please see the link here.

Ultra-high CPU Usage
06 July 11 10:45 PM | GavinPayneUK | 1 comment(s)

In the last couple of weeks I've seen a few database servers where sqlserver.exe was almost constantly at 100% CPU usage; usually they were servers I’d never seen before, which didn’t help!

If you've had your environment designed or tuned by the SQLCat team then you’d probably be happy that you were getting every ounce of capability out of your hardware.  The reality for most people is that 100% CPU usage is bad and indicates something somewhere is wrong; at the same time your users will probably all have stopped working as well.


Sometimes the initial reaction is to either find/buy more CPUs or, even more tactically, re-start SQL Server.  Neither of these is a long-term solution: one just means more CPU resource will get burnt when the problem re-occurs, the other is a very short-term fix giving perhaps a few minutes’ grace until the problem re-occurs.

 

Where to begin looking

The first place I look when I see a system in an ultra-high CPU state is outside of SQL Server, at what Windows and the hardware are doing.  I perform a quick check of some vital system stats:

  • Windows Pages/sec, % Pagefile utilisation and Available Memory
  • Disk latencies, IOPS and KB/sec
  • NIC % utilisation

Sometimes I’ll also look at % Privileged Time, if only to tell me whether to check out anti-virus, DoubleTake or other system software.

Then I’ll look at the SQL Server errorlog.  This was where I found evidence of a recent cause of ultra-high CPU usage:

SQL Server’s memory being paged

On one server the backup software had a memory leak and was using over 3GB of memory.  This caused Windows to page out everything and anything it could find, including SQL Server’s memory.  Events were written to the SQL Server errorlog saying that “A significant part of sql server process memory has been paged out” and at the same time the % pagefile utilisation counter was up at 60%.  Stopping the backup service freed up over 3GB of memory and almost immediately the CPU usage by sqlserver.exe dropped from 100% to…10%.
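
If you’d rather confirm it from inside SQL Server than by scrolling the errorlog, sys.dm_os_process_memory (SQL Server 2008 and later) is worth a look; a minimal sketch is below, and a climbing page fault count alongside a shrinking physical memory figure tells the same paging story.

--How much physical memory the SQL Server process currently holds,
--and whether Windows has signalled low-memory conditions to it
select physical_memory_in_use_kb / 1024 as physical_memory_in_use_mb,
       memory_utilization_percentage,
       page_fault_count,
       process_physical_memory_low,
       process_virtual_memory_low
from sys.dm_os_process_memory;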

 

The second cause I’ve seen a couple of times recently again had very little to do with CPU cycles from the outside, but it is an area which we all know can have a huge impact on SQL Server’s performance:

Missing Indexes

After the two checks I mentioned above, the next thing I do is open Activity Monitor within Management Studio.  In an ideal world I’d use Perfstats to perform a much more detailed capture and then analyse it in SQLNexus, but usually by this time you start hearing comments like “the system is down” and “no one can work”.

Within 60 seconds you can start to see trends in the most expensive queries, whether ordered by CPU, executions or reads.  With those queries I then get their execution plans into SQL Sentry’s Plan Explorer.
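
If Management Studio itself is fighting for CPU, the same trends can be pulled straight from the plan cache with a query like the sketch below; swap the order by for total_logical_reads or execution_count depending on what you’re chasing.  The usual caveat applies: it only covers plans still in cache.

--Most expensive statements currently in the plan cache, by total CPU time
select top 10
       qs.execution_count,
       qs.total_worker_time / 1000 as total_cpu_ms,
       qs.total_logical_reads,
       substring(st.text, (qs.statement_start_offset / 2) + 1,
           ((case qs.statement_end_offset
                 when -1 then datalength(st.text)
                 else qs.statement_end_offset
             end - qs.statement_start_offset) / 2) + 1) as statement_text
from sys.dm_exec_query_stats qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) as st
order by qs.total_worker_time desc;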

Plan Explorer does what I need it to do for me in a hurry, it finds and highlights the most expensive operations in those queries and at this point I will often find:

  • Missing clustered indexes
  • Key/RID lookups
  • Table and index scans

In some situations the creation of four or five carefully selected “emergency” indexes based on the observations from Plan Explorer can drop the CPU usage down from 100% to single-digit values.  These can include turning heaps into clustered indexes to reduce locking and latch contention, implementing covering indexes to stop lookups, or converting expensive table scans into index seeks.

If that doesn’t help then the next area I check is:

Statistics and Index Housekeeping

As we know, SQL Server’s query optimiser loves having up-to-date statistics, while its IO processes hate logical or physical fragmentation.  On some systems there is only weekly re-indexing, and a database with missing clustered indexes or heavy page split rates can become heavily fragmented in a matter of hours, so after a few days performance can really struggle.  In one case a check of the stats data using DBCC showed that running sp_updatestats was well overdue; fortunately, 90 seconds later it had brought the CPU utilisation down from 100% to something much lower.  Re-indexing on a production system isn’t something I’d normally want to do, but again, if the data shows it needs running and the system is already “dead” to the business users, it might be a decision you take.
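
A quick way to gauge how overdue sp_updatestats is before running it is to check the age of the statistics themselves; the sketch below lists statistics on user tables that haven’t been updated for a week (DBCC SHOW_STATISTICS gives the same date one object at a time).

--Statistics on user tables that haven't been updated in the last week
select object_name(s.object_id) as table_name,
       s.name as stats_name,
       stats_date(s.object_id, s.stats_id) as last_updated
from sys.stats s
join sys.objects o on o.object_id = s.object_id
where o.type = 'U'
  and stats_date(s.object_id, s.stats_id) < dateadd(day, -7, getdate())
order by last_updated;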

 

Your Mileage Will Vary Along With Your Troubleshooting Methods

The observations and solutions I’ve written about above were point-in-time situations and remedies which I may never see again; however, they are useful checks for the next time I see a poorly performing system.  They’re not the only causes of ultra-high CPU usage, and my remedies weren’t the only solutions to the issues we faced.  There are also much better ways to more accurately diagnose high CPU usage, but if sharing my experiences means someone doing a Google search in a hurry finds some benefit from them, I’ll feel this post was worthwhile.

Copying files–Slowly and Responsibly
16 May 11 09:18 PM | GavinPayneUK | with no comments

I’ve been working with some clients recently who needed to copy large database backups between environments and learnt a couple of things I’ll now share.

Copying Files Slowly

In one instance we had to copy the backup files over a shared network link, so we wanted to stop the copy monopolising the limited bandwidth.  Our first attempt was to use normal xcopy out of hours when the link wasn’t in use, and while the copy used most of the bandwidth as we expected it would, it was also affecting the performance of the server itself.  The server’s 100Mb NIC utilisation was pegged at 98% and the ASYNC_NETWORK_IO wait stat count in SQL Server was climbing by the second, making the database server’s responsiveness to their web application poor.  As a result, that option was ruled out.

We then considered robocopy, knowing it has an Inter-Packet Gap setting which controls how it uses network bandwidth.  Although this parameter had some influence over the amount of bandwidth robocopy used, it proved difficult to cap it at the fairly generous share of bandwidth we wanted.  Reading around, it seems the option was designed for <1Mb WAN links rather than the 100Mb backbone we were using.  In the interest of time I had to search for another option, but the IPG setting is something I want to look into further as robocopy is good for scripting copies.

In the end the solution came from a Google search that found a small, simple and free GUI-based tool called SuperCopier (link).  Amongst its copy options is a slider which allows you to accurately dictate how much bandwidth it uses.  While you have to remember to convert MB into Mb, the bandwidth option did what it said it would do and for us limited the copy to 80Mb/s.

[Image: SuperCopier’s bandwidth limiting slider]

 

Copying Files Responsibly

Another requirement I had was to copy files as quickly as possible from the server’s SAN storage to a portable USB drive and as part of some research into the best tool for the job the client found this MSDN blog article (link) about copying large files over slow links.

This article was written from the angle of making the copy work as fast as possible but also discussed an area I’d never read about before – un-buffered I/O operations.

When Windows copies files normally it uses the system cache, on the assumption that the file is probably going to be read again soon, so caching it is a good move.  This can be really annoying if we know the data is never going to be read again and we’d rather keep the memory for something else, like SQL Server.

As an example, I copied an 8GB file around my laptop and you can see the effect it had on system memory during the copy:

[Images: system memory usage before and during the buffered copy]


When I’m copying large files to or from a production database server, the last thing I want to do is cause large amounts of memory re-allocation or memory pressure because of that seemingly low-priority copy.  The MSDN article the client found discussed a new XCOPY parameter in Windows Server 2008 R2 – /J.  Using the /J parameter performs the file copy using the un-buffered Windows file copy APIs.  As a result, the file bypasses the Windows caches and the system cache doesn’t have the potential to grow.

When I did the same copy as above but using the /J parameter things looked very different:

[Images: system memory usage during the same copy with the /J parameter]

Incidentally, for the file I was copying the copy took the same amount of time, although I’ve yet to test larger files to see if the un-buffered route degrades the copy time.

 

What I’m not saying is that copying large files using the standard buffered I/O APIs is bad; Windows and SQL Server have good memory management processes to stop such things typically being a threat to SQL Server’s performance.  What I do like, though, is knowing I have the ability to perform large file copies while removing the risk of Windows deciding to bloat the system cache unnecessarily.
