NTFS Fragmentation - Bad for Databases?



So what has this to do with the DBA, you might say? Well, I mostly make my living from tuning databases and applications. I always start with a hardware audit, as it makes little sense to spend lots of money on a consultant when a simple hardware upgrade might resolve the problems. It is very important to realise that many SQL Server implementations do not follow best practice; this can be for any number of reasons, many of which are beyond the control of the DBA. For instance, I have been faced with servers with six disk slots and no external storage - whichever way I decide to go is going to be a compromise. It's very easy to be the purist, but that doesn't help in the real world. After showing all that can go wrong, I'll make a few suggestions as to how you may be able to mitigate some of the negative effects of fragmentation. Remember: don't bring problems, bring solutions!

A quick check with Disk Defragmenter (available via My Computer > Manage) will give a quick indication of anything untoward. You'll hopefully see something like this:

[screenshot: Disk Defragmenter analysis of a healthy, largely contiguous volume]
However, you might see this:

[screenshot: Disk Defragmenter analysis of a heavily fragmented volume]
So far so good: there's 40% free space, so a defrag should be fine... except for this:

[screenshot: defragmentation report listing files that could not be defragmented]
What has happened is that the fragmentation is so interwoven that there is no contiguous free space in which to defragment files - and that's with 65% free space!

[screenshot: analysis showing fragmented files interleaved throughout the free space]
And sometimes you just can’t get it to run!


Database files are very prone to fragmentation due to auto grow, and the more databases on your server, the more likely the problem. As database files occupy a greater percentage of the available disk space, defragmentation becomes more critical: it is quite possible to have adequate free space overall but no contiguous block that is large enough. (If you have an 80Gb file which is fragmented, you need 80Gb of contiguous free space for a fast and effective defrag.) Fragmentation can also hit the MFT (Master File Table); this is a system file, so it will not be defragmented, and a badly fragmented MFT really gives you little option other than to reformat your disk. (There are some third-party tools which will deal with MFT fragmentation, I believe.)
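To spot the databases most likely to be causing this, it's worth checking which files are set to grow in small or percentage-based increments. A sketch along these lines works on SQL Server 2005 and later, where sys.master_files is available (on SQL 2000 you'd look at master.dbo.sysaltfiles instead):

```sql
-- List every database file with its current size and growth increment.
-- size and growth are stored in 8Kb pages, hence the * 8 / 1024.
SELECT DB_NAME(database_id)             AS database_name,
       name                             AS logical_file_name,
       physical_name,
       size * 8 / 1024                  AS size_mb,
       CASE WHEN is_percent_growth = 1
            THEN CAST(growth AS varchar(10)) + ' percent'
            ELSE CAST(growth * 8 / 1024 AS varchar(10)) + ' Mb'
       END                              AS growth_increment
FROM   sys.master_files
ORDER  BY database_id, file_id;
```

Files showing a growth increment of 1Mb or 10 percent are the prime candidates for producing hundreds of small, scattered extents on disk.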










Bad file fragmentation











Fragmented MFT


So, having established that even with regular maintenance you can still see severe fragmentation, what steps can you take to lessen the effects?

  1. Make sure that auto shrink is not enabled on any of your databases.
  2. Make sure you're not running any regular shrink file jobs - this is no different from a controlled auto shrink.
  3. Set the growth of your database files to large amounts - if your database is 10Gb, make the growth step 1Gb. As DBA you need to be in control of growth. Back in the days of SQL 6.x there was no auto grow; growth had to be pre-allocated, and pre-allocated growth lets you control fragmentation much better. Better one 1Gb growth than a hundred 10Mb growths.
  4. Set up your system databases, including tempdb, the same way - ideally you want no dynamic growth of these databases at all. I usually set master and msdb to an initial 25Mb with a 10Mb log and growth of 10Mb. The actual values will depend upon your usage of these databases. I have a custom stored procedure which clears old data from msdb, which helps maintain size - after all, do you really want 5 years of database backup history?
  5. You might want to consider migrating your larger databases to multiple files (not filegroups). It is easier to defragment eight 10Gb files than one 80Gb file.
  6. Use filegroups to separate static tables from dynamic/growing tables - you might also want to use multiple files within your filegroups, again on the basis that smaller files are easier to work with.
  7. Add more memory. I read so many posts concerning database segmentation to avoid "hot spots", but ultimately the easiest way to avoid hot spots is to put the data in cache. Consider a 100Gb database on a server with minimal memory, say 3Gb: at best the data cache will be about 1.25Gb. The actual value will vary, but whichever way you slice and dice your memory, it supports no more than around 2% of your data. Taking the memory to, say, 32Gb gives you roughly a 30% data cache. Does it work? Well, I upgraded an 8-way box from 8Gb to 32Gb: CPU usage dropped by 75%, actual throughput (transactions etc.) jumped by around 50%, and i/o was turned around, with minimal read activity compared with before. As with all changes to server hardware, you may well introduce a new bottleneck!
  8. If possible, replace your disks with larger ones; having more free space makes defragmentation easier. It used to be recommended to leave at least 25% free space on a disk/array; I personally think that for databases this figure should be higher, but again "it just depends". If your databases don't grow then it's not so much of a problem - although if they don't grow you won't get fragmentation of the database files, only of the backups.
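Points 1, 3 and 4 above can be put into practice along these lines. This is a sketch, not a production script: [MyDatabase], MyDatabase_Data and MyDatabase_Log are hypothetical names you'd replace with your own; sp_delete_backuphistory is the standard msdb procedure for trimming backup history.

```sql
-- Point 1: make sure auto shrink is off.
ALTER DATABASE [MyDatabase] SET AUTO_SHRINK OFF;

-- Point 3: grow in large fixed steps (here 1Gb for data, 256Mb for log)
-- rather than small or percentage increments, so NTFS allocates fewer,
-- larger runs and the files fragment far less.
ALTER DATABASE [MyDatabase]
    MODIFY FILE (NAME = MyDatabase_Data, FILEGROWTH = 1024MB);
ALTER DATABASE [MyDatabase]
    MODIFY FILE (NAME = MyDatabase_Log, FILEGROWTH = 256MB);

-- Point 4: keep msdb small by clearing out old backup history
-- (here anything older than six months).
DECLARE @cutoff datetime;
SET @cutoff = DATEADD(month, -6, GETDATE());
EXEC msdb.dbo.sp_delete_backuphistory @oldest_date = @cutoff;
```

Scheduling the history cleanup as a regular job is what keeps msdb at a stable size rather than letting it auto grow in drips.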


It worries me when I read about virtualisation allowing better than 85% disk utilisation as a selling point - that leaves so little space for defragmentation!

Now, I've touched on a whole range of subject areas, which if nothing else should provoke a healthy discussion. So, to finish, a quick explanation of how and why you can offset the degraded performance caused by fragmentation even when you cannot remove the fragmentation itself.

Most database reads use read-ahead, which is sequential; most database writes are random. Sequential reads on fragmented files really hurt performance, with the disk heads jumping all over the place - somewhat like visiting every shop in a mall at random instead of visiting each in sequence (you'll walk less!). Disk controllers attempt to turn random writes into sequential writes through caching and elevator sorting, but generally writes remain random, so technically file fragmentation does not affect disk writes so intensely. The exception, of course, is log writes, which are always sequential. An examination of wait stats will tell you whether your log writes are causing performance problems. I don't have a quick fix for log writes; technically more memory should help, and yes, log files should be on dedicated drives, but life isn't always like that.

We're really talking about OLTP databases here, of course; DSS systems have far less random i/o, so the effect of fragmentation upon a DSS system is likely to be even worse.

There are a couple of other points to consider. Carving an array into separate logical disks only partitions the data: if a single partition on that array becomes fragmented, it most likely impacts all the logical drives, even those which are not themselves fragmented. I'd really advise against carving physical disks into logical disks; it doesn't improve performance (i/o and throughput for an array/disk remain the same regardless of how the disks are carved - this is a prime area of performance problems with SANs, but that is most certainly out of scope for this article). Finally, the SQL Server and SQL Agent logs fragment badly by default; separating those logs away from the data will help, as will cycling your error log with sp_cycle_errorlog before a defrag.
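As a starting point for the wait-stats check mentioned above, something like this will show whether WRITELOG waits (log write stalls) feature prominently. It's a sketch assuming SQL Server 2005 or later, where sys.dm_os_wait_stats is available; the excluded wait types are just a few common idle waits, not an exhaustive filter.

```sql
-- Top cumulative waits since the last service restart; if WRITELOG sits
-- near the top, log writes are a likely performance problem.
SELECT TOP 10
       wait_type,
       wait_time_ms,
       waiting_tasks_count
FROM   sys.dm_os_wait_stats
WHERE  wait_type NOT IN ('SLEEP_TASK', 'LAZYWRITER_SLEEP',
                         'SQLTRACE_BUFFER_FLUSH', 'BROKER_TASK_STOP')
ORDER  BY wait_time_ms DESC;

-- Close the current SQL Server error log and start a fresh one before
-- a defrag, so the active log file isn't pinned open and fragmenting.
EXEC sp_cycle_errorlog;
```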

Published Wednesday, December 27, 2006 11:06 AM by GrumpyOldDBA


# SAN Disk Performance and configuration

Friday, February 2, 2007 1:22 PM by lawrenso

Hi All, Well here I am - my first blog . It was a post from GrumpyOldDBA that decided me to join and
