October 2007 - Posts

What's the "cost"?
14 October 07 06:15 PM | Christian | 1 comment(s)

SQL Server uses a cost-based based optimizer which means that it will assign a "cost" to each execution plan and then choose the plan with the lowest cost.  You can see evidence of this when you look inside an execution plan and see things like "CPU cost" and "subtree cost". 

If you've been around SQL Server for a while you'll know that this cost doesn't translate into seconds or anything meaningful and is just an arbitrary number used to assign a value representing the resource cost.  You might even know that the "cost" value was benchmarked on a machine at Microsoft early in SQL Server's life (probably 7.0) but have you ever wondered what this historic supercomputer looked like?  Wonder no longer, I've attached a picture of it.  Have a look, it'll make you smile :)

I was on SQL Ranger training (now MCA: Database) in Redmond back in May and this question of cost came up at one of the sessions that Lubor Kollar was running.  For those of you that don't know Lubor he owned the development of the SQL Server optimizer for many years and now works in the Customer Advisory Team.  He explained that many years ago a guy worked for him called Nick and it was Nick's job to benchmark query times for the optimizer team.  Those benchmarks became the basis for the cost-based optimizer.

Lubor said that they looked into changing it but it had so many dependancies within SQL Server that it was prohibitively expensive to change.  Nothing would really be gained from changing it anyway as they'd be replacing what is now an arbitrary value for another arbitrary value.  So that is why we still have it today and why you'll see cost values with lots of decimal places; what took 1 sec on the benchmark machine runs a lot faster on your more modern machine.

Next time your looking at a cost in an execution plan spare thought for Nick and his historic machine, it might even make it all seem less complicated now you know its provenance!









Disk Sector Alignment
09 October 07 07:30 PM | Christian | 16 comment(s)

I’ve just finished 3 chapters for an upcoming Wrox book called Professional SQL Server 2005 Performance Tuning: http://www.amazon.co.uk/Professional-Server-2005-Performance-Tuning/dp/0470176393/ref=sr_1_11/202-3946698-6401428?ie=UTF8&s=books&qid=1191700832&sr=8-11

I joined the author team late so my picture’s not on the cover pic yet.  I know its vain but I can’t wait J

Anyway, what I wanted to post on was a section I wrote on Disk Sector Alignment.  It seems to be one of those things that you've either been doing for years or you've never heard of.  When I left Microsoft some of the guys were trying to get something published “officially” but I don’t know how much interest they managed to get.  I've been meaning to blog on it for ages...

Sector Alignment

Sector Alignment is a little known performance optimization tip that documented for Exchange here: http://technet.microsoft.com/en-us/library/aa998219.aspx (but not very well described) and is equally valid for SQL Server.  There are a few blog postings that try to measure the performance difference for SQL Server but why it can cause a problem is not detailed very well. It should be performed on any NTFS partition that is created on a RAID array to “avoid misalignment with stripe units and caching which can cause performance degradation”. Let’s look in a bit more detail about what that actually means.

First of all, some disk terminology. What you’re interested in for the purpose of explaining sector alignment are sectors, tracks and blocks.

*        A sector is the smallest unit of storage space on disk and is typically 512 bytes.

*        A track is the circular path of sectors that sit around a single circumference of the disk. There are 63 sectors per track numbered 1-63 on each track.

*        A block is how you usually refer to sectors to make things easier for you. You start at the beginning of the disk and increment until the end of the disk. Blocks start from 0.

Partitions always start at the first sector on a track, so for the first partition that means Track 2 Sector 1 (because the MBR is on Track 1), which equates to Block number 63 (blocks start from 0). Misalignment occurs because the storage vendors define tracks differently. On an EMC Symmetrix, for example, a track is considered to be 64 blocks and the cache is also based on this.

A Symmetrix DMX RAID5 array uses a stripe size of 4 tracks, which totals 256 blocks. 64KB is the largest single write that Windows can make, so using the default partition location means that the first 64Kb write will be fine but the second will straddle 2 “stripes” causing both to be updated.

If you align the partition to 128 blocks (or 64KB as each block is 512bytes) then you don’t cross a track boundary and therefore issue the minimum number of I/Os.

There are 2 methods of aligning your partition and you’ll be pleased to know that it’s quite a simple process. Diskpar is a tool that’s been available for a while to enable you to align partitions but as of Windows 2003 SP1 the preferred method is Diskpart (note the extra ‘t’). The main difference between the two methods is that you specify the alignment value in blocks using Diskpar and in KB using Diskpart.









Filed under:
How to corrupt a page in a specific index and then fix it
07 October 07 10:19 AM | Christian | 1 comment(s)

I was developing course materials for a workshop recently and I wanted to run a lab where you DBCC’d a database, found an error and then worked out how to fix it.  The only problem was deliberately corrupting a page in a specific index.  I did some investigating and thought I’d share my findings:


If you just need a broken database to play with then you can download one from Paul Randal’s old blog here: http://blogs.msdn.com/sqlserverstorageengine/archive/2007/04/17/example-corrupt-database-to-play-with.aspx


Tony Rogerson has a good blog here on how to corrupt data without using a Hex Editor:



My needs however where a bit different because I needed to corrupt a page in a specific non-clustered index in a database that my customer gave me to develop the workshop with.  I’ll use AdventureWorks to illustrate...


First of all you need a Hex Editor because you’re going to change some hex values in the mdf file of the database.  I used Hex Workshop which you can download from http://www.hexworkshop.com and use free for 35 days.  If you stop SQL Server and load AdventureWorks_Data.mdf into the editor you’ll see a stream of hex and no easy way to see what’s what.  Close the file and restart SQL Server.


Now change the PAGE_VERIFY option to NONE so that the checksum being different doesn’t intercept with an error before we can look at the DBCC results.




You’re going to be corrupting the AK_CustomerAddress_rowguid non-clustered index on the Sales.CustomerAddress table so the first thing we need is a page number that contains data for that index.  I found the solution on page 218 of Kalen Delaney’s Inside SQL Server 2005 Storage Engine book.


SELECT object_name(object_id) AS name, index_id,rows, type_desc AS page_type_desc, total_pages AS pages, first_page

FROM sys.partitions p JOIN sys.system_internals_allocation_units a

ON p.partition_id = a.container_id

WHERE object_id=object_id('sales.customeraddress')


This gives you:


name             index_id rows  page_type_desc pages first_page

---------------- -------- ----- -------------- ----- --------------

CustomerAddress         1 19220    IN_ROW_DATA   121 0xA02800000100

CustomerAddress         2 19220    IN_ROW_DATA    81 0x381200000100


There is only a clustered index and the non-clustered index on the table so the second row is the one you want.  Now back to Kalen’s book to translate the hex into a more familiar page reference:


create function [dbo].[covert_page_nums] (@page_num binary(6))

RETURNS varchar(11)



 RETURN(convert(varchar(2), (convert(int, substring(@page_num, 6, 1)) * power(2, 8)) +

  (convert(int, substring(@page_num, 5, 1)))) + ':' +


 (convert(int, substring(@page_num, 4, 1)) * power(2, 24)) +

 (convert(int, substring(@page_num, 3, 1)) * power(2, 16)) +

 (convert(int, substring(@page_num, 2, 1)) * power(2, 8)) +

 (convert(int, substring(@page_num, 1, 1)))) )



SELECT dbo.convert_page_nums(0x381200000100)


Gives you the starting page of the index as 1:4664


Now that you know where the index is you can use DBCC PAGE (described here: https://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/10/625659.aspx ) to see what’s on it.  You know from the previous scripts that the index covers 81 pages so add 1 to the starting page to put you into the index tree.


DBCC TRACEON (3604, -1)


DBCC PAGE (AdventureWorks, 1, 4665, 1)


The important output here  is the content of Slot 0 (the first row on the page) and Slot 298 (the last row on the page):


Slot 0, Offset 0x60, Length 25, DumpStyle BYTE


Record Type = INDEX_RECORD           Record Attributes =                 

Memory Dump @0x61C4C060


00000000:   06852aa3 b42f26c8 4988b004 3267c944 †..*../&.I...2g.D        

00000010:   caae2b00 004e2f00 00†††††††††††††††††..+..N/..                



Slot 298, Offset 0x1d7a, Length 25, DumpStyle BYTE


Record Type = INDEX_RECORD           Record Attributes =                 

Memory Dump @0x61C4DD7A


00000000:   0613f0ce 88184df0 4d8b5508 3d90ba87 †......M.M.U.=...        

00000010:   e71e5800 008f3c00 00†††††††††††††††††..X...<..    


Now you can stop SQL Server and load the mdf file into the Hex Editor.


Search for 06852aa3.  The first result takes you to offset 38215776 (decimal) and the subsequent hex values exactly match what you know of the contents of the first row from the DBCC PAGE results.  Continuing the search reveals no other exact pattern matches.


Search for 0613f0ce.  The first result takes you to 38223226 (decimal) for another exact match and there are no other instances of value.


You now have the beginning and end of the page you want to corrupt and all that’s left to do is to overwrite the data in between with zero’s.  In Hex Workshop you can highlight the data and press <Ctrl-I> to overwrite it all.  Save the file and restart SQL Server.


Now run DBCC CHECKDB (AdventureWorks) and you’ll see:


Msg 8928, Level 16, State 1, Line 1

Object ID 725577623, index ID 2, partition ID 72057594048872448, alloc unit ID 72057594054311936 (type In-row data): Page (1:4665) could not be processed.  See other errors for details.

Msg 8941, Level 16, State 55, Line 1

Table error: Object ID 725577623, index ID 2, partition ID 72057594048872448, alloc unit ID 72057594054311936 (type In-row data), page (1:4665). Test (dbccAuditRecSucceeded == TRUE) failed. Slot 0, offset 0x60 is invalid.


Horray, mission accomplished ! J You can even back it up and add the restore into your demo/lab steps.


Imagine this was your production database, what would you do? Restore? DBCC CHECKDB REPAIR_REBUILD? They’d both work but they’re unnecessary overkill in this situation.  Find out what the index is first:


SELECT object_name(object_id)AS 'table',name,index_id,type_desc FROM sys.indexes

WHERE object_id = '725577623'

AND index_id = 2




table             name                         index_id type_desc

----------------- ---------------------------- -------- ------------

CustomerAddress   AK_CustomerAddress_rowguid          2 NONCLUSTERED


Its a nonclustered index so we can rebuild it!


ALTER INDEX AK_CustomerAddress_rowguid ON Sales.CustomerAddress REBUILD


And now CHECKDB runs without error. Hooray, you’ve just saved the day!








Christian Bolton


Filed under: ,
My first post
06 October 07 08:47 PM | Christian | 2 comment(s)

Hello, my name is Christian Bolton and I’ve spent the last five years working as a Premier Field Engineer for SQL Server at Microsoft UK.  I’m now a Director and Database Architect for a Microsoft Partner called Coeo (http://coeo.com) who specialise in deep-dive SQL Server engagements.  My specialist areas are High-Availability, Scalability, and Performance Tuning.  I’ve finally got around to setting up a blog and I’m going to be posting on my experiences working with SQL Server at some of Microsoft’s largest customers as well some interesting work I’ve got lined up in my new role at Coeo.  I hope you find it useful!