18 July 2011 07:38 tonyrogerson

Use GUID’s (uniqueidentifier) for keys when storing data on SSD’s (Part 1 - SSD Backgrounder)

Using GUID’s (uniqueidentifier) for surrogate keys has always and still is frowned on by database designers and optimisation consultants alike and rightly so, the fragmentation and randomness of how rows are stored and accessed can cripple performance with traditional storage systems.

People on all sides are acutely aware of the benefits GUID’s bring for distributed processing whereby the middle tier is contained on many servers or the database is spread across many physical machines.

Solid State Drives have been a game changer, they remove problems associated with traditional rotational disk technology because they not only offer uniform seek times across the storage but also the read latency for fully random reads is consistent and extremely low, they do have a weakness – cells burn out on write intensive workloads, Multiple Layer Cells (MLC) this around 10K cycles, for Single Layer Cells (SLC) this around 100K cycles and for the newer Phase Change Memory (PCM) this around 100M (http://www.fastestssd.com/articles/pcm-based-storage-a-game-changer/).

This article seeks to demonstrate that we as database designers and optimisation specialists need to adjust our approach and realise SSD’s provide a unique opportunity to scale out as well as up the database and offer less complicated database implementations especially as now like never before there is a greater need to spread data across many physical machines. The article is split into two parts, the first gives background for SSD, and the second the actual benchmarks.

What is really important IO size or query duration?

We should only be concerned with query duration, queries should return in an acceptable duration depending on the query being performed, user expectation and number of concurrent queries executing. Query speed is the outcome from a number of parameters, the rather simplistic formula goes something like queryduration = ( iolatency + iothroughput + cputime + blocking )* concurrentusers.

I’ve presented a number of times now on the commodity “reporting brick” I’ve built around an AMD Phenom 1090T, 16GB RAM and 2 x 240GB OCZ IBIS drives in a RAID 0, I’ve shown how all 6 cores are needed and top out in order to get the full potential out of the two IBIS drives, when limited to 1 or 2 cores instead of 1.3GBytes per second of throughput, the throughput drops off to a pitiful (though good still) 130-260Mbytes per second.

SO – are we really bothered about the IO subsystem when it can outperform the CPU cores?

Query speed against data held on rotational disk is not consistent; it is subject to the non-uniform seek latency that is dramatically affected if the data is fragmented or held randomly across the sectors on the disk.

Let’s have a look at the hardware that relates to performance that is a) the IO sub-system for getting the data off and b) the system RAM for the SQL Server buffer pool.

Hardware

Costs and Classification of Solid State Drives

SSD’s can be classified into commodity or enterprise; SATA attached or PCI attached.

Commodity SSD tends to be the cheaper MLC which has a lower write life, MLC is also found in some enterprise SSDs but SLC is predominate because of its higher write life. Cell burn issues and therefore the life of the drive extended by a number of techniques – in the enterprise space they are simply a lot better and there is also more redundancy on the stated drive size.

Costs of SSD can vary dramatically, for enterprise PCI based SSD is a must because of the reduced latency and the hardware is specifically designed around SSD rather than having to use a legacy protocol such as SATA and drivers there of which is designed around rotational disk technology and the problems associated with it for performance optimisation.

The OCZ Z-Drive product line (http://www.ocztechnology.com/ocz-z-drive-r3-p84-pci-express-ssd.html) is currently the best price per Gigabyte; you can buy a 1TB card for £4K (http://www.amazon.co.uk/OCZ-Z-Drive-PCI-Express-SSD-p84/dp/B003EEMIH2/ref=sr_1_4?ie=UTF8&qid=1310737655&sr=8-4) but hold out and wait for the R3 which is due shortly; a single card will give you 1GBytes per second throughput and capable of 135K IOs/second – the read latency will also be a couple of milliseconds at those levels as well – that’s fully random IO!

Just a thought to throw in the pot: consider that when you overwrite a piece of data on the SSD you aren’t necessarily overwriting, you are marking previous cells as deleted and writing information, now, write performance degrades if TRIM hasn’t caught up and reset the cells; so, would it not be logical to buy an SSD that has far more storage capacity than you actually need, for instance, if you only need 240GB for your database, buy one that is twice that, that would reduce the possibility of you hitting cells that you’d have to do a full erase cycle in order to write over.

Costs and Classification of Rotational Drives

We’ve had rotational speeds of 5.4K, 7.2K, 10K and then in 2001 technology hit a wall at 15K rpm, that’s 10 years ago! Even now the size of 15Krpm disks only go up to 147GB, at 10Krpm we have disks up to 900GB. The cost for a 15Krpm 147GB is around £185 ex VAT, the average access time a whopping 3.4 milliseconds.

How many disks would you need to create a disk array that could keep up with a pair of OCZ IBIS Drives in a RAID 0 delivering over 80K 64Kbyte 100% random reads per second at a latency of 1.2milliseconds?

Rotational disks are just fine for the transaction log because so long as it’s the only thing being written to on that array then the seek times are going to be extremely low.

Cost of memory and Limits

I recently cost up a DL385 G7 server, it will hold up to 256GB of RAM; 16GB of RAM cost £575 so for 256GB it would set you back £9.2K.

Summary of Part 1

Enterprise SSDs are still expensive when looked at from a price per GByte, however, from a read/write latency per thousands of IOs/second they are extremely cheap.

Operational care should be taken if the data access pattern is high overwriting (UPDATE) rather than mostly INSERT because you may quickly burn the SSD out. Design considerations come into play here, you can easily implement a type 2 Slowly Changing Dimension (SCD) in the operational system.

In part 2 we will look at the benchmarks from various real world tests against a 50 million row 85 GBytes table which has over 99% fragmentation in the leaf and other levels of the index. You may well be surprised at what you will see and it’s a game changer in terms of the way we currently think.

Comments

# re: Use GUID’s (uniqueidentifier) for keys when storing data on SSD’s (Part 1 - SSD Backgrounder)

18 July 2011 08:45 by jamiet

This ain't the sort of stuff I usually concern myself with Tone but I'm looking forward to part 2.

JT

# re: Use GUID’s (uniqueidentifier) for keys when storing data on SSD’s (Part 1 - SSD Backgrounder)

18 July 2011 09:16 by jrsansom

This is really interesting research Tony and is the type of information that is making me rethink some of our own "best practice" solution designs for future database infrastructure planning.

Thanks for sharing your findings and looking forward to part 2.

# re: Use GUID’s (uniqueidentifier) for keys when storing data on SSD’s (Part 1 - SSD Backgrounder)

18 July 2011 09:45 by GrumpyOldDBA

I've done some tests with SSDs myself, the problem always is looking at it from an enterprise view.

Could I use raid 0 on a production box - No!

Will it work on a cluster - plug in cards no.

Actually is the disk subsystem the bottleneck - not very often unless the database is large ( ish )  by this I probably mena a TB or greater - would the cost of SSDs be effective  - probably not.

15k sas disks size up to 600GB btw.

I'm not 100% sure about SSRS - I do have performance issues but I'm almost 100% certain it's not disk related, mainly as I monitor io and the figures rarely reach even 100 io/sec.

I agree 100% that SSD provide stunning performance but I'm still unconvinced that for the systems I work with I'd gain any performance vs increasing the the server memory.

# re: Use GUID’s (uniqueidentifier) for keys when storing data on SSD’s (Part 1 - SSD Backgrounder)

18 July 2011 10:07 by tonyrogerson

Clusters are dead - Denali will see the end to that with availability groups, no more single point of failure. We want simpler methods of creating data redundancy.

Take your point with 15Krpm's - but how many do you realistically need to get the read latency and IO's per second we need? Lots and its getting worse....

Perhaps you are monitoring and 100 io/sec is all the IO subsystem can deliver with that random workload? Have you proved random read performance using IOMeter?

Like I say, SSD's are a complete game changer and our current thinking and best practices need to change to adapt.

T

# re: Use GUID’s (uniqueidentifier) for keys when storing data on SSD’s (Part 1 - SSD Backgrounder)

18 July 2011 13:22 by GrumpyOldDBA

But you still have the issue of centralised storage, as you know I'm not a lover of SANs purely due to poor performance, but sitting a 10TB database on redundant SSDs would be very expensive. Yeah I'm well sure none of my SSRS servers are even touching physical io.

I take your point about the cost of server memory vs ssd - now that would be an interesting benchmark  - using low server memory and ssds vs high server memory and physical disks/ssds.

Can't comment about availability groups as I haven't found any good sources on what exactly will be in Denali.

I remember being told somewhat sharply ( not by you ) that placing Tempdb on SSD was a "foolish" idea as the internals of SQL Server assumed a certain amount concerning storage so the optimiser would have "issues" - well I'm not at that level to know one way or another and neither do i have the resource to craft such a server to run such tests; but I've figured for some years that SSD might be a real boon in Tempdb performance. It'd be interetsing to discuss this face to face, I'm very interested in what you're doing, even if it sounds like I don't agree < grin >

# Use GUID’s (uniqueidentifier) for keys when storing data on SSD’s (Part 2 - Initial Database Schema)

In relational to my blog post on guids and ssds (part 1) the script below forms the bais of the permanent

# re: Use GUID’s (uniqueidentifier) for keys when storing data on SSD’s (Part 1 - SSD Backgrounder)

24 July 2011 03:42 by TheSQLGuru

Clusters are and will absolutely NOT be dead with Denali.  For one thing it will be 2 years before it gets significant market share.  For another much more important reason AlwaysON requires Enterprise Edition - which limits it to a much smaller percentage of the installed base.