19 July 2011 07:10 tonyrogerson

Use GUIDs (uniqueidentifier) for keys when storing data on SSDs (Part 2 - Initial Database Schema)

In relation to my blog post on GUIDs and SSDs (part 1), the script below forms the basis of the permanent tables in the database.

TestGuid_INSERT is a table of 50 million rows, around 85.5 GB in size.
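
As a rough sanity check on that figure (my own back-of-the-envelope arithmetic, not from the original post): each row is 16 bytes of GUID plus 1,024 bytes of CHAR plus roughly 10 bytes of row overhead, so about 7 rows fit on an 8 KB page. 50 million rows at 7 rows per page is roughly 7.1 million pages, or about 55 GB fully packed; allowing for the lower page density that random GUID inserts cause through page splits (somewhere around 65-70% full on average), the table lands in the 80-85 GB range, which lines up with the size quoted above.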

The tests I will be doing as part of the benchmarking are real-world tests, for instance an INNER JOIN between TestGuid_INSERT and a lookup table containing 50K GUIDs.

If you can think of any real-world tests you want me to run then email me at tonyrogerson@torver.net and I'll include them in the benchmarks; better still, create the test yourself and be ready to run the benchmarks to better understand the game changer I'm talking about.
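
To make the join test above concrete, here is a minimal sketch of the sort of query I mean; the #lookup table and the way it is sampled are purely illustrative, not the exact benchmark I'll be running.

-- illustrative only: build a 50K-row lookup table of GUIDs sampled from the big table
select top (50000) guidcol
into #lookup
from TestGuid_INSERT
order by NEWID();

set statistics io on;
set statistics time on;

-- real-world style test: join the 50 million row table to the 50K lookup
select count_big(*)
from TestGuid_INSERT t
inner join #lookup l on l.guidcol = t.guidcol;

set statistics io off;
set statistics time off;
go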

I've created the contiguous table on its own filegroup so I know the data is as contiguous as I can get it.

The 1GB start size and growth increment on the original test were deliberate, to make the data even more fragmented on-drive (on an SSD the data is already scattered because of the block mapping that goes on).

CREATE DATABASE [GUIDTest_SSD] ON  PRIMARY

( NAME = N'GUIDTest_SSD', FILENAME = N'E:\S2008R2\GUIDTest_SSD.mdf' , SIZE = 1GB , FILEGROWTH = 1GB )

 LOG ON

( NAME = N'GUIDTest_SSD_log', FILENAME = N'E:\S2008R2\GUIDTest_SSD_log.ldf' , SIZE = 1GB , FILEGROWTH = 1GB)

GO

alter database GUIDTest_SSD set recovery simple

go

USE GUIDTest_SSD

go

select * from sys.dm_io_virtual_file_stats( DB_ID(), null )

go

--    population script (2:55)

create table TestGuid_INSERT (

      guidcol uniqueidentifier not null default( newid() ) primary key clustered,

      expandrow char(1024) not null

      );

declare @i int = 1;

set nocount on;

while @i <= 50000000

begin

      insert TestGuid_INSERT ( guidcol, expandrow ) values( NEWID(), cast( @i as CHAR(1024) ) )

      set @i = @i + 1

end

go
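
If you only want the data loaded and aren't interested in timing the singleton inserts themselves, a batched variant of the same loop (my own sketch, not part of the original test - note it changes the insert pattern, so don't use it if the row-by-row behaviour is what you want to measure) commits every 10,000 rows and cuts down the per-row log flushes:

-- sketch only: commit in batches of 10,000 rows to reduce log flushes
declare @i int = 1;
set nocount on;
begin transaction;
while @i <= 50000000
begin
      insert TestGuid_INSERT ( guidcol, expandrow ) values( NEWID(), cast( @i as CHAR(1024) ) );
      if @i % 10000 = 0
      begin
            commit transaction;
            begin transaction;
      end
      set @i = @i + 1;
end
commit transaction;
go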

 

select * from sys.dm_io_virtual_file_stats( DB_ID(), null )

go
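
The two sys.dm_io_virtual_file_stats calls bracketing the load are there so the before and after figures can be differenced. A small sketch of one way to take the delta (the #io_before temp table is just an illustration):

-- snapshot taken before the workload
select file_id, num_of_reads, num_of_bytes_read, io_stall_read_ms,
       num_of_writes, num_of_bytes_written, io_stall_write_ms
into #io_before
from sys.dm_io_virtual_file_stats( DB_ID(), null );

-- ... run the workload being measured ...

-- difference against a fresh snapshot to get the IO attributable to the workload
select a.file_id,
       a.num_of_writes        - b.num_of_writes        as writes,
       a.num_of_bytes_written - b.num_of_bytes_written as bytes_written,
       a.io_stall_write_ms    - b.io_stall_write_ms    as write_stall_ms
from sys.dm_io_virtual_file_stats( DB_ID(), null ) a
      inner join #io_before b on b.file_id = a.file_id;
go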

USE [master]

GO

ALTER DATABASE [GUIDTest_SSD] ADD FILEGROUP [CONTIG_TBL]

GO

ALTER DATABASE [GUIDTest_SSD] ADD FILE ( NAME = N'GUIDTest_SSD_CONTIG', FILENAME = N'E:\S2008R2\GUIDTest_SSD_CONTIG.ndf' , SIZE = 100GB , FILEGROWTH = 1024KB ) TO FILEGROUP [CONTIG_TBL]

GO

USE GUIDTest_SSD

go

 

create table TestGuid_SingleINSERT (

      guidcol uniqueidentifier not null default( newid() ) primary key clustered,

      expandrow char(1024) not null

      ) ON [CONTIG_TBL];

select * from sys.dm_io_virtual_file_stats( DB_ID(), null )

go

insert TestGuid_SingleINSERT ( guidcol, expandrow )

      select guidcol, expandrow

      from TestGuid_INSERT

      order by guidcol, expandrow

go

select * from sys.dm_io_virtual_file_stats( DB_ID(), null )

go   
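
To confirm the rebuilt copy really is laid out contiguously, something along these lines (a sketch using the standard index physical stats DMV) compares logical fragmentation and page density between the two clustered indexes:

select OBJECT_NAME( object_id ) as table_name,
       avg_fragmentation_in_percent,
       avg_page_space_used_in_percent,
       page_count
from sys.dm_db_index_physical_stats( DB_ID(), OBJECT_ID('TestGuid_INSERT'), null, null, 'SAMPLED' )
union all
select OBJECT_NAME( object_id ),
       avg_fragmentation_in_percent,
       avg_page_space_used_in_percent,
       page_count
from sys.dm_db_index_physical_stats( DB_ID(), OBJECT_ID('TestGuid_SingleINSERT'), null, null, 'SAMPLED' );
go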

USE [master]

GO

ALTER DATABASE [GUIDTest_SSD] ADD FILEGROUP [HEAP_TBL]

GO

ALTER DATABASE [GUIDTest_SSD] ADD FILE ( NAME = N'GUIDTest_SSD_HEAP_TBL', FILENAME = N'E:\S2008R2\GUIDTest_SSD_HEAP_TBL.ndf' , SIZE = 100GB , FILEGROWTH = 1024KB ) TO FILEGROUP [HEAP_TBL]

GO

USE GUIDTest_SSD

go

--    population script (2:55)

create table TestGuid_HeapINSERT (

      guidcol uniqueidentifier not null default( newid() ) primary key nonclustered,

      expandrow char(1024) not null

      ) on [HEAP_TBL];

declare @i int = 1;

set nocount on;

select * from sys.dm_io_virtual_file_stats( DB_ID(), null )

while @i <= 50000000

begin

      insert TestGuid_HeapINSERT ( guidcol, expandrow ) values( NEWID(), cast( @i as CHAR(1024) ) )

      set @i = @i + 1

end

go

select * from sys.dm_io_virtual_file_stats( DB_ID(), null )

go
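
Once all three tables are populated, a quick way to compare how much space each variant actually uses (again just a sketch, not part of the benchmark itself) is the partition stats DMV:

-- reserved and used pages per table/index, one row per partition
select OBJECT_NAME( object_id ) as table_name,
       index_id,
       row_count,
       reserved_page_count * 8 / 1024 as reserved_mb,
       used_page_count * 8 / 1024 as used_mb
from sys.dm_db_partition_stats
where object_id in ( OBJECT_ID('TestGuid_INSERT'),
                     OBJECT_ID('TestGuid_SingleINSERT'),
                     OBJECT_ID('TestGuid_HeapINSERT') );
go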

Comments

# re: Use GUIDs (uniqueidentifier) for keys when storing data on SSDs (Part 2 - Initial Database Schema)

20 July 2011 14:21 by GrumpyOldDBA

I know this doesn't fit with your low cost box but thought this might be of interest? www.theregister.co.uk/.../hs_dorado_s2100

I have to say the maths is a bit dodgy mind you! 200 disks to get 135K IOPS!! Yes if you're talking 7.2K SATA, but not 15K SAS. Of course the issue here is that 200 disks would support 60TB of storage vs 2.4TB (based upon 300GB 15K disks), which is a factor too often overlooked in the promotion of SSDs.

# re: Use GUIDs (uniqueidentifier) for keys when storing data on SSDs (Part 2 - Initial Database Schema)

20 July 2011 14:38 by tonyrogerson

SATA/SAS links aren't the answer to this problem - it's going back to DASD - if you take a look at the appliances coming out of Microsoft, guess what - they use DASD!

OCZ's HSDL and VCA are the way we need to be moving - those guys and FusionIO have it right.

I can get over 80K 100% random 64KB reads from my 2-IBIS-drive 480GB array, but importantly the AVERAGE read latency is just 1.3 milliseconds! I'd suspect that even with 200 disks that wouldn't be achievable, because you'd a) be fighting protocol baggage and, more importantly, b) be fighting drive head sync issues because of the RAID - the more disks you put in a RAID 0, the more you get diminishing returns.

The correct approach to all this is of course a hybrid one - my final article (part 4 or 5), "architecture conclusions", will deal with how you do it.

T

# re: Use GUIDs (uniqueidentifier) for keys when storing data on SSDs (Part 2 - Initial Database Schema)

21 July 2011 10:40 by GrumpyOldDBA

Actually our DW team have fitted all SSD using Fusion-io cards; I haven't heard yet how much performance they've gained over the previous SAN storage - they have many TB of SSDs. Hybrid is good, which is where I started a couple of years ago trying to find out if putting tempdb on SSD would be a good idea. Sadly I just don't have the cash, contacts or resources to put a suite of tests like that together.

Waiting for part 3 !!

# Something for the Weekend – SQL Server Links 22/07/11

Pingback from Something for the Weekend – SQL Server Links 22/07/11

# Discovery - when your IO Subsystem out performs processor Cores - how parallelisation really is our friend

CPU Saturation – an over-performing IO subsystem. It's not often that you see the IO subsystem able to