Introduction
Database mirroring is a really cool technology introduced in SQL Server 2005 to simplify log shipping and give greater availability and data redundancy at the database level, in this article I talk about the various modes of operation, how it works, discuss statements (INSERT, UPDATE, DELETE, CREATE INDEX, ALTER INDEX REORGANIZE which replaces DBCC INDEXDEFRAG etc...) and there effect on logging. I also look at in depth at how performance of your online system can be significantly impacted by latencies in network performance and how to overcome or reduce some of these problems.
This has sat unfinished for 3 months because of my client workload, so, I'm breaking it up into parts, this is the first part - the configuration and planning bit, the bit that makes sure you know what your performance is going to be 'like', more 'parts' when I get time.
There are two flavours – synchronous (full safety) or asynchronous, synchronous basically waits until the log records are applied and written to disk on the mirror before allowing the principal to commit or rollback a workload – I’ll discuss this in detail as we go through.
Database mirroring is only available in the Standard and Enterprise editions of SQL Server, standard edition only supports synchronous (full safety), and it is also limited to a single redo thread (discussed later).
With synchronous mode you have the option of using a witness, I’m not going to talk about that as it has been covered so many times and books online does it in depth as well.
Forming a base line for performance
1 – The Disk Bits
At this point you need to understand the range in terms of the size of a block of data written to disk when a log flush event occurs, when writing to the transaction log SQL Server writes in blocks ranging from 512bytes through to 64Kbytes – this is explained in another blog entry of mine on Transaction Log Performance.
We need to see what the disk subsystem on each machine is capable of without mirroring in place, we do this by running a number of tests simulating various flog flush sizes.
First create a database on each machine, correct the drive letters so the MDF and LDF are placed on the drives you use for the data and logs…
CREATE DATABASE [IOMeter] ON PRIMARY
( NAME = N'IOMeter', FILENAME = N'e:\IOMeter.mdf' , SIZE = 1048576KB , FILEGROWTH = 1024KB )
LOG ON
( NAME = N'IOMeter_log', FILENAME = N'g:\IOMeter_log.ldf' , SIZE = 1048576KB , FILEGROWTH = 10%)
GO
EXEC dbo.sp_dbcmptlevel @dbname=N'IOMeter', @new_cmptlevel=90
GO
ALTER DATABASE [IOMeter] SET RECOVERY FULL
GO
ALTER DATABASE [IOMeter] SET PAGE_VERIFY CHECKSUM
GO
Create the table to hold the test and result data…
USE IOMeter
GO
CREATE TABLE mytest (
id int not null identity primary key clustered,
somedata char(4000) not null
)
GO
CREATE TABLE io_results (
test_name varchar(100) not null,
entry_date datetime not null default( getdate() ),
write_ios int not null,
write_bytes bigint not null,
transaction_size int not null,
ios_per_row decimal( 10, 3 ) not null,
ios_per_xact decimal( 10, 3 ) not null,
bytes_per_io decimal( 10, 3 ) not null,
mbytes_per_sec decimal( 10, 3 ) not null,
rows int not null,
commits int not null,
duration_seconds int not null
)
Now our test proc, this stored procedure is what we call to do the test…
CREATE PROC test_perf
@test_name varchar(100),
@tran_batch_size int
AS
BEGIN
SET NOCOUNT ON
IF @@TRANCOUNT > 0
ROLLBACK -- Make sure no transaction open
WAITFOR DELAY '00:00:05' -- System settle, helps seperate stuff in PERFMON
TRUNCATE TABLE mytest
CHECKPOINT -- Write any dirty pages to disk
DBCC DROPCLEANBUFFERS -- Empty data cache
DBCC FREEPROCCACHE -- Empty execution cache
WAITFOR DELAY '00:00:05' -- System settle, helps seperate stuff in PERFMON
DECLARE @i int
DECLARE @c int
SET @i = 1
SET @c = 0
SELECT 'START' AS op, GETDATE() AS op_time, *
INTO #filestats
FROM ::fn_virtualfilestats( db_id(), 2 )
BEGIN TRAN
WHILE @i <= 100000
BEGIN
INSERT mytest ( somedata ) VALUES( CAST( @i AS char(4000) ) )
IF @i % @tran_batch_size = 0
BEGIN
SET @c = @c + 1
COMMIT TRAN
IF @i < 100000
BEGIN TRAN
END
SET @i = @i + 1
END
IF @@TRANCOUNT > 0
COMMIT TRAN
INSERT #filestats (
op, op_time, DbId, FileId, TimeStamp, NumberReads, BytesRead,
IoStallReadMS, NumberWrites, BytesWritten,
IoStallWriteMS, IoStallMS, BytesOnDisk, FileHandle )
SELECT 'END', getdate(), DbId, FileId, TimeStamp, NumberReads, BytesRead,
IoStallReadMS, NumberWrites, BytesWritten,
IoStallWriteMS, IoStallMS, BytesOnDisk, FileHandle
FROM ::fn_virtualfilestats( db_id(), 2 )
INSERT io_results (
test_name,
write_ios,
write_bytes,
transaction_size,
ios_per_row,
ios_per_xact,
bytes_per_io,
mbytes_per_sec,
rows,
commits,
duration_seconds )
SELECT @test_name,
Write_IOs = e.NumberWrites - s.NumberWrites,
Write_Bytes = e.BytesWritten - s.BytesWritten,
transaction_size = @tran_batch_size,
IOs_Per_Row = CAST( (e.NumberWrites - s.NumberWrites) AS decimal( 10, 3 ) ) / 100000,
IOs_Per_Xact= CAST( (e.NumberWrites - s.NumberWrites) AS decimal( 10, 3 ) ) / @c,
Bytes_Per_IO= CAST( ( e.BytesWritten - s.BytesWritten ) AS decimal( 20, 3 ) ) / (e.NumberWrites - s.NumberWrites),
MBytes_Per_Sec = ( CAST( ( e.BytesWritten - s.BytesWritten ) AS decimal( 20, 3 ) ) / DATEDIFF( second, s.op_time, e.op_time ) ) / 1048576,
Rows = (SELECT COUNT(*) FROM mytest),
Commits = @c,
duration_seconds = datediff( second, s.op_time, e.op_time )
FROM #filestats s
CROSS JOIN #filestats e
WHERE s.op = 'START'
AND e.op = 'END'
END
GO
Now we are ready to conduct some tests;
2 – The Network Bits
Before going any further you need to determine just how good your connectivity is to your mirror machine and how much “Write Mbytes per second” you can expect which is the single point of failure in terms of system performance for write intensive or large update queries.
We are going to use IOMeter which is a free open source application; this really cool utility allows us to create a number of tests varying the “transfer request size” to simulate mirroring spewing log records across the network pipe.
On the machine that will be the Principal map a network drive to the Mirror machine e.g. NET USE Z: \\10.1.1.2\E$
When you start IOMeter select the disk target to do the test against, this will be the drive the transaction log will reside on (step 2) or Z: drive you mapped in the proceeding step (when doing step 3). Set the maximum disk size to 4096, leave “# of Outstanding I/Os” at 1 which simulates the same behaviour as SQL Server transaction log flushes i.e. no outstanding I/O – stuff must be written (write thru rather than write back).
Figure 1 – IOMeter
I always recommend using dedicated NIC’s between the two machines and use a good quality cross over cable. The NIC’s should run at 1Gigabits at least full duplex, that bandwidth allows for over 100Mbytes per second throughput compared to 100Mbits which offers only around 12Mbytes per second; anyway – to testing.
At this point you need to understand the range in terms of size of data transmitted between the two servers, when writing to the transaction log SQL Server writes in blocks ranging from 512bytes through to 64Kbytes – this is explained in another blog entry of mine on Transaction Log Performance.
Figure 2 – Transmission Results
The graph tells us a lot, the larger your chunk of data the better the throughput. Before going any further – tune this! Make sure you are using the most recent drivers! Don’t be afraid of adjusting the parameters available for the NIC (do make sure you know what they represent and what they effect first though), set the speed instead of letting it auto-sense, different NIC’s give different performance for example using my onboard NIC’s I get a throughput of 102Mbytes per second for 512KB. There are two registry entries you can ‘play’ with, but make sure you do it at the interface level (for the specific NIC) rather than for all NIC’s – that is the MTU size (Maximum Transmission Unit) and TCPWindowSize – the white paper “Microsoft Windows Server 2003 TCP/IP Implementation Details” gives a very thorough discussion of what these mean.
When setting up the cross over cable and NIC’s put them on a separate subnet I’ve used 10.1.1.1 (255.255.255.0) as the principal and 10.1.1.2 (255.255.255.0) as the mirror, so my configuration is shown below. Make sure you set the metric on the NIC’s so that the cross over connection has a high value so other traffic won’t get routed through there. I also uncheck the register in DNS for those IP addresses, they are private and don’t belong on the DNS server; either use the IP addresses or set up an alias in your HOSTS file in c:\windows\system32\drivers\etc directory.
The diagram below shows my configuration.
Figure 3 – Network Topology
That’s the network, you now know what it’s capable of, so, onto your actual server – what throughput is it capable of in terms of writing to the transaction log?
Well, thats the end of part 1 - any questions? What do you expect or want to know for part 2 and (3)??