When does 2+2 != 4?

Disclaimer: Results and observations are from a specific series of tests using SQL Server 2008 Enterprise on Windows Server 2008 Enterprise.

I've been working on a series of benchmarking tests for an application, with the aim of showing how the storage scales for future growth. Essentially we were attempting to prove that adding extra spindles to the storage pool would give extra performance. The one interesting point that emerged was that the initial test runs were throttled by the bandwidth of the fibre channel switch.

The initial setup was two 4Gb HBAs using MPIO, through a 2Gb switch, into two 4Gb nodes on the SAN. I'd actually missed the fact that it was a 2Gb switch. We ran a series of tests with different disk setups, but the results were strangely similar. It was only a set of very extended times for one test run that led us to discover that the switch ports had been swamped. At that point we realised we had a 2Gb switch, not a 4Gb switch, and that our tests were being bottlenecked by the fibre bandwidth.

To get the full bandwidth we decided to remove the switch and connect directly. The initial HBA setup was in failover, so we still only had 4Gb of bandwidth, the same nominal total as the 2 x 2Gb paths through the switch. What was interesting was that we were nevertheless able to drive more IOPS, and test times dropped.

To summarise:
  1.  If you think you may have performance issues on your fibre attached storage, check that every hop in the route actually runs at the speed you think it does.
  2.  Monitor your switch ports to make sure you don't have a bottleneck.
  3.  In my tests 2 x 2Gb did not give the same performance as a single 4Gb connection.
  4.  If the fabric between your database server and the storage is shared, make sure you do point 2 above.
To give a feel for the differences, one of the tests was a database restore (55GB database). With the 2 x 2Gb connections we had a time of 5.5 minutes, 1 x 4Gb dropped this to 3.5 minutes and 2 x 4Gb dropped this to 2.5 minutes (times rounded for simplicity).

NB. The actual performance of a database restore depends upon more than just the FC bandwidth, however this is still a valid observation.

The server was a 16-core Intel with 26GB RAM allocated to SQL Server, and the storage subsystem was all RAID 10. I will blog further concerning the vendor and setups we tested and the monitoring tools used to gather this information.
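
To put the restore times on a common scale, they can be turned into an average data rate. The short Python sketch below is not part of the original tests. It assumes the full 55GB crossed the fibre during each restore (no backup compression or caching effects), takes 1 GB as 1024 MB, and uses the rounded times quoted above, so the results are only indicative. The function and variable names are made up for illustration.

```python
# Rough average throughput implied by the restore times quoted in the post.
# Assumptions (not from the original tests): the whole 55GB crossed the
# fibre during the restore, and 1 GB = 1024 MB.

DATABASE_SIZE_GB = 55

# Rounded restore times in minutes, keyed by link configuration.
RESTORE_MINUTES = {
    "two 2 Gbit links": 5.5,
    "one 4 Gbit link": 3.5,
    "two 4 Gbit links": 2.5,
}


def restore_throughput_mb_per_s(size_gb, minutes):
    """Average MB/s over the whole restore."""
    return size_gb * 1024 / (minutes * 60)


def throughput_table(size_gb=DATABASE_SIZE_GB, timings=RESTORE_MINUTES):
    """Map each link configuration to its average MB/s, rounded to 0.1."""
    return {
        name: round(restore_throughput_mb_per_s(size_gb, minutes), 1)
        for name, minutes in timings.items()
    }


if __name__ == "__main__":
    for name, rate in throughput_table().items():
        print(f"{name}: {rate} MB/s")
```

On these assumptions the three configurations work out at roughly 170, 270 and 375 MB/s, which makes the gap between 2 x 2Gb and 1 x 4Gb easier to see than the raw minutes do.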


Published Wednesday, October 21, 2009 10:57 PM by GrumpyOldDBA