June 2007 - Posts

Not thinking it through – the post mortem

I just thought I’d round off my series of posts concerning my first experience of an outsourced data centre.

 

http://sqlblogcasts.com/blogs/grumpyolddba/archive/2007/01/04/not-thinking-it-through-the-saga-continues.aspx

 

It really got to the stage of almost disbelief, which is why it ended.

From the previous posts you’ll remember that the initial issue arose around the weekly re-indexing of the production database, and the state the database was left in afterwards.

Well, we never did resolve that:

  • We switched the database to the simple recovery model to rebuild the indexes in a series of batches, simple recovery making sure the log didn’t grow and grow. Sadly the tables were not batched in a way that ensured we didn’t run out of disk space, so occasionally this still happened – some of the largest tables started with the same keyword and so ended up in the same batch.
  • There was no check in the job to switch the database back to full recovery in the event of an error, or to make sure a full or differential backup was taken before re-starting the transaction log backups.
  • If disk space was low then the full backup failed and the database was not recoverable until the next backup – a full 24 hours and a working day later. This was an international application, so a day was actually 24 working hours.
    • If you switch to simple recovery, you need at least a differential backup before you can recommence transaction log backups and so be able to recover your database (see the sketch below).
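
For illustration, here’s a minimal sketch of the safeguards the job was missing – this is not the actual job, the database name, table and backup paths are made up, and SQL 2005 syntax is assumed:

-- rebuild under SIMPLE recovery so the log doesn't grow and grow
ALTER DATABASE MyProdDB SET RECOVERY SIMPLE;

-- rebuild the indexes in batches here, e.g. table by table
-- ALTER INDEX ALL ON dbo.SomeLargeTable REBUILD;

-- switch back to FULL once the rebuilds are done (or on error)
ALTER DATABASE MyProdDB SET RECOVERY FULL;

-- a differential (or full) backup re-establishes the base for log backups
BACKUP DATABASE MyProdDB TO DISK = N'X:\Backups\MyProdDB_diff.bak' WITH DIFFERENTIAL;

-- only now is it safe to restart the transaction log backups
BACKUP LOG MyProdDB TO DISK = N'X:\Backups\MyProdDB_log.trn';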

Some other events of note:

  • I discovered during a routine hardware check that the redundant power supplies had been disconnected on two production servers.
  • I asked for a server re-boot; the data centre couldn’t find the server in the racks and phoned me to ask its location – never having been in the data centre myself, I had no idea.
  • The data centre turned off a reporting server; it took them nearly three days to find it again to switch it back on.
  • They didn’t know how to detect or measure AWE memory on a Windows server.
  • There was an issue with server to server file copies – “path too deep”, if you want to check. We established that the only way to move a database backup for a release was to use a memory stick – sadly the data centre was unable to provide anyone to do it, so a member of management had to drive a couple of hundred miles very early in the morning to do this.
  • The actual file copy issue was very interesting and I don’t know if it was ever resolved, but basically there were inconsistent copy times when moving the same file from server to server in the data centre – and I mean serious differences – from a couple of seconds to generally around 40 minutes, worst case a couple of hours. This also manifested itself on back and forward copies, so a copy from A to B took 2 seconds while the same file from B to A took 30 minutes. Cool huh?
  • Alerts were dealt with on the basis of “mark as fixed” if, when you checked the alert – which could be hours later – the event was no longer there, e.g. high CPU.
  • Disk free space was set to alert on 1 MB free remaining – not too sure who set this threshold.
  • When a production database ran out of log space the solution was to stop the server, delete the log file and re-attach the database, creating a new log.

 

Maybe I was just unlucky, but it’s probably fair to ask “who watches the watchers?” and not to assume that because it’s a large data centre things will actually be good.

Posted by GrumpyOldDBA with no comments

SQL Community is good to be part of.

It’s interesting to sometimes compare the flow of information about SQL Server to that available for other products.

I’m currently working on an application which runs on SQL Server but was originally developed with an ISAM back end.

This particular application is very important for my client so discussions about how good or bad it is are irrelevant; the critical point is to tune it.

It’s been interesting trying to obtain information about the application; the user forums for this product don’t have the same community feel as the SQL Server ones, and the vendor web site is only available to the third parties that sell the product, not to end users.

I’ve found the user forums really unhelpful, the general response to a question being not to answer it but to tell you to hire an “expert”. In my case it means I have to come to terms with the application myself. For its sins it manages the SQL Server indexes and such; if you’ve encountered one of these types of app I’m sure you’ll understand what I mean. If not, then basically the application provides an interface to manage the indexing – fine, you might think – well, not always – generally these apps may drop any indexes they find in the database which are not defined in the application, or worse still they append extra columns to an index without telling you, to enforce uniqueness.

There are, of course, a number of “experts” for the application, and in this case they usually seem only willing to sell themselves as consultants and to sell you products which duplicate functionality already in SQL Server – well, I’m cynical as well as grumpy! One such consultant advertises a guarantee to improve performance by 100% – hmm – if only we could all make that claim!

I worry when reading recommendations, especially when they include turning off auto update statistics and auto create statistics (this is an OLTP type app). Another recommendation is to turn off parallelism, although it’s claimed the application doesn’t suffer from parallel scans. It’s also claimed the application doesn’t use tempdb – well, it’s an interesting viewpoint. I have statistical data gathering installed – see http://sqlblogcasts.com/blogs/grumpyolddba/archive/2006/11/28/p-0-creating-a-baseline-part-3.aspx – and I can see that at times tempdb is handling around one third of the total throughput on the server; the server is dedicated to the app, by the way.
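
As a quick sanity check – this isn’t the baseline collection from those posts, just a couple of hedged queries against the SQL 2005 catalog views and DMVs – you can see both the auto-stats settings the vendor wants turned off and how much I/O tempdb is really doing:

-- are auto create / auto update statistics actually on for each database?
SELECT name, is_auto_create_stats_on, is_auto_update_stats_on
FROM sys.databases;

-- cumulative file I/O per database since the instance started;
-- compare tempdb against the application database
SELECT DB_NAME(database_id) AS database_name,
       SUM(num_of_bytes_read + num_of_bytes_written) AS total_bytes
FROM sys.dm_io_virtual_file_stats(NULL, NULL)
GROUP BY database_id
ORDER BY total_bytes DESC;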

I wonder sometimes how these conclusions are drawn. I once had the opportunity to tune a database which didn’t have clustered indexes – “if you put clustered indexes on the database it breaks” – eventually I got to add clustered indexes, but it took a long time and there was nothing documented as to how this conclusion had been reached. Maybe it’s how urban myths are born! Oh, and it didn’t break, it just went faster.

 

So in conclusion I just want to say thanks to all those individuals from the community and Microsoft who take the time to share their knowledge and help us. My only problem is trying to keep up with it all – just be glad you don’t encounter some of the applications from the dark side < grin >

 

PS. I feel so much better after that!

Maxdop hint for Views

With the advent of multi-core processors the issue of parallelism is likely to cause more problems than before – the usual answer posted on forums to resolve parallelism (CXPACKET) waits is to disable parallelism or severely restrict the number of processors it can use.

I’m sure this is a discussion that could run and run, but I’m not concerned with that just now; my personal stance is that I dislike processor restrictions – I liken it to buying a V8 engine and removing four spark plugs to keep the speed down to 70mph!

 

However – views, the bane of my life. Well, complex views: difficult to tune, they nearly always seem to generate the query from hell and are very prone to parallelism blocking – on an 8 way box I’ve had over 70 threads generated from a single view. Most times, taking the query out of the view and adding a MAXDOP hint will resolve the performance – only you can’t put the MAXDOP hint in the view.
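
To show the restriction (the view and tables here are made up for the example) – the hint is legal on an outer query, but there’s nowhere to put it inside the view itself:

-- this is NOT valid: an OPTION clause can't appear in a view definition
-- CREATE VIEW dbo.vComplex AS
-- SELECT a.Col1, b.Col2
-- FROM dbo.TableA a JOIN dbo.TableB b ON b.Id = a.Id
-- OPTION (MAXDOP 1);

-- this works, but only if you control the query the application sends
SELECT * FROM dbo.vComplex OPTION (MAXDOP 1);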

 

The issue with parallelism is that you can very easily starve yourself of threads; if you have concerns then run DBCC SQLPERF(UMSSTATS).
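
On SQL 2005 the schedulers DMV gives much the same picture – a hedged alternative rather than an exact equivalent – where a non-zero work_queue_count means tasks are queued waiting for a worker thread:

SELECT scheduler_id,
       current_tasks_count,
       runnable_tasks_count,
       current_workers_count,
       active_workers_count,
       work_queue_count      -- tasks queued waiting for a worker
FROM sys.dm_os_schedulers
WHERE scheduler_id < 255;    -- exclude the hidden and DAC schedulers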

 

I’ve seen the number of worker threads increased to try to resolve this, but unless there’s been a change, increasing the worker threads can also cause problems, as I believe a process has to “round robin” all available workers before execution. (SQL 2005 of course handles workers dynamically – however I’m sure there has to come a point where the CPUs throw in the towel!)

 

So assuming you can’t escape from the view, because the application can’t call a stored procedure, what can you do?

 

Well, in SQL 2000 nothing, but in SQL 2005 there are plan guides. I’m not going to post lots of sample code, mainly because I don’t have an 8 way server to hand or a complex view which illustrates the issue, but here’s how you attach a MAXDOP hint to dynamic sql or a view.

 

For those DBAs supporting applications which generate dynamic sql that you can’t alter in the application, this is also a way to add hints, rather than crippling your nice multi-core server!

 

Here’s an example plan guide script:

 

-- create a plan guide that attaches OPTION (MAXDOP 1) to a parameterised
-- statement submitted by the application (the statement text below is a
-- placeholder - it must match the application's statement exactly)
EXEC sp_create_plan_guide
    @name = N'PlanGuide_101',
    @stmt = N'SELECT STATEMENT',
    @type = N'SQL',
    @module_or_batch = NULL,        -- NULL means the batch is the statement itself
    @params = N'@P1 varchar(20)',   -- parameter definition, as passed to sp_executesql
    @hints = N'OPTION (MAXDOP 1)'
GO
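
For the guide to be picked up, the statement text and parameter definition the application submits have to match @stmt and @params exactly – a hypothetical sp_executesql call that the guide above would match looks like this:

EXEC sp_executesql
    N'SELECT STATEMENT',       -- identical, character for character, to @stmt
    N'@P1 varchar(20)',        -- identical to @params
    @P1 = 'some value';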

 

The subject of plan guides is well covered in BOL and there are further enhancements to their use in SQL 2008.

Posted by GrumpyOldDBA with no comments