ER - She canny no take it Captain!

Posted 08 October 2007 16:06 by JohnParker

In Star Trek it was pretty obvious when the Enterprise had problems; it was always something to do with the Warp-Drive anyway.

On production systems, its normally SQL Server Enterprise having problems, but when your ship is going down, what should you check?  Heres a list, we've cobbled together.

  • Fire up Task Manager immediately.  Then come back to this after it's built up a history; when you've checked over the other usual suspects, come back to here and check out CPU Usage, if its too low, this could indicate a problem too as well as being very high.  Check out the memory usage.  Check out other jobs running on the server sorting  by CPU and CPUTime; add this counter in.
  • Are there any running SQL Server jobs hogging resource, or causing blocking, sort by status.  Run msdb..sp_help_jobhistory.  If there are a number of jobs, this could well be the culprit.
  • Check the space in the databases, maybe run DBCC UPDATEUSAGE if none reported or you dont regularly run this, check the free space of the data files as well as the transaction logs.  Is the Log too big?  Has backing up of the log stopped for some reason, this does affect performance.
  • Check the space on the physical drives; are any unreasonably low?  run master..xp_fixeddrives
  • Check the blocking?  Find the leadblocker?  Check sysprocesses or EM, is there anything suspicious in there, look for status like runnable and check for non-runnables too.  Look at the application name.  Are there any ad-hoc users connecting doing inordinate amount of work, check out cpu use and disk use.
  • Run Profiler using a template or standard - duration? how long are the queries taking?  Are there too many or not enough?
  • Check the SQL Server configuration, Is SQL using the correct amount of memory?  Check parallelism, priority boost and max worker threads.  Run sp_configure, has someone been changing configuration options, do they look normal? 
  • Are the database settings correct? Is Auto-Shrink enabled?  If so turn this off.
  • Check the statistics are up-to-date, is it being re-indexed as well, check the jobs that do this, are they running now causing problems?
  • If clustered server check cluadmin .  Are all the instances on the same node, any problems reported? Is the live db on the secondary node?
  • Check batch transactions per second in System Monitor, how busy is the SQL Server? 
  • Check all the Services running, are there rogue processes on the server?
  • Check EventVwr; anything suspicious recently matching the time of the performance issues.
  • Check SQL Server Error Logs
  • Fire up System Monitor or Perfmon, check CPU, Memory, DISK and Network.  Save a standared set of perfmon counters.  Check against your baseline.
  • Can you run a query on the server?
  • Can you query the server from a client?  Are the ODBC or OLE-DB connections working?
  • Is the Service Pack level up-to-date, OS and SQL?
  • When was the server last rebooted?  in cmd, run Net statistics server.  Does the server need re-booting?
  • Is the server set to Named Pipes, instead of TCP/IP?
  • Are there any internal audit tables holding performance stats or application functions chosen, query these.
  • Has someone recently done an upgrade to OS/SQL or application?

Any more for any more?  I'll add your name if you have a counter argument to any of the above or a new one to check.  I guess I've only just scratched the surface.

 Contributors;

  • John Parker
  • Richard Smith

Comments

# re: ER - She canny no take it Captain!

08 October 2007 22:07 by anil

Hi John

Nice one, surely we can add more to document if we scratch our brains.

Thanks

Anil

# re: ER - She canny no take it Captain!

08 October 2007 22:18 by anil

Just rememberred ..

• Third party back up to the tape. As sometimes backup might overrun and may be backing up C drive of the database server causing  server to respond slowly.

• Another trick is copying the database to disk and then copying to tape from the copy. This cuts the time the primary database is slowed down by backing up

Please comment

Anil