July 2007 - Posts

A transaction log is broken up into VLFs (Virtual Log Files, not 'Fragments' as I originally stated - thanks Kalen). I'll call them fragments from here on so they don't get confused with the log file itself. They vary in size, and in SQL Server 2000 the number and size of them can have a big impact on the performance of various operations that involve the transaction log; specifically here I talk about triggers and log backups. In SQL 2005 they appear to have fixed the performance problem that exists in SQL 2000.

 

By default autogrow for SQL 2000 is 10% and the initial log size is 1MB, which means that if your transaction log starts to grow you will end up with a lot of VLFs. You can use the DBCC LOGINFO command to see how many log fragments you have.
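For example, run it in the context of the database you are interested in (MyDatabase here is just a placeholder):

USE MyDatabase   -- hypothetical name; substitute your own database
GO
DBCC LOGINFO     -- returns one row per VLF, so the row count is the number of fragments
GO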

 

The script below can be used to highlight the performance difference caused by an increased number of log fragments; note that in a real system the performance differential will probably be higher because of concurrent access to the transaction log.

 

Here are the results I received; I'm only going to show SQL 2000 because on 2005 there was no difference:

 

                            Many VLFs      Few VLFs

Backup Transaction Log:     27 seconds     22 seconds

Update with Triggers:       98 seconds     28 seconds

 

As you can see, even on this small amount of data and with a single user, there is a big difference in performance, especially trigger performance.

 

My recommendation to anybody still on SQL 2000 is to a) size your log properly, which means according to your load (you should have a figure for this already), and b) fix autogrowth: don't use 10%, use something sensible.
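For example, to resize the log and switch to a fixed growth increment (a sketch only; the database name, logical file name and sizes here are illustrative, pick values that suit your own load):

ALTER DATABASE MyDatabase
    MODIFY FILE ( NAME = N'MyDatabase_log', SIZE = 1GB, FILEGROWTH = 100MB )
GO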

 

One thing to watch on both SQL 2000 and SQL 2005, which makes it even more important to size your log properly so you don't suffer autogrowth, is that while the log is 'growing' it is essentially locked; any processes trying to do insert/update/delete activity will block until the growth has completed. Yes, this is true for SQL 2005 as well, because Instant File Initialisation does not work for the log file - it must initialise the log structure.

 

If you find you have many VLFs then you can fix the problem by doing this (it may take a few goes because of the placement of the active portion of the log, which is denoted by a status of 2 in the results of DBCC LOGINFO)...

 

BACKUP LOG...

DBCC SHRINKFILE( 2 )
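Using the VLFDemo database created by the script below, a full pass would look something like this (file id 2 is the log file; run DBCC LOGINFO afterwards to check how many VLFs remain):

BACKUP LOG VLFDemo TO DISK = 'C:\VLFDemo.TRN' WITH INIT
GO
DBCC SHRINKFILE( 2 )
GO
DBCC LOGINFO
GO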

 

Now – size your log properly.

 

Although this can be done whenever, it's best done when nothing is using the log.

 

BTW, one thing I nearly forgot to mention: with autogrowth you can also get OS file fragmentation, so one log file can be scattered across your disk in hundreds of places and out of order, which can be bad for something that does sequential processing! Another good reason to size your log properly to start with.

 

use master

GO

IF DB_ID('VLFDemo') IS NOT NULL
    DROP DATABASE VLFDemo

GO

 

CREATE DATABASE [VLFDemo] ON  PRIMARY

      ( NAME = N'VLFDemo',

        FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\VLFDemo.mdf' ,

        SIZE = 10MB ,

        FILEGROWTH = 10%)

 LOG ON

      ( NAME = N'VLFDemo_log',

        FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\VLFDemo_log.ldf' ,

        SIZE = 1MB ,

        FILEGROWTH = 1024KB )

GO

--CREATE DATABASE [VLFDemo] ON  PRIMARY

--    ( NAME = N'VLFDemo',

--      FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\VLFDemo.mdf' ,

--      SIZE = 300MB ,

--      FILEGROWTH = 10%)

-- LOG ON

--    ( NAME = N'VLFDemo_log',

--      FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\VLFDemo_log.ldf' ,

--      SIZE = 1GB ,

--      FILEGROWTH = 10MB )

--GO

 

ALTER DATABASE VLFDemo SET RECOVERY FULL

GO

 

BACKUP DATABASE VLFDemo TO DISK = 'C:\VLFDemo.BAK' WITH INIT

GO

 

USE VLFDemo

GO

 

--    Show VLF's

DBCC LOGINFO

GO

 

--    Create some work

--DROP TABLE my_table

--GO

 

CREATE TABLE my_table (

      id                int               not null identity primary key clustered,

      some_data   char(1024)  not null

)

GO

 

SET NOCOUNT ON

DECLARE @i int

SET @i = 1

 

WHILE @i <= 200000

BEGIN

      INSERT my_table ( some_data ) VALUES( CAST( @i AS char(1024) ) )

      SET @i = @i + 1

 

END

GO

 

DBCC LOGINFO

GO

 

BACKUP LOG VLFDemo TO DISK = 'C:\VLFDemo.TRN' WITH INIT

GO

 

CREATE TRIGGER trg_my_table_modify ON my_table FOR UPDATE

AS

BEGIN

      UPDATE mt

            SET some_data = RTRIM( d.some_data ) + RTRIM( i.some_data )

      FROM deleted d

            INNER JOIN inserted i ON i.id = d.id

            INNER JOIN my_table mt ON mt.id = i.id

 

END

GO

 

SET NOCOUNT ON

DECLARE @i int

SET @i = 1

DECLARE @now datetime

SET @now = CURRENT_TIMESTAMP

 

BEGIN TRAN

 

WHILE @i <= 200000

BEGIN

 

      IF @i % 100 = 0

      BEGIN

            COMMIT TRAN

            BEGIN TRAN

 

      END

 

      UPDATE my_table

            SET some_data = 'X'

      WHERE id = @i

 

      SET @i = @i + 1

 

END

 

COMMIT TRAN

 

PRINT DATEDIFF( second, @now, CURRENT_TIMESTAMP )

GO

 

Slightly mis-titled, because whilst this is a great resource for experienced SQL Server folks like myself, it's also really good for folks with a more modest skillset. The explanations and examples are really good. Both Hugo Kornelis and Lara Rubbelke have contributed to this great resource - I recommend this one for your shelf, or hard drive in my case, because you can get it as an ebook (http://www.apress.com/book/bookDisplay.html?bID=10220).

Security is my least favourite topic in SQL Server, even the word makes me want to sleep...... what was I saying, I seemed to drop off then. There is a really good chapter on the new security stuff in 2005, and another great chapter on Trees, Hierarchies and Graphs.

But the topics are more tuned to everyday development, the problems we face in the here and now. Whatever your level, you will find this book useful.

I don't often comment on books because I seldom have time to read, and as my wife would say - there aren't enough pictures in technical books, but this one is worth mentioning.

Tony.

 

Below are links to the PPT deck, ASP.NET example for Query Notifications and SQL stuff for the presentation I've just given at the Vertex Conference in Cheltenham.

Cheltenham 17 July 2007 - ASP.NET Query Notifications Example

Cheltenham 20070719 - PPT Deck used, Hints / Tips + Query/Event Notifications

SQL Demos for Cheltenham on 17 Jul 2007

Session Abstracts ->

Before SQL Server 2005 we had to poll SQL Server for state changes; now we have Notifications (Event and Query), which are probably some of the biggest but most under-used features in the product. In Part 1 of the presentation I show you what they are and how to use them.

Part 2 is a more general SQL session on an important and little understood part of the product - the execution engine (how stuff gets executed); that will be followed by a hints and tips and general Q & A session.

 

Hi all,

Thanks to all those who attended last night's User Group event in Reading, with presentations from Kevin Kline and Iain Kick.

Kevin's slides can be downloaded here: http://www.sqlserverfaq.com/controls/kbase/store/KB_40_KevinKlineUKUGReading20070717.ppt

Thanks go to Microsoft and Quest for supporting this event.

Some good issues cropped up which I'll try and get round to blogging about -

  • Which is best? A log with lots of VLF's or one with few?
  • Sizing the MDF / LDF's and autogrowth.
  • Proportional Fill Algorithm.

Thanks again to Kevin and Iain for great presentations.

The trip home was a long one, 3 sets of single lane restrictions on the M25 and M1; luckily I had Simon Sabin for company and we got chatting - we want to start the briefings I used to do through LiveMeeting again, so look out for these in August; we've got some really good other ideas too. Also, pencil in 6th Oct for our SQLBits conference, which you'll hear about soon.

Tony.

Kevin's blog: http://sqlblog.com/blogs/kevin_kline/default.aspx

 

Posted 18 July 2007 13:06 by tonyrogerson | with no comments

Colin Leversuch Roberts has posted up a really useful (and big) list of queries, advice and notes which is a must read for anybody who wants or needs to better understand their own indexing strategy and usage. It's based around his experience with really big volume/size tables on SQL 2005.

For a good read, see: http://sqlblogcasts.com/blogs/grumpyolddba/archive/2007/07/12/analysing-indexes-part-1.aspx

Tony.

On occasion you will need to prioritise the ordering; for instance, if you are ordering on a CompanyName column that is NULLable or can be an empty string, then ORDER BY CompanyName will give you all the NULLs and empty strings first, which might not be what you want. Other stuff you might want to do is a dynamic ORDER BY on a column, or ASC/DESC based on a parameter to a stored procedure.

Anyway, a way round this:

select companyname

from registrations

where country = 'GB'

order by case when companyname = '' then 1 else 0 end, companyname

It's simple really, the CASE expression provides logic that alters the ordering.
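If the column can also be NULL, the same trick covers both cases - a minimal variation on the query above:

select companyname
from registrations
where country = 'GB'
order by case when companyname is null or companyname = '' then 1 else 0 end, companyname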

If you are making things more complex, for example changing the order by according to a passed parameter...

declare @col_order sysname

 

set @col_order = 'idregistrations'

 

select top 100 companyname

from registrations

where country = 'GB'

order by case when @col_order = 'companyname' then companyname else idregistrations end, country

The above works because the resulting data type of the CASE is bigint (idregistrations is bigint, and bigint takes precedence over varchar), and since @col_order is 'idregistrations' only the bigint branch is ever returned, so you are ordering by bigint values throughout.

declare @col_order sysname

 

set @col_order = 'companyname'

 

select top 100 companyname

from registrations

where country = 'GB'

order by case when @col_order = 'companyname' then companyname else idregistrations end, country

The above, with @col_order now set to 'companyname', does not work; it gives this error...

Msg 8114, Level 16, State 5, Line 6

Error converting data type varchar to bigint.

Why? Strange - it looks fine, doesn't it, and no values from idregistrations will ever be returned; however, the resulting data type of the CASE is still bigint because bigint takes precedence over varchar, so every companyname value has to be converted to bigint.

You are actually trying to do this..

select 'abc' + 12345


Msg 245, Level 16, State 1, Line 1

Syntax error converting the varchar value 'abc' to a column of data type int.

The fix is to either a) use dynamic SQL or b) write it like this...

declare @col_order sysname

 

set @col_order = 'companyname'

 

select top 100 companyname

from registrations

where country = 'GB'

order by case when @col_order = 'companyname' then companyname else null end,

         case when @col_order = 'idregistrations' then idregistrations else null end,

         country

You can also add ASC and DESC ....

declare @col_order sysname

 

set @col_order = 'asc'

 

select top 100 companyname

from registrations

where country = 'GB'

order by case when @col_order = 'asc' then companyname else null end ASC,

         case when @col_order = 'desc' then companyname else null end DESC,

         country

The power of CASE is amazing - make sure you know it inside out!

 

Registration is at 5.30pm, the evening will commence at 6pm and finish at 9pm.

Agenda, abstracts and timings will be expanded on shortly.

These evening events are fantastic opportunities to meet and discuss problems/solutions with other like-minded SQL professionals, right the way through the skills range up to leading industry experts.

Places are limited so reserve your place now at http://sqlserverfaq.com/?eid=98.

Kevin Kline

Quest Software

 

Manage the Growing World of SQL Server

With the evolution of IT, organisations are looking more and more to their SQL Server platform to support mission-critical applications and business intelligence.  This means an unprecedented volume of data is stored in SQL Server databases.

How are you dealing with the flood of data?

 

Join leading SQL Server expert and SQL Server MVP Kevin Kline at this evening seminar to discuss the implications of managing the ever-growing volume of data in SQL Server and

get recommendations on ensuring availability and performance in your environment.

 

 

Iain Kick

Quest Software

 

Do you know how change affects your database performance?

Do you know when poor performance occurs on your SQL Server server?

 

When someone in IT makes infrastructure changes, you don’t always know about them.  So if these changes affect performance of your database or SQL Server server, how do you start fixing them?

 

Quest Software can help.  Its new solution, Performance Analysis, provides deep diagnostics and intelligent analysis of the SQL Server database.  Iain will cover the key features of this great tool.

Posted 09 July 2007 00:11 by tonyrogerson | with no comments

Two of the biggest memory eaters in SQL Server are the data cache and the procedure cache. The data cache offers us the advantage of hopefully not having to go to the disks to get our data because we already did that and it's cached. One of the problems on real systems is that the procedure cache can grow really big and take memory away from the data cache, resulting in you having to go to disk more. One example is a client of mine I do support for: they are on SQL 2000 Standard, so that limits our memory to 1.7GB, and the procedure cache alone was taking in the region of 700MB. You can't directly control how big the procedure cache grows, but by good application design we can influence it.

How do we influence procedure cache growth? The answer is simple: we use parameterisation and/or stored procedures.

Let’s look at two examples; they need to be run on a SQL Server that nobody else is using so as not to influence the results.

dbcc freeproccache

go


declare @t datetime

set @t = getdate()

 

set nocount on

 

declare @i int

declare @f int

declare @sql nvarchar(max)

set @i = 0

 

while @i <= 100000

begin

    set @sql = 'select @f = count(*) from sys.objects where object_id = ' + cast( @i as varchar(10) )

    exec sp_executesql @sql, N'@f int output', @f output

 

    set @i = @i + 1

 

end

 

print datediff( second, @t, current_timestamp )

Essentially every query I execute is different because I’m building my SQL string using constants rather than using parameters.

Let’s take our memory readings:

dbcc memorystatus

CACHESTORE_SQLCP (Total)     KB

---------------------------- --------------------

SinglePage Allocator         1178608

 

Procedure Cache                Value

------------------------------ -----------

TotalProcs                     18286

TotalPages                     148794

InUsePages                     41

Buffer Distribution            Buffers

------------------------------ -----------

Stolen                         2592
Free                           7191
Cached                         150151
Database (clean)               101300
Database (dirty)               910
I/O                            0
Latched                        0

In summary, the script took 371 seconds to run, the procedure cache is 1.1GBytes, there are 18,286 plans in cache, and there are 7191 pages (57Mbytes) free for the data cache.

Let’s do the same test (logically) but parameterise it.

dbcc freeproccache

go

 

declare @t datetime

set @t = getdate()

 

set nocount on

 

declare @i int

declare @f int

declare @sql nvarchar(max)

set @i = 0

 

set @sql = 'select @f = count(*) from sys.objects where object_id = @i'

 

while @i <= 100000

begin

    exec sp_executesql @sql, N'@f int output, @i int', @f output, @i

 

    set @i = @i + 1

 

end

 

print datediff( second, @t, current_timestamp )

go

Use DBCC MEMORYSTATUS to take the readings again...

CACHESTORE_SQLCP (Total) KB

------------------------ --------------------

SinglePage Allocator     768

 

Buffer Distribution            Buffers

------------------------------ -----------

Stolen                         938

Free                           157450

Cached                         1546

Database (clean)               101300

Database (dirty)               910

I/O                            0

Latched                        0

 

Procedure Cache                Value

------------------------------ -----------

TotalProcs                     15

TotalPages                     183

InUsePages                     13

In summary, the script took 5 seconds to run, the procedure cache is 768Kbytes, there are 15 plans in cache, and there are 157450 pages (1.2GBytes) free for the data cache.

Surprised? It's one of the reasons DBAs keep bleating on about using stored procedures: it forces the mindset to use the procedure cache more effectively. By parameterising, or preferably by using stored procedures, we get plan reuse, which means a) fewer compiles, thereby reducing CPU load, b) more available pages for the data cache, thereby reducing physical disk IO, and c) DBAs are always right ;).
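For completeness, the stored procedure equivalent of the parameterised test would look something like this (a sketch only, the procedure name is made up); it compiles once and the single plan is reused on every call:

create proc dbo.usp_count_object
    @i int,
    @f int output
as
    select @f = count(*) from sys.objects where object_id = @i
go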

Check out your own procedure cache size and see how big it is in proportion to the data cache.

How do you find these non-parameterised queries? You can look in master..syscacheobjects.

select *

from master..syscacheobjects

 

For our last test, the parameterised one, you will find a single Compiled Plan with a usecounts of 100,001; whereas if you check the first version you will see thousands of ad hoc Compiled Plans, each with a usecounts of 1.
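To narrow that down, a query along these lines lists the single-use ad hoc plans, which are the biggest cache wasters (I'm quoting the column names from memory, so check them against your own syscacheobjects output):

select cacheobjtype, objtype, usecounts, pagesused, sql
from master..syscacheobjects
where objtype = 'Adhoc'
  and usecounts = 1
order by pagesused desc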

Another method is to use profiler and see what queries are embedded or dynamically built and not parameterised.

We are stuck with this design decision at the moment: you can't control the size of the procedure cache, you can only influence it with application design. In a real system it's not going to be as clear cut as my example; in reality the size of the data cache and procedure cache will depend on what is going on. For instance, a lot of those one-off plans would have been kicked out of cache and their pages reused for data, but that all takes resource!

 

In ANSI 92 there are two styles of writing SQL: there is what is commonly referred to as the old or ANSI 89 style, which is where you put the join criteria in the WHERE clause; the other style is referred to as either ANSI 92 or the new style, and it is where you put the join criteria in the FROM clause using INNER JOIN, OUTER JOIN etc.

Coming fresh from yet another battle with that Joe Celko bloke from the public news groups (people who worship him should read no further for I am about to blaspheme) I thought I’d write up this entry. It appears that anybody who writes using the new ANSI 92 style is now a ‘Cowboy Coder’. Wow, well – I guess that means the majority of the developers in IT, me, a few world experts, anybody who has written a book on SQL Server and the doc writers of Books Online are all Cowboy Coders. Thanks to Itzik Ben Gan who has obviously too much time on his hands....

It all starts with Best Practice; in Books Online it clearly states that the recommended style is to use the new ANSI 92. Apparently in DB2 there are some optimisation problems with using the new style (yet to be confirmed, I’m sceptical).

Celko states that “Best Practice is not local, but global”, oh, and apparently I missed the “entire Software Engineering and Structured Programming revolution”, oh, and I'm a Barbarian....

Think about that statement, it's at odds with a MS recommendation and how we all work; Celko’s personal best practice is to use the old ANSI 89 style, but the MS SQL Server best practice is to use the new ANSI 92 style; what do you choose? Apparently choosing the MS SQL Server best practice, you know – that product we all code against would relegate us to being a “Cowboy Coder” for using “local and immediate solutions”.

Oh please, what is this guy on?

Best practice being a global thing is actually right, but local recommendations need to override because they have come from the vendor who made the product and it’s a best practice or recommendation. In life it’s the same, local bylaws change global laws.

Ok, why am I so darn angry about this guy and his stance? Well, unfortunately he's one of those internet bully blokes who likes to dismiss and put down newbies or people who at times struggle with SQL; but it's more fundamental than that: the majority of folks use the ANSI 92 new style of coding, and in fact some devs will never have seen the old syntax - the new syntax was introduced in SQL Server 6.0 many moons ago. Like many, on seeing the old syntax I always rewrite it to the new to clearly understand what is going on (it usually takes a while too!) - the logic behind the query, the join relationships and so on. In the old syntax the logic was buried in the WHERE clause; remember the days of forgetting the join condition and causing a Cartesian product?

Anyway, let’s have a look at some styles....

Old ANSI 89 style would go something like this...

select *

from sys.objects as o,

     sys.columns as c,

     sys.sql_modules as sm

where o.object_id = c.object_id

  and sm.object_id = c.object_id

  and o.type_desc = 'VIEW'

The join criteria is mixed up in the WHERE clause, what you are actually doing is a cross join query and then restricting the results.

The equivalent new style ANSI 92 query is below

select *

from sys.objects as o

    inner join sys.columns as c on o.object_id = c.object_id

    inner join sys.sql_modules as sm on sm.object_id = c.object_id

where o.type_desc = 'VIEW'

 

 

The join criteria is distinctly separate from the filter, relationships are much clearer.

Formatting is another aspect that is key here...

select *

from sys.objects as o

inner join

sys.columns as c

on o.object_id = c.object_id

inner join

sys.sql_modules as sm

on sm.object_id = c.object_id

where o.type_desc = 'VIEW'

When arguing the syntax you often see Celko format the new style like the above; frankly it's totally unreadable, so I guess that's where his real problem with migrating to the new syntax lies.

I mean, would any of us format like this (sorry, I actually know a few who do – sorry folks, feel free to comment – I’m being adversarial)?

select *

from sys.objects as o,sys.columns as c,sys.sql_modules as sm

where

o.object_id = c.object_id

and

sm.object_id = c.object_id

and

o.type_desc = 'VIEW'

Again,
unreadable, it’s
the equivalent
of me
writing this
sentence
like I’ve
just done.

Crazy!

But one of the real problems with the old style syntax is that people were forever forgetting to put all the join conditions in the WHERE clause, so we ended up with Cartesian products...

select *

from sys.objects as o,sys.columns as c,sys.sql_modules as sm

where o.object_id = c.object_id

  and o.type_desc = 'VIEW'

I can’t do that with the new syntax, I’d need to mix old and new...

select *

from sys.objects as o

    inner join sys.columns as c on o.object_id = c.object_id

    inner join sys.sql_modules as sm

where o.type_desc = 'VIEW'

The above gives an incorrect syntax error.

Another problem (thanks SQL Menace) is that if you write your query using the old syntax and you then need to make the inner join an OUTER JOIN you need to rewrite to the new style anyway.
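Under the new style, changing the join type really is a one-word edit; for example, to make the sys.sql_modules join from the query above an outer join:

select *
from sys.objects as o
    inner join sys.columns as c on o.object_id = c.object_id
    left outer join sys.sql_modules as sm on sm.object_id = c.object_id
where o.type_desc = 'VIEW'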

To sum up, really you should all be using the new style. I know it's a new style if you have come over from Oracle, because I think it's only been a couple of years that this syntax has been in that product, but learn it and get used to it; I remember I truly hated the new syntax until I got my head round it, and now I really see the benefits.

And as for Celko, well, I think somebody coined a good phrase that is very appropriate – I think Celko is suffering from Unixitis which is a belief that something is better because it is more obscure and complex, thus making you morally superior because you understand it.

 

Leo Pasta gives a good tip on using RAISERROR and NOWAIT; in addition to this, in SQL 2005 we can trap the 'User Error Message' EventClass both in SQL Profiler and using Event Notifications.

Here's how...

  1. Open Management Studio
  2. New Query and make a note of @@SPID
  3. Open SQL Profiler
  4. Select Blank trace template
  5. Select 'User Error Message' EventClass from Errors and Warnings.
  6. Filter on SPID = the @@SPID from (2) above
  7. Execute your RAISERROR..

    raiserror( 'hello there', 0, 1 ) with nowait

  8. That will now appear in Profiler!

 

Is that cool or what? I actually use this to good effect at a client; I've set it up using Event Notifications, so basically any error over a certain level I receive via email, meaning when one of the developers (you know who you are Darren ;)) misspells an object name I know about it. It's really useful, think of the power of that!

Note, the User Error Message event isn't limited to RAISERROR; it's not trapping that, it's trapping the message, so when you get duplicate key messages etc. you'll get those too.
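As an aside, the classic use of the NOWAIT option is getting progress messages out of a long-running batch as they happen rather than when the batch finishes; a minimal sketch, where the WAITFOR just stands in for real work:

set nocount on
declare @i int
set @i = 1
while @i <= 5
begin
    waitfor delay '00:00:01'    -- simulate a chunk of work
    raiserror( 'processed chunk %d of 5', 0, 1, @i ) with nowait    -- sent to the client immediately
    set @i = @i + 1
end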

Good post Leo!