June 2006 - Posts

There are times when we are faced with multiple concurrent connections (running something at the same time against the same tables) and some of the connections may need to insert the same data (same key or unique index). The business logic would be, if the row doesn't already exist then insert it otherwise get the surrogate key id or something similar.

Our table for the examples to follow :-

CREATE TABLE Users (
   logon_name varchar(50) not null
         constraint pk_Users PRIMARY KEY,
   some_data int not null
)

Now our logic:-

declare @one_to_add varchar(50)
set @one_to_add = 'Trevor Dwyer'

BEGIN TRAN

IF NOT EXISTS (
      SELECT
*
      FROM
Users
      WHERE logon_name = @one_to_add

   )
BEGIN
      INSERT Users ( logon_name
)
            VALUES( @one_to_add )

END

COMMIT TRAN

In the above example we check for the existance of the row in Users and if it doesn't already exist we insert it. Whats wrong with this?

The SELECT * FROM Users bit will take shared locks for the duration the query runs, the locks aren't held even though we are in a transaction, worse still - there is not a row to lock because we are probably going to be inserting it.

Now to some tests, in order to provide a realistic test we must account for connections that manage to run the NOT EXISTS at the same time so get the same result (the row doesn't exist) and therefore both try the INSERT - because there is no blocking locking then that can happen, but you will only notice it once you ramp volumes up! To simulate this condition we must use a WAITFOR DELAY, saves us reproducing 'x' number of concurrent connections...

Open up two query windows and run the following SQL in each.

declare @one_to_add varchar(50)
set @one_to_add = 'Trevor Dwyer'

BEGIN TRAN

IF NOT EXISTS (
      SELECT
*
      FROM
Users
      WHERE logon_name = @one_to_add

   )
BEGIN

      WAITFOR DELAY '00:00:30'

      INSERT Users ( logon_name )
            VALUES( @one_to_add )

END

COMMIT TRAN

TRUNCATE TABLE Users

One of the connections will fail with the error:-

Msg 2627, Level 14, State 1, Line 16

Violation of PRIMARY KEY constraint 'pk_Users'. Cannot insert duplicate key in object 'dbo.Users'.

Oh dear! Now, unless you had done a thorough scalability test with lots of concurrent users you would not have trapped that 'timebomb' problem.

So, lets look at how we can fix this concurrency problem.

Remember, the problem occurs when the row does not exist and because of the volume of connections trying to insert the same key / unique index values. Essentially we need to serialise the INSERT into the table, two options for creating blocking locks are XLOCK and HOLDLOCK locking hints; the XLOCK takes an exclusive lock on the row, actually XLOCK is no good in this situation because the row doesn't exist so how can you lock a row that doesn't already exist! Ok, what about HOLDLOCK? Well, HOLDLOCK will take a range lock so that's fine, but what happens?

Run the SQL again in two connections (notice we use HOLDLOCK) and check the output from sp_lock.

declare @one_to_add varchar(50)
set @one_to_add = 'Trevor Dwyer'

BEGIN TRAN

IF NOT EXISTS (
      SELECT
*
      FROM
Users WITH ( HOLDLOCK )
      WHERE logon_name = @one_to_add

   )
BEGIN
      INSERT Users ( logon_name
)
            VALUES( @one_to_add )

END

COMMIT TRAN

Anyway, one of the connections will fail, this time with a worse error :-

Msg 1205, Level 13, State 48, Line 14

Transaction (Process ID 57) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

Ouch! Bit more dramatic than the violation of PRIMARY KEY error, so HOLDLOCK is not the way forward on this either.

What have we left? Remember - we need to serialise the INSERT; the transaction isolation level has no affect here either, the only way I've found to get round this problem is to use sp_getapplock.

The sp_getapplock stored procedure allows us to use our own locking, we get a choice of locking types, we want Exclusive. The script now becomes...

** Remember the WAITFOR is simply there to simulate concurrent users entering the code path before the INSERT.

declare @one_to_add varchar(50)
set @one_to_add = 'Trevor Dwyer'

BEGIN TRAN

IF NOT EXISTS (
   
SELECT
*
   FROM
Users
   
WHERE logon_name =
@one_to_add
)

BEGIN
   WAITFOR DELAY '00:00:30'

   DECLARE @result int
   EXEC @result = sp_getapplock @Resource = 'Unique Label for this INSERT', @LockMode = 'Exclusive'

   IF @result NOT IN ( 0, 1 )   -- Only successful return codes
   BEGIN
      PRINT
@result
      RAISERROR ( 'Lock failed to acquire.', 16, 1
)
   END
 
   ELSE
   BEGIN
      IF NOT EXISTS
(
         SELECT
*
         FROM
Users
         WHERE logon_name =
@one_to_add
         )

         INSERT Users ( logon_name, some_data )
            VALUES( @one_to_add, 1 )

      EXEC @result = sp_releaseapplock @Resource = 'Unique Label for this INSERT'

   END

END

COMMIT TRAN

TRUNCATE TABLE Users

We need the second NOT EXISTS to catch the occaison where by the time we got into this code path but before we could take the application lock somebody has already been in and inserted the row.

Remember the lock is still part of your transaction so if you expand the logic out you may end up shooting yourself in the foot and creating deadlocks, for example - put the HOLDLOCK hint back on the NOT EXISTS query and you'll see what I mean, the getapplock returns a -3 which indicates it couldn't acquire the lock because its been chosen as a deadlock victim.

Just a note, sp_getapplock must be called inside an open transaction, if there is no transaction active then it returns an error.

Another note, and thanks to Chris for this - a return code of 0 or 1 means a successful lock acquired, I was only checking for 0 - the post has been modified.

Having spent around half an hour of my life on the phone to Vodafone technical support only to be passed onto consumer sales to have my contract upgraded when I already have 3G data - jesus, the quality arrrrr....

Anyway, found this excellant site (eventually after many hours of trying to work this out): http://www.filesaveas.com/gprs.html.

It lists all the major phone company connection settings for GPRS.

For the QTEK 9000 the 3g/gprs connection settings are, da da....

   Modem: Cellular Line (GPRS, 3G)
   Access Point Name: internet
   User name: web
   Password: web
   Domain: 212.183.137.12
   Advanced (all default settings)

Oh, and on the GPRS- Setting thing in Settings it should be PAP.

Now to get Exchange push working :)

The 'SQL standard' way of doing it...

SELECT *
FROM INFORMATION_SCHEMA.
ROUTINES
WHERE ROUTINE_TYPE = 'PROCEDURE'

SELECT *
FROM INFORMATION_SCHEMA.
ROUTINES
WHERE ROUTINE_TYPE = 'FUNCTION'

SELECT *
FROM INFORMATION_SCHEMA.
TABLES
WHERE TABLE_TYPE = 'BASE TABLE'

SELECT *
FROM INFORMATION_SCHEMA.
TABLES
WHERE TABLE_TYPE = 'VIEW'

The SQL Server 2000 way of doing it...

SELECT * -- Stored Procs
FROM
sysobjects
WHERE objectproperty( id, N'IsMSShipped' ) =
0
  AND objectproperty( id, N'IsProcedure' ) = 1

SELECT * -- Functions
FROM sysobjects
WHERE objectproperty( id, N'IsMSShipped' ) = 0
  AND ( objectproperty( id, N'IsTableFunction' ) = 1
     OR objectproperty( id, N'IsScalarFunction' ) = 1 )

SELECT * -- User tables
FROM sysobjects
WHERE objectproperty( id, N'IsMSShipped' ) = 0
  AND objectproperty( id, N'IsTable' ) = 1

SELECT * -- Views
FROM sysobjects
WHERE objectproperty( id, N'IsMSShipped' ) = 0
  AND objectproperty( id, N'IsView' ) = 1

The SQL Server 2005 way of doing it is the same as 2000 but we use schemas more, sys.objects instead of sysobjects, oh and object_id and not id...

SELECT * -- Stored Procs
FROM sys.objects
WHERE objectproperty( object_id, N'IsMSShipped' ) = 0
  AND objectproperty( object_id, N'IsProcedure' ) = 1

SELECT * -- Functions
FROM sys.objects
WHERE objectproperty( object_id, N'IsMSShipped' ) = 0
  AND ( objectproperty( object_id, N'IsTableFunction' ) = 1
     OR objectproperty( object_id, N'IsScalarFunction' ) = 1 )

SELECT * -- User tables
FROM sys.objects
WHERE objectproperty( object_id, N'IsMSShipped' ) = 0
  AND objectproperty( object_id, N'IsTable' ) = 1

SELECT * -- Views
FROM sys.objects
WHERE objectproperty( object_id, N'IsMSShipped' ) = 0
  AND objectproperty( object_id, N'IsView' ) = 1

So, which one do I prefer; well its certainly not the INFORMATION_SCHEMA views because they don't scale; if you need to use this type of thing inside stored procedures and performance is a factor then use sysobjects or sys.objects!

Often people forget or misunderstand the difference between UNION and the UNION ALL keywords in a query.

UNION ALL

SELECT NOW = GETDATE()

UNION ALL

SELECT NOW = GETDATE()

This gives us the entire 'set' of data, both queries are executed, note, they are executed serially and not in parallel, the order the queries are executed is not determined; so, you never get a situation where the top query is parallelised with the bottom query; the query components themselves may be executed in parallel and combined as the last step.

Output (notice there are two rows that are identical, this is because GETDATE() gives a consistent value across the entire query rather than consistency at each query within the UNION construct.

NOW
-----------------------
2006-06-29 08:01:48.937
2006-06-29 08:01:48.937

Say you don't want duplicates, well that's where the [ALL] option on the UNION came in, [ALL] gives all rows from all queries combined, whereas leaving the [ALL] out will do a distinct as the final stage on the result of all the sub-queries:-

UNION

SELECT NOW = GETDATE()

UNION

SELECT NOW = GETDATE()

Output (notice there is just a single row now).

NOW
-----------------------
2006-06-29 08:04:37.107

There are a number of ways of doing a DISTINCT on the final results set and it depends what query you are executing, some ways the optimiser uses are Merge Join (UNION), Stream Aggregate (Aggregate) with a preceeding Sort step.

ORDER BY

The ORDER BY only operates on the outer most query, for instance this looks like you are ordering on the final sub-query, and yes - it does look confusing:-

SELECT NOW = GETDATE()

UNION ALL

SELECT NOW = GETDATE()

ORDER BY NOW   -- This relates to the merged 'final' query

Using TOP

You almost always want to use ORDER BY and TOP together, however this does not work:-

SELECT TOP 1 name
FROM
sysobjects
ORDER BY name

UNION ALL

SELECT name
FROM sysobjects

ORDER BY name

It gives the error message:

Msg 156, Level 15, State 1, Line 4

Incorrect syntax near the keyword 'UNION'.

To get round this we can use what is called a derived table, a derived table can be thought of as a materialised temporary table that has no joins with the outer query (like a sub-query has).

SELECT name
FROM
(
   SELECT TOP
1 name
   
FROM sysobjects
   ORDER BY name
   ) AS d (name
)

UNION ALL

SELECT name
FROM sysobjects

ORDER BY name

The derived table bit is shown below, basically you can think of it being executed in isolation from the rest of the query, a bit like using SELECT .. INTO #T and uses #T... but the great thing about derived tables is that the optimiser understands what you are trying to do and may give a better query plan than if you'd have created a #T or table variable.

(
   SELECT TOP
1 name
   
FROM sysobjects
   ORDER BY name
   ) AS d (name )

Using Derived Tables

Say you want to do a TOP or further query the results of your UNION query, you can do this by wrapping the UNION into a derived table and do your query on the derived table :-

SELECT TOP 10 name
FROM (
   SELECT name
   FROM (
         SELECT TOP 1 name
         FROM sysobjects
         ORDER BY name
         ) AS d (name )

   UNION ALL

   SELECT name
   FROM sysobjects
   ) AS final_d

WHERE name LIKE 's%'
ORDER BY name

Remember what a derived table is - think of it as a materialsed temporary table within your query.

Getting cool! So, this opens up a lot of power for creating a highly scalable and performant system, you can in some instances combine a lot of temporary table work into a single statement.

You can now see other stuff, you can use GROUP BY, anything you could possibly want, another example...

SELECT type, COUNT(*), MAX( name ), MIN( name )
FROM
(
   SELECT name,
type
   FROM
(
      
SELECT TOP 1 name,
type
      FROM
sysobjects
      ORDER BY
name
      ) AS d ( name, type )

   UNION ALL

   SELECT name, type
   FROM sysobjects

   ) AS final_d

WHERE name LIKE 's%'
GROUP BY type

Having spent a morning catching up on invoicing I decided to make better use of my dead travelling time and purchase a PDA phone so I can enter my billing on the way home and have reporting services email me an invoice ready to send out to my client.

Anyway, the point of the post; look at this bizzare routing for my parcel which is coming from Manchester to Harpenden (5 miles from Luton airport)...

Date   Time   Location Service Area   Checkpoint Details

June 27, 2006   17:50   Manchester - UK   Shipment picked up

June 27, 2006   19:16   Manchester - UK   Departing origin

June 27, 2006   19:40   Manchester - UK   Departed from DHL facility in Manchester - UK

June 27, 2006   23:55   London-Heathrow - UK   Arrived at DHL facility in London-Heathrow - UK

June 28, 2006   06:45   London-Heathrow - UK   Departed from DHL facility in London-Heathrow - UK

June 28, 2006   07:30   Gatwick - UK   Arrived at DHL Facility

Talk about a trip, if only I was a plane spotter! The only airport its not been to is Stanstead, City and my local airport Luton, but there is time yet....

I also found it funny that it's gone from Heathrow down to Gatwick first as thats in completely the wrong direction.

Oh what fun....

When holding history in the same table there are times you need to 'end date' or deactivate the current row thereby making your history row.

This is difficult to do unless you are using a surrogate key that is based around an incremental counter for example using the IDENTITY property.

Here is an example of how to code it....

create table update_preceeding_row (
   
id int not null identity constraint sk_update_preceeding_row primary key clustered
,
   
somedata int not
null,
   
closed_yn char(1) null check( closed_yn in ( 'Y', 'N' )
),
   
closed_datetime datetime
null
)
go

create trigger trg_update_preceeding_row_close_previous_row
         on update_preceeding_row for
insert
as
begin

   if @@rowcount = 0
      return

   if @@rowcount > 1
   begin
      print
'Trigger only works for one row inserts'
      rollback tran
   
end

   update bt
      
set closed_yn = 'Y'
,
            
closed_datetime = getdate
()
   
from update_preceeding_row as
bt
   
where bt.id =
(
            
select max( bt2.id
)
            
from
inserted i
                     
inner join update_preceeding_row bt2 on bt2.id < i.
id
         )

end
go

insert update_preceeding_row ( somedata ) values( 1 )
select
*
from update_preceeding_row

insert update_preceeding_row ( somedata ) values( 2 )
select
*
from update_preceeding_row

insert update_preceeding_row ( somedata ) values( 3 )
select
*
from update_preceeding_row

drop table update_preceeding_row

How does this work? Lets look at the statement that is really doing the work...

   from update_preceeding_row as bt
   
where bt.id =
(
            
select max( bt2.id
)
            
from
inserted i
                     
inner join update_preceeding_row bt2 on bt2.id < i.
id
         )

For each row from the inserted table (a table materialised by SQL Server and contains the rows you are inserting or for an update the new values for the rows you are updating) we join to the base table and work out the preceeding ID, that is the row whose ID is less than our current working row when reading through the inserted table. We then pin point the row to update by joining back into the base table.

This technique only works because you know the order the rows where inserted, you could also use a datetime entry_date type column, but be wary, you may have duplicate rows and in which case you will probably get multiple rows in update_preceeding_row updated.

Note, I've updated this since first posting because David Portas correctly pointed out that an insert of multiple rows causes many rows to become active and also for many inserted rows how do you know which is the correct active row! May be in the future when I've time I'll revist the logic...

If you want to reindex all your tables in the database then this will do it...

declare objcur cursor for
   select
name
   from
sys.objects
   where type =
'u'
   order by
name

declare @obj sysname

open objcur

fetch next from objcur into @obj

while @@fetch_status = 0
begin
   print
@obj
   dbcc dbreindex( @obj )

   fetch next from objcur into @obj

end

deallocate objcur

Note, this won't necessarily mean that all your tables are contiguous in the database, SQL Server will use free extents within the database so you might actually do more harm then good but at least the index itself will be good, statistics for the optimiser up-to-date but you may have extent fragmentation. 

If like me you are used to truncating the log using BACKUP LOG <mydb> WITH TRUNCATE_ONLY or NO_LOG then you need to know that in a future version of SQL Server its being deprecated.

Instead you need to use the recovery model stuff:-

alter database pubs set recovery simple
go

checkpoint
go

alter database pubs set recovery full
go

backup database pubs to disk = 'c:\pubs.bak' with init
go

I understand the reason why they are deprecating this feature, basically when you use it the log chain is broken and you are exposed, but hey - what a useful feature it is within the development and release environment! One of these backward steps imho.

Ok, so while writing this I was having a mad moment and thought the CHECKPOINT stuff wasn't working but the space used reported by SQLPERF( LOGSPACE ) was throwing me, end of week and all that - well thats my excuse anyway; a better way to check is DBCC LOGINFO and looking for status = 2 which means that VLF is in use.

This week I've been tackling dumps of a different order, and if only I could get them straight to disk - lol! I probably will regret posting this once he's in his teens, but no doubt when I'm an old codger in a home and wear nappies myself then he'll have some fun at my expense too!

We started the week in the usual pull ups trainer nappies, and progressed to underpants - oh dear, poor carpet; anyway - Monday was a big accident day, some of Tuesday too, but Wednesday apartently he was dry all day (I was working in London on Wednesday), anyway, he's just done one in my office - that will teach me to take the eye off the ball, rather him! The trick seems to be not to mention having a wee but saying he should stay dry!

This morning was funny, well if you where a third party watching; i was helping a client quickly and my son was stood behind my chair in my office and suddenly faaaarrrrtttt.... oh dear, something's a foot or rather heading towards his foot; well, it wasn't that bad - lets just say I'm glad i hadn't had a drink last night.

The joys of parenthood, its been good, i suppose i'll join the groups of people you see occaisonally on the side of the road holding their kids in a squat position whilst peeing.....

The funniest or most worrying thing was a dream I had last night which I'm hoping is brought about by this whole pot training experience; basically I was at a Microsoft event, I remembered going to the toilet in the morning but later in the afternoon I suddenly realised I'd been walking around all day with no trousers on!

Anyway, when you next see me at a Microsoft event (funnily enough I've got an event of my own tonight) please oh please, if I'm not wearing any trousers please let me know....

 

 

   

My good friend Barry Dorrans IM'd me asking if I had a function to convert text into Camel Case but he found one before I could reply, anyway until 2005 we've had to pretty much resort to loops or cursors - but no longer! We can now take advantage of [TEXT()], PATH('') and FOR XML and do it in a single SELECT statement in a set based fashion!

We do need a sequence table for this but it can be a permanent table instead of the table variable i'm using just to show the action...

First we need our sequence table, again - probably best this being permanent as you don't want to build this each time...

set nocount on

declare @seq table (
   seq int not null primary
key
)

declare @i int
set
@i = 1
while @i <= 50
begin
   insert @seq values( @i
)
   
set @i = @i + 1

end

Now we can get on and do the logic, notice its really only a single statement, to break this down I will first demonstrate it using a local variable... 

declare @word varchar(100)

set @word = 'tony rogerson'

select camelcase = replace( word, '&#x20;', ' ' )
from
(
   select case when seq = 1 or substring( @word, seq-1, 1 ) = ' ' then upper( substring( @word, seq, 1 )
)
                     
else substring( @word, seq, 1 ) 
            end as [text()]
   from @seq
   where seq <= len( @word
)
   
for xml path( ''
)
      
) as t( word )

Thats it! If you have a table you want to CamelCase then we simply do this...

declare @names table (
   
word varchar(50) not
null

)

insert @names ( word ) values( 'tony rogerson' )
insert @names ( word ) values( 'barry dorrans'
)
insert @names ( word ) values( 'trevor dwyer'
)
insert @names ( word ) values( 'simon sabin' )

select camelcase = replace( 
         ( select case when seq = 1 or substring( n.word, seq-1, 1 ) = ' ' then upper( substring( n.word, seq, 1 )
)
                             
else substring( n.word, seq, 1 ) 
                     end as [text()]
            from @seq
            where seq <= len( n.word
)
            
for xml path( ''
)
            )
      
, '&#x20;', ' ' )

from @names n

So, what are we doing here? Well, we are taking advantage of the new [text()] feature of the FOR XML that allows values concatenation, unfortunetly it turns the single space into a control tag hence I do a replace on &#x20 to a single space.

Elegant in its simplicity and will scale extremely well - and of course, no need for a UDF anymore!!

Ok, so after Colin Leversuch-Roberts beating me up over not support '-' and 'von' I've expanded the logic thus...

declare @names table (
   word varchar(50) not
null

)

declare @breaks table (
   break_on_character char(1) not
null

)

declare @exclude table (
   
subtext varchar(20) not
null

)

insert @names ( word ) values( 'colin leversuch-roberts being a pain lol' )
insert @names ( word ) values( 'jim von trapp' )

insert @breaks ( break_on_character ) values( ' ' )
insert @breaks ( break_on_character ) values( '-' )

insert @exclude ( subtext ) values( ' von' )

select CamelCase = replace( 
      
( select case when seq =
                              or ( substring( n.word, seq-1, 1 ) IN ( select break_on_character from @breaks ) 
                                 and not exists ( select
*
                                                         
from @exclude e
                                                         where subtext = substring( n.word, seq-1, len( subtext ) )

                                    ) )
                           
then upper( substring( n.word, seq, 1 )
)
                           
else substring( n.word, seq, 1 ) 
                  end as [text()]
   from @seq
   where seq <= len( n.word
)
   for xml path( '' )

   )
   , '&#x20;', ' ' )

from @names n

Make sure you check out my more recent entry on removing the control characters, e.g. &amp; and &#x20; from the output instead of relying on lots of REPLACE's -> http://sqlblogcasts.com/blogs/tonyrogerson/archive/2006/07/06/871.aspx