Piotr Rodak

if datepart(dw, getdate()) in (6, 7)
use pubs;

November 2007 - Posts

A while ago, Tony Rogerson showed a way how to pass a list of integers (csv) to stored procedure. Approach of creating a script and executing it is OK for smaller amounts of data. I thought, that maybe, as xml is a form of text after all, it would be more appropriate? I crafted a stored proc based on Tony's code that instead of generating a script, generates an xml stream that is subsequently used in query inserting rows. This method is also resistant to sql injection attemtpts so such stored proc is not a 'bobby tables' one :)

Here's the code of the procedure:

create table tStoredInts (afield int)

go

create proc csv_to_table2

@csv varchar(max)

AS

BEGIN

/***

Insert numbers from a CSV to tStoredInts table

***/

declare @handle int --xml document handle

SET @csv = ltrim(rtrim(@csv))

IF RIGHT( @csv, 1 ) = ',' -- If last character is a comma remove it it

SET @csv = left(@csv, len(@csv) -1)

DECLARE @xml varchar(max)

SET @xml = REPLACE( @csv, ',', '</o><o>')

SET @xml = '<numbers><o>' + @xml + '</o></numbers>'

 

exec sp_xml_preparedocument @handle output, @xml

--insert elements from xml

insert tStoredInts (afield)

select field

from openxml(@handle, '/numbers/o', 2) with (field int '.')

exec sp_xml_removedocument @handle

end --proc

Posted by Piotr Rodak | 1 comment(s)
Filed under: ,

So, I just came from seminar led by Kalen Delaney, guru of SQL Server (any version), organized by Ireland SQL Technology User Group. The whole event was sponsored and hosted by Microsoft. I must say, I am impressed. Kalen was able to sort out and put into proper drawers all bits of knowledge I had before and stuff a whole lotta more into my head. She talked about concurrency, transaction isolation levels, differences between optimistic and pessimistic locking, blocking and diagnostic tools, and at the end about gotchas related to (nolock) hint, overused so eagerly by many devs (I confess, I used this hint myself ;))

You can find tons of good info about execution plans and isolation levels at Craig Friedman's blog here

Overall, it was a very interesting day!


 

Posted by Piotr Rodak | with no comments
Filed under:

So, you might think that sorting within ASCII range is predictable and order defined by ASCII table is finite? You are on safe side when you do not use unicode nor 'fancy' letters? Well.. you are wrong.

Recently I have been working on comparison of data stored in Sybase and SQL. Database structure is the same, I have script pulling data from Sybase to SQL Server and when this script is finished, data are meant to be identical of course. So I was thrown when we found out that rows can be returned in some cases in different order on both databases. That is, when a varchar field contained two underscore characters in a row, rows sorted by this field could be returned in different order.

After some thinking and looking for information I was able to create script that can reproduce this behavior on SQL Server.

First, create table and fill it with some data:

create table testtable

(

afield varchar(25)

)

go

insert testtable (afield)

select 'A_'

union all

select 'A__'

union all

select 'A_B'

union all

select 'A__B'

 Now select rows ordering by afield:

select afield from testtable order by afield

The results are pretty 'reasonable':

afield

-------------------------

A_

A__

A__B

A_B

 Ok, but Sybase returned different order:

afield

-------------------------

A_

A_B

A__

A__B

The mystery is hidden within collation of the field. Character fields derive default collation from database setting, which in turn gets collation if it is not specified explicitly from server settings.

If you run following query, you will get results identical with the previous result:

alter table testtable alter column afield varchar(25) collate Latin1_General_Bin

This query applies explicit collation fo afield, overriding in this way default collation. You can check collation of afield using this query:

select column_id, name, collation_name from sys.columns where object_id = object_id('testtable')

column_id   name        collation_name

----------- ----------- -------------------

1           afield      Latin1_General_BIN

This was the collation after change, compliant with Sybase collation in my case.

Default collation on my SQL Server is

column_id   name      collation_name

----------- --------- ------------------------------

1           afield    SQL_Latin1_General_CP1_CI_AS

As you see, collation setting may affect sort order of fields which do not contain unicode characters. When I run the same query on both database engines, I get mostly same results with exception of values containing multiple '_' characters. The solution is simple: specify collation of character fields to collation compatible with settings of the other database server.

Note that modifying default collation of database after tables are created does not affect collation of these fields. You can't also change collation in a column if this column is part of an index.

 

Posted by Piotr Rodak | with no comments