Sunday, August 5, 2007 11:56 AM tonyrogerson

Passing an array (CSV) to a stored procedure with data validation, no loops, no self joins - just REPLACE

Yet another take on the CSV route for passing an array to a stored procedure, but this one is clever – it gives you full data validation because I load the values into a table, yet there are no loops and no self-joins; just dynamic SQL and the REPLACE function. The concept is very simple; I’ve bloated the procedure out to give nice data error handling, so if one of your CSV values fails validation (a CHECK constraint, for instance) I tell you which value position it is.
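
For example, a CSV of ',11,12,' ends up being rewritten into a batch along these lines (call the procedure below with @is_debug = 'Y' to see the real thing):

SET NOCOUNT ON;
insert #csv_split ( csv_value ) values( 11 );
if @@error <> 0 goto err;
insert #csv_split ( csv_value ) values( 12 );
if @@error <> 0 goto err;
insert #csv_split ( csv_value ) values( NULL );
if @@error <> 0 goto err;
goto done;
err:
    DECLARE @msg varchar(120);
    SET @msg = 'Data Error on value position ' + CAST( ( SELECT COUNT(*) FROM #csv_split ) + 1 AS varchar(5) ) + ', processing terminated.';
    RAISERROR( @msg, 16, 1 );
done: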

I’ve also coded in (it’s that simple) some check flags so you can specify whether to allow only unique values, or whether to disallow NULL values.

The procedure is below. I’ve not tested it for scalability, and because of the data validation it won’t be as fast as straight dynamic SQL; it would probably also be better with a table variable, but that would make the coding just a little bit more tricky :).

Example 1 – Only allow unique values

exec csv_eg @csv = ',11,12,13,14,,,,,,,,,,,,,,12,',
            @is_allow_nulls = 'Y',
            @is_unique_values = 'Y',
            @is_debug = 'N'


Msg 2601, Level 14, State 1, Line 12
Cannot insert duplicate key row in object 'dbo.#csv_split' with unique index 'ncui'.
The statement has been terminated.
Msg 50000, Level 16, State 1, Line 44
Data Error on value position 6, processing terminated.

Note – it tells you which position caused the error (value position 6) because that is the second NULL to be inserted, and the second NULL violates the unique index.

Example 2 – Don’t allow NULLs/empty values

exec csv_eg @csv = ',11,12,13,14,,,,,,,,,,,,,,12,',
            @is_allow_nulls = 'N',
            @is_unique_values = 'N',
            @is_debug = 'N'

Msg 515, Level 16, State 2, Line 10
Cannot insert the value NULL into column 'csv_value', table 'tempdb.dbo.#csv_split__________________________________________________________________________________________________________000000000027'; column does not allow nulls. INSERT fails.
The statement has been terminated.
Msg 50000, Level 16, State 1, Line 44
Data Error on value position 5, processing terminated.


As you can see, we get the error and the position in the CSV that caused it.

Example 3 – CHECK constraint firing...

exec csv_eg @csv = ',11,12,13,14,1,,,,,,,,,,,,,12,',
            @is_allow_nulls = 'Y',
            @is_unique_values = 'N',
            @is_debug = 'N'


Msg 547, Level 16, State 0, Line 10
The INSERT statement conflicted with the CHECK constraint "CK__#csv_spli__csv_v__09F4B950". The conflict occurred in database "tempdb", table "dbo.#csv_split__________________________________________________________________________________________________________000000000028", column 'csv_value'.
The statement has been terminated.
Msg 50000, Level 16, State 1, Line 44
Data Error on value position 5, processing terminated.


If I were so inclined, I’d make the CHECK constraint definition dynamic as well, but I can’t do everything for you ;).
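
One way you might do that (just a rough sketch – @check_min and @check_max are made-up parameter names, they’re not part of the procedure below) is to create the column without the constraint and bolt the CHECK on with dynamic SQL; because the bounds are integers there’s no new injection risk:

    DECLARE @check_min int, @check_max int
    SELECT @check_min = 10, @check_max = 20

    CREATE TABLE #csv_split (
        val_position    smallint    not null IDENTITY,
        csv_value       int null
    )

    --  Add the CHECK constraint with the bounds supplied at run time
    EXEC( 'ALTER TABLE #csv_split ADD CHECK( csv_value BETWEEN '
          + CAST( @check_min AS varchar(12) ) + ' AND '
          + CAST( @check_max AS varchar(12) ) + ' )' )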

The other good thing about this is that because we have used a # table, SQL Server will have statistics on the number of rows and the cardinality of that # table, so it will give us a good query plan. If you’d used the multiple-parameter approach suggested by Joe Celko, SQL Server could only sniff the parameters, and whether that would give it good information for your big IN clause I’m not sure. Another benefit of this method is that it’s a table join rather than an IN, so it will be more efficient as the number of values grows.
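
For example (my_orders and order_id are made-up names, not from this post), once the values are in #csv_split you just join to it:

    SELECT o.*
    FROM my_orders AS o
        INNER JOIN #csv_split AS c ON c.csv_value = o.order_id

    --  rather than building something like:
    --  SELECT * FROM my_orders WHERE order_id IN ( 11, 12, 13, 14, ... )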

CREATE PROC csv_eg
    @csv              varchar(500),
    @is_allow_nulls   char(1) = 'Y',
    @is_unique_values char(1) = 'N',
    @is_debug         char(1) = 'N'
AS
BEGIN
    /***
        Convert a CSV to a TABLE with data validation

        Tony Rogerson, Aug 2007
     ***/

    SET @csv = ltrim(rtrim(@csv))

    --  Check the input contains only digits and commas, to prevent SQL injection
    IF PATINDEX( '%[^,0-9]%', @csv ) > 0
    BEGIN
        RAISERROR( 'Injection attempt or invalid data.', 16, 1 )
        RETURN
    END

    --  Initial parse of the input to make sure the general syntax of the CSV is valid

    --  Check we have the commas correct
    IF LEFT( @csv, 1 ) <> ','       --  If there is no leading comma then add one
        SET @csv = ',' + @csv
    IF RIGHT( @csv, 1 ) = ','       --  If the last character is a comma then add a NULL after it
        SET @csv = @csv + 'NULL'

    SET @csv = REPLACE( @csv, ',,', ',NULL,' )  --  Change ,, to ,NULL,
    SET @csv = REPLACE( @csv, ',,', ',NULL,' )  --  Needs doing twice to catch the ,, pairs left behind by the first pass

    --  // initial parse is over.

    --  Now the donkey work: convert it from a flat CSV structure to a TABLE.
    --  Simple string replacing - no need for self joins and loops.

    DECLARE @sql varchar(max)

    --  Each ',' becomes a new INSERT line; '@V' is a marker that becomes the error check below
    SET @sql = REPLACE( @csv, ',', CHAR(13) + CHAR(10) + 'insert #csv_split ( csv_value ) values( ' )
    SET @sql = REPLACE( @sql, CHAR(13), ' ); @V' + CHAR(13) )
    SET @sql = RIGHT( @sql, LEN( @sql ) - 7 ) + ' ); @V'

    SET @sql = REPLACE( @sql, '@V', '
if @@error <> 0 goto err;' )

    --  We now have the SQL that will convert the CSV to a TABLE, so exec it.
    CREATE TABLE #csv_split (
        val_position    smallint    not null IDENTITY,
        csv_value       int null CHECK( csv_value between 10 and 20 )
    )

    IF @is_allow_nulls = 'N'
        EXEC( 'ALTER TABLE #csv_split ALTER COLUMN csv_value INT NOT NULL' )
    IF @is_unique_values = 'Y'
        EXEC( 'CREATE UNIQUE INDEX ncui ON #csv_split( csv_value )' )

    SET @sql = 'SET NOCOUNT ON; ' + @sql + ';
    goto done;
err:
    DECLARE @msg varchar(120);
    SET @msg = ''Data Error on value position '' + CAST( ( SELECT COUNT(*) FROM #csv_split ) + 1 AS varchar(5) ) + '', processing terminated.'';
    RAISERROR( @msg, 16, 1 );
done:'

    IF @is_debug = 'Y'
        PRINT @sql

    EXEC( @sql )

    IF @@ERROR <> 0
        RETURN

    SELECT *
    FROM #csv_split

END
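
For completeness, a call where every value passes validation simply returns the split values; the output looks something like this:

exec csv_eg @csv = '11,12,13',
            @is_allow_nulls = 'N',
            @is_unique_values = 'Y',
            @is_debug = 'N'

val_position csv_value
------------ -----------
1            11
2            12
3            13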

Comments

# re: Passing an array (CSV) to a stored procedure with data validation, no loops, no self joins - just REPLACE

Sunday, August 5, 2007 1:34 PM by Uri Dimant

Hi Tony

I can't stop being astonished by your clever ideas. I was certain I had read everything about passing arrays into stored procedures, but this one is so subtle... If you permit, I'd like to add this script to (and use it in) my collection.

Thanks

# re: Passing an array (CSV) to a stored procedure with data validation, no loops, no self joins - just REPLACE

Sunday, August 5, 2007 3:25 PM by tonyrogerson

Hi Uri,

<Blushing>Thanks for your compliment</>.

Feel free to add to and use this; if I get time thru the week I'll have a look at making it work for string data but I'm working on a few blog posts on surrogate key usage.

The concept is quite simple, isn't it; it's a good solution if you don't want to give access to the base tables and, in 2005, you don't want to use the EXECUTE AS method.

One note: in SQL 2000 you are limited to 8000 characters, but I guess I could make it work with more with a little thought.

Tony.

# Interesting Finds: August 5, 2007

Sunday, August 5, 2007 4:40 PM by Jason Haley

# re: Passing an array (CSV) to a stored procedure with data validation, no loops, no self joins - just REPLACE

Thursday, August 9, 2007 12:43 PM by MarkC

Rather than generating multiple insert statements

insert #csv_split ( csv_value ) values( x1 )
insert #csv_split ( csv_value ) values( x2 )
insert #csv_split ( csv_value ) values( x3 )
...

might it be more efficient to generate a single insert such as this

insert #csv_split ( csv_value )
select x1 union all
select x2 union all
select x3 union all
...

I think this requires only a fairly trivial change to your stored procedure.
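
Something like this, perhaps (an untested sketch, starting from the same parsed @csv the procedure already builds):

    DECLARE @sql varchar(max)

    --  Turn each ',' into a new 'union all select ' line
    SET @sql = REPLACE( @csv, ',', CHAR(13) + CHAR(10) + 'union all select ' )

    --  Strip the leading CRLF + 'union all ' left behind by the first comma
    SET @sql = STUFF( @sql, 1, 12, '' )

    --  Prefix the single INSERT
    SET @sql = 'insert #csv_split ( csv_value ) ' + @sql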

# re: Passing an array (CSV) to a stored procedure with data validation, no loops, no self joins - just REPLACE

Thursday, August 9, 2007 5:10 PM by tonyrogerson

The validation detection wouldn't work then; I wouldn't know which value in the list failed.

It would make it quicker though - absolutely.

Tony.

# re: Passing an array (CSV) to a stored procedure with data validation, no loops, no self joins - just REPLACE

Monday, August 13, 2007 10:56 AM by mikr0s

SQL Injection:

exec csv_eg @csv = ',11); SELECT GETDATE(,12',
           @is_allow_nulls = 'Y',
           @is_unique_values = 'N',
           @is_debug = 'Y'

--HTH--

# re: Passing an array (CSV) to a stored procedure with data validation, no loops, no self joins - just REPLACE

Monday, August 13, 2007 12:03 PM by tonyrogerson

Thanks mikr0s,

I've modified the example to stop SQL injection.

PATINDEX - don't you just love it. It would be interesting to see how I'd do the same for text; probably some sort of tokenisation.

Many thanks for the heads up.

Tony

# csv list of elements as parameter for stored procedure

Sunday, November 25, 2007 12:19 AM by Piotr Rodak

A while ago, Tony Rogerson showed a way to pass a list of integers (CSV) to a stored procedure. Approach
