SSIS Lookup Transformation Dynamic SQL workaround

In SSIS, Lookup Transformation is really an important tool. But the design of it prevents user to set the SQL statement dynamicly.

In my work, I need to prepare a text file for BULK loading into table. The original csv file could be 10 million ~ 1 billion rows. On the way I need to lookup several IDs by Names from different tables. These reference tables all have a Project_ID column, which can sub-grouping data for a specific project. Normally the original csv file only holds data for a single project.

Of cause I can just do:

SELECT xx_ID, xx_Name FROM dbo.TABLE1


to fill the Lookup Tran with full cache. But doing this will: use lots of memory; waste resource; and bad performance.

I would rather want to do:

SELECT xx_ID, xx_Name FROM dbo.TABLE1 WHERE Project_ID = ?

Which the Project_ID value is only known at runtime by user input parameter. Unfortunately Lookup Tran does not have that enabled (yet, maybe will in future version.)

So I came up with this idea: BCP dump + OPENROWSET (BULK).

First we use an Execute Process Task, to call Bcp.exe, specify the Arguments using Expressions, the Project_ID will be a SSIS variable. This is like you run this line in CMD:

c:\Bcp.exe "SELECT xx_ID, xx_Name FROM dbo.TABLE1 WHERE Project_ID = 101" queryout tempref1.txt -N -T

Then you will get a text file with columns you want, for the project that is relevant to this run. The reason to specify -N is to save char like in unicode and everything else in SQL server native format, this can save a lot both on file size and I/O speed for later reading process.

Second step, configure our Lookup Trans by set it to Use results of an SQL query, the query should be like this:

SELECT a.* FROM OPENROWSET( BULK 'c:\tempref1.txt', FORMATFILE = 'c:\tempref1.fmt') AS a

I didn't mention the step to create the format file.

If the SQL statement is correct and both the txt and fmt files are ok, when you click the Preview button, you'll see a data grid. Then just continue your normal configuration of Lookup Trans.

So far this works for my situation. I can do multiple Lookups by Bcp to different file name. Obviously I can't fire two instances of the package at the same time. Also I can't set the path and file name at runtime too.

Any comments are welcomed! (Maybe I can do a sample package if anyone interested, in Northwind or AdventureWorks context?)
Attachment: SSIS_Lookup.PNG
Published 13 July 2006 15:17 by dong
Filed under:

Comments

# re: SSIS Lookup Transformation Dynamic SQL workaround

Could you use a view?

Rather than your process task have an execute SQL task that changes the definition of the view. You have need to use 2, one to drop the view if it exists and the other to create the view (you can't have multiple statements in one execute sql task)

13 July 2006 21:20 by simonsabin

# re: SSIS Lookup Transformation Dynamic SQL workaround

Hi, Simon,
Thanks for the comments. Yes, I did think of View. But that will be another Workaround, isn't it? Both have limitations and benefits.

E.g. Create and Drop View can't work if you don't have the privilege on the db.

Also I would think Bcp+BULK is a No-touch, clean solution to the db. The View solution will mess up the db if the package is run many times per day.

17 July 2006 12:30 by dong