in

SQL Server Blogs

Voices from the UK SQL Server Community

Atul Thakor

February 2011 - Posts

  • Performance problems loading XML with SSIS, an alternative way!

    I recently needed to load several thousand XML files into a SQL database, I created an SSIS package which was created as followed:

    Using a foreach container to loop through a directory and load each file path into a variable, the “Import XML” dataflow would then load each XML file into a SQL table.

     

     

    image

     

    Running this, it took approximately 1 second to load each file which seemed a massive amount of time to parse the XML and load the data, speaking to my colleague Martin Croft, he suggested the use of T-SQL Bulk Insert and OpenRowset, so we adjusted the package as followed:

     

    image

     

    The same foreach container was used but instead the following SQL command was executed (this is an expression):

     

     

    "INSERT INTO MyTable(FileDate)
    SELECT   CAST(bulkcolumn AS XML)
        FROM OPENROWSET(
            BULK
            '" + @[User::CurrentFile]  + "',
            SINGLE_BLOB ) AS x"

     

     

    Using this method we managed to load approximately 20 records per second, much faster…for data loading!

    For what we wanted to achieve this was perfect but I’ll leave you with the following points when making your own decision on which solution you decide to choose! 

     

     

    Openrowset Method

    • Much faster to get the data into SQL
    • You’ll need to parse or create a view over the XML data to allow the data to be more usable(another post on this!)
    • Not able to apply validation/transformation against the data when loading it
    • The SQL Server service account will need permission to the file
    • No schema validation when loading files

    SSIS

    • Slower (in our case)
    • Schema validation
    • Allows you to apply transformations/joins to the data
    • Permissions should be less of a problem
    • Data can be loaded into the final form through the package
    • When using a schema validation errors can fail the package (I’ll do another post on this)
Powered by Community Server (Commercial Edition), by Telligent Systems