XML Validation in SQL Server Intergration Services

I've been workig with XML in SQL Server Integration services this week and to say the XML task in SQL Server Integration services sucks is an understatement. Its buggy, the documentation is woefully inadequate in some cases wrong. I believe this was one of the first components written back for SQL 2005 so you can imagine it was written along time ago, XML wasn't as standardised.

So what if you want to validate an XML document against a XSD Schema. Well in true SQL Server Integration services mode the answer is to do it yourself with a script task.

There are two ways of doing it, the first uses an XmlDocument object and the other is the XmlReader. If you were on my XML course at SQLBits you will remember that I pointed out the importance of straeming when dealing with large data and this is where these two differ. The XmlDocument loads the document into memory, whereas the XmlReader is a streaming interface to the XML document you have. So if you have a large document you should use the XmlReader.

In my case however I wanted to pass the XML to a stored procedure after it had been validated and so I used the XmlDocument.

The first step is to create a XmlSchemaSet and put your schema into it.

string Filepath = Dts.Variables["User::FullNewFilename"].Value.ToString();

string Schema= Dts.Variables["User::SchemaFilepath"].Value.ToString();

 

// Create a schema set

XmlSchemaSet schemaset = new XmlSchemaSet();

// Add the schema to the collection.

schemaset.Add(XmlSchema.Read(XmlReader.Create (Schema),null));

Here Schema is a variable to the path for the schema and Filepath is the variable to the Xml file to be validated.

After that you have to either use the XmlDocument or XmlReader. For the XmlDocument we are passing back the contents of the file using the InnerXml property.

//clear our errors string builder

ValidationErrors = new StringBuilder();

 

//Use this if you need to read the document for use elsewhere

XmlDocument doc = new XmlDocument();

doc.Schemas = schemaset ;

doc.Load(Filepath );

doc.Validate(new ValidationEventHandler(ValidationCallBack));

           

//Load the contents of file into the variable for later processing.

Dts.Variables["ImportFileContents"].Value = doc.InnerXml;//clear our errors string builder

or use the XmlReader, this requires another object the XmlReaderSettings object

//set the validation settings.

XmlReaderSettings settings = new XmlReaderSettings();

settings.ValidationType = ValidationType.Schema;

settings.Schemas = schemaset ;

settings.ValidationEventHandler += new ValidationEventHandler(ValidationCallBack);

             

using (XmlReader xr = XmlReader.Create(Filepath,settings ))

{

    //This reads the whole document through the reader and does the validation

    while (xr.Read()) ;

 

    //Close the reader

    xr.Close();

}

In both you will see that they have references to the ValidationCallBack. The validation model is a callback model and so you need this extra function. This is why we are using a global variable ValidationErrors to store the errors and then be able to access them after the validation succeeds. The validation callback function is described here

// Display any validation errors.

private void ValidationCallBack(object sender, ValidationEventArgs e)

{

    ErrorOccurred = ErrorOccurred || e.Severity == XmlSeverityType.Error;

    Warning= Warning|| e.Severity == XmlSeverityType.Error;

 

    if (ValidationErrors.Length > 0)

    {

        ValidationErrors.AppendLine();

    }

    ValidationErrors.Append(e.Message);

}

Complete Code

So putting it all together we get

StringBuilder ValidationErrors;

bool ErrorOccurred= false;

bool Warning = false;

 

public void Main()

{

    string Filepath = Dts.Variables["User::FullNewFilename"].Value.ToString();

    string Schema= Dts.Variables["User::SchemaFilepath"].Value.ToString();

 

    // Create a schema set

    XmlSchemaSet schemaset = new XmlSchemaSet();

    // Add the schema to the collection.

    schemaset.Add(XmlSchema.Read(XmlReader.Create (Schema),null));

 

    //clear our errors string builder

    ValidationErrors = new StringBuilder();

 

    //Use this if you need to read the document for use elsewhere

    XmlDocument doc = new XmlDocument();

    doc.Schemas = schemaset ;

    doc.Load(Filepath );

    doc.Validate(new ValidationEventHandler(ValidationCallBack));

           

    //Load the contents of file into the variable for later processing.

    Dts.Variables["ImportFileContents"].Value = doc.InnerXml;

 

 

    /*  //Otherwise use the following

    // Set the validation settings.

    XmlReaderSettings settings = new XmlReaderSettings();

    settings.ValidationType = ValidationType.Schema;

    settings.Schemas = schemaset ;

    settings.ValidationEventHandler += new ValidationEventHandler(ValidationCallBack);

 

    using (XmlReader xr = XmlReader.Create(Filepath,settings ))

    {

        //This reads the whole document through the reader and does the validation

        while (xr.Read()) ;

 

        //Close the reader

        xr.Close();

    }*/

           

    if (ValidationErrors.Length > 0)

        Dts.Variables["XMLError"].Value = ValidationErrors.ToString();

 

    Dts.Variables["Validation"].Value = ErrorOccurred?"Error":(Warning?"Warning":"");

}

 

// Display any validation errors.

private void ValidationCallBack(object sender, ValidationEventArgs e)

{

    ErrorOccurred = ErrorOccurred || e.Severity == XmlSeverityType.Error;

    Warning= Warning|| e.Severity == XmlSeverityType.Error;

 

    if (ValidationErrors.Length > 0)

    {

       ValidationErrors.AppendLine();

    }

    ValidationErrors.Append(e.Message);

}

Enjoy.


Published Wednesday, January 20, 2010 11:30 PM by simonsabin

Comments

Thursday, January 21, 2010 4:01 PM by SqlServerKudos

# XML Validation in SQL Server Intergration Services

Kudos for a great Sql Server article - Trackback from SqlServerKudos

# Something for the weekend: SQL Server Links 29/01/10 | John Sansom - SQL Server DBA in the UK

Pingback from  Something for the weekend: SQL Server Links 29/01/10 | John Sansom - SQL Server DBA in the UK