
The penultimate audit trigger framework

So, it’s time to see what I came up with after some time of playing with COLUMNS_UPDATED() and bitmasks. The first part of this miniseries describes the mechanics of encoding which columns are updated within a DML operation.

The task I was faced with was to prepare an audit framework that would be fairly easy to use. The audited tables were to be the ones directly modified by user applications, not the ones heavily used by batch or ETL processes. The framework consists of several tables and procedures. All these objects belong to the Audit schema, to separate them from other objects in the database. This means that you will have to either modify the framework script or create the Audit schema in your database.
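
If you want to keep the framework script unchanged, a minimal way of creating the schema up front might look like the sketch below (my example; the downloadable script may handle this differently):

if not exists (select 1 from sys.schemas where name = N'Audit')
    exec (N'create schema Audit authorization dbo') -- exec, because create schema must be alone in its batch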

The Internet is full of dreams about generic audit triggers. Developers look for a magic trigger that will know how to handle the table structure and will be easy to set up, so that they do not have to modify the trigger code for each table separately. Is it possible? Yes, but unfortunately the generic trigger approach has its drawbacks. Code that is too generic has to perform many operations at the time the trigger is fired, which affects performance. I decided to write a framework that creates triggers in a generic way, but the triggers themselves are crafted for the particular tables they are created on.

I will describe how the framework works later in this post; for now, let’s list the objects the framework consists of:

  • Audit.tAuditGroup – table that stores data related to a particular DML operation: table schema and name, user-friendly name (for reporting purposes), change date and credentials of the principal who caused the trigger to fire
  • Audit.tAuditDetails – this table stores primary key information for the changed rows, and old and new values for all columns belonging to the audited table (a sketch of both tables follows this list).
  • Audit.vAudit – view that joins the above tables, providing comprehensive information about audited data.
  • Audit.tIgnoreColumns – table where you can specify column names that you don’t want to be audited, because there may be no point in auditing them. I added two names, UpdateDate and RowID, as an example, but in general, if you leave this table empty, all columns will be audited.
  • Audit.tUserFriendlyLabels – sometimes it is nice to have meaningful, or at least less cryptic, names when a report is displayed. This table allows for mapping audited table names to more user-friendly labels, so you can display them as ‘Account value modification’ rather than dbo.tAccount. This table also has a DML trigger attached to insert and update events that outputs a message reminding you that the audit trigger has to be regenerated to reflect the changes. I will explain this a bit later.
  • Audit.ptrig_GenericAudit – this procedure is called from every trigger generated by the Audit framework. It takes a few parameters and requires the existence of two temporary tables, #ins and #del, which are populated from the inserted and deleted pseudo-tables.
  • Audit.pCreateAuditTrigger – this procedure creates an audit trigger on a table. You pass the table schema and name to the procedure, and the name of the trigger is derived from these parameters.
  • Audit.pDropAuditTrigger – this procedure removes the audit trigger from the table with the specified schema and name.
  • Audit.vAuditTriggers – this view simply lists all audit triggers defined in the database. It may be useful for creating a quick report of all audited tables without having to dig through scripted code.
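
The exact DDL for these objects ships with the installation script linked at the end of this post. Based on how the procedures below use them, though, the two core tables presumably look roughly like this (a sketch with inferred column types, not the shipped DDL):

-- inferred from the inserts in Audit.ptrig_GenericAudit; not the exact shipped DDL
create table Audit.tAuditGroup
(
    ChangeId bigint identity(1,1) primary key,
    Operation char(1) not null,        -- 'I', 'U' or 'D'
    TableSchema sysname not null,
    TableName sysname not null,
    Label nvarchar(255) null,          -- user-friendly name for reports
    UTCUpdateDate datetime not null,
    UserName sysname not null
)

create table Audit.tAuditDetails
(
    ChangeID bigint not null references Audit.tAuditGroup (ChangeId),
    PK varchar(1000) not null,         -- rendered primary key of the audited row
    FieldName nvarchar(128) not null,
    OldValue sql_variant null,
    NewValue sql_variant null,
    IsColumnChanged bit not null
)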

How does it work?

I decided to move as much work as possible to the creation stage of the trigger. The Audit.pCreateAuditTrigger procedure gathers metadata about the table and generates the actual trigger code. This dynamic code includes a call to the Audit.ptrig_GenericAudit procedure, and static information is passed to it in parameters. This static information includes the fragment of SQL used in the join clause between the deleted and inserted tables, and the primary key information.

select	@JoinClause = coalesce(@JoinClause + ' and', ' on') + ' i.' + c.COLUMN_NAME + ' = d.' + c.COLUMN_NAME,
		-- Build the primary key rendering fragment; the case expression picks a
		-- conversion per column type. The doubled-up quotes survive two levels of
		-- dynamic SQL (the generated trigger and the query built in ptrig_GenericAudit).
		@PrimaryKey = coalesce(@PrimaryKey + ' and ','') + '[' + c.COLUMN_NAME + ']' +
		case when cols.CHARACTER_MAXIMUM_LENGTH is not null then
		-- character columns: quote the rendered value
		' = ''''''''''''+convert(varchar(1000), coalesce(q.I_' + c.COLUMN_NAME + ', q.D_' + c.COLUMN_NAME + '))+'''''''''''''
		when cols.DATETIME_PRECISION is not null then
		-- datetime columns: render in ODBC canonical format (style 121)
		' = ''''+convert(varchar(23), coalesce(q.I_' + c.COLUMN_NAME +', q.D_' + c.COLUMN_NAME + '), 121)+'''''
		else
		-- everything else: plain conversion to varchar
		' = ''''+convert(varchar(1000),coalesce(q.I_' + c.COLUMN_NAME +', q.D_' + c.COLUMN_NAME + '))+'''''
		end
from	INFORMATION_SCHEMA.TABLE_CONSTRAINTS pk
		inner join INFORMATION_SCHEMA.KEY_COLUMN_USAGE c
			on c.TABLE_NAME = pk.TABLE_NAME and c.TABLE_SCHEMA = pk.TABLE_SCHEMA and c.CONSTRAINT_NAME = pk.CONSTRAINT_NAME
		inner join INFORMATION_SCHEMA.COLUMNS cols
			on cols.TABLE_NAME = c.TABLE_NAME and cols.TABLE_SCHEMA = c.TABLE_SCHEMA and cols.COLUMN_NAME = c.COLUMN_NAME
where	pk.TABLE_NAME = @TableName and pk.TABLE_SCHEMA = @TableSchemaName
	and	pk.CONSTRAINT_TYPE = 'PRIMARY KEY'

Initially I thought it might be useful to store the primary key information as an XML fragment, but eventually I decided to encode it as a fragment of a WHERE clause – all columns belonging to the primary key are joined with the AND operator.
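
For a hypothetical table with a composite key (OrderID int, LineID int) – my example, not one from the framework – the two fragments produced by the query above would come out along these lines:

-- @JoinClause, used to join the inserted and deleted copies:
 on i.OrderID = d.OrderID and i.LineID = d.LineID

-- @PrimaryKey, which the dynamic SQL later renders for a given row as e.g.:
[OrderID] = 42 and [LineID] = 3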

So to create a trigger on a table, you simply call

exec Audit.pCreateAuditTrigger 'Sales', 'SalesTerritory'

and to drop the trigger, call

exec Audit.pDropAuditTrigger 'Sales', 'SalesTerritory'

The above code creates the trigger [Sales].[trgAudit_SalesTerritory] on the table Sales.SalesTerritory. When you look at the code of the generated trigger, you will see that the primary key fragment doesn’t contain any particular values, but rather cryptic column references; the reason is that this fragment itself is used by the dynamic SQL code in Audit.ptrig_GenericAudit.

exec Audit.ptrig_GenericAudit @UserName, @TableSchemaName, @TableName, @TableObjectId, 
@OperationType, @UTCUpdateDate, @ColumnsUpdated, 
' on i.TerritoryID = d.TerritoryID', 
'[TerritoryID] = ''+convert(varchar(1000),coalesce(q.I_TerritoryID, q.D_TerritoryID))+'''
, 'Sales territory'
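
For context, the whole generated trigger body presumably looks something like the sketch below. This is my reconstruction from the description above, not the exact generated code; the #ins and #del copies and the parameter values are the parts Audit.ptrig_GenericAudit actually requires.

-- sketch of a generated trigger; everything except #ins/#del and the exec call is assumed
create trigger [Sales].[trgAudit_SalesTerritory] on Sales.SalesTerritory
after insert, update, delete
as
begin
    set nocount on

    -- the audit procedure requires these two temporary tables to exist
    select * into #ins from inserted
    select * into #del from deleted

    declare @UserName sysname, @TableObjectId int, @OperationType char(1),
            @UTCUpdateDate datetime, @ColumnsUpdated varbinary(100)

    set @UserName = suser_sname()
    set @TableObjectId = object_id(N'Sales.SalesTerritory')
    set @UTCUpdateDate = getutcdate()
    set @ColumnsUpdated = columns_updated()
    -- derive the single-character operation code from the pseudo-tables
    set @OperationType = case when exists (select 1 from #ins)
            then case when exists (select 1 from #del) then 'U' else 'I' end
            else 'D' end

    exec Audit.ptrig_GenericAudit @UserName, 'Sales', 'SalesTerritory', @TableObjectId,
    @OperationType, @UTCUpdateDate, @ColumnsUpdated,
    ' on i.TerritoryID = d.TerritoryID',
    '[TerritoryID] = ''+convert(varchar(1000),coalesce(q.I_TerritoryID, q.D_TerritoryID))+'''
    , 'Sales territory'
end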
 

Parameters passed to Audit.ptrig_GenericAudit are as follows:

  • UserName – name of the principal who triggered the operation
  • TableSchemaName – name of the schema the audited table belongs to
  • TableName – name of the table
  • TableObjectId – object id of the audited table
  • Operation – single-character code for insert, update or delete
  • UTCUpdateDate – UTC date and time of the operation
  • ColumnsUpdated – bitmask of the updated columns, as returned by COLUMNS_UPDATED() (see the decoding example after this list)
  • JoinClauseFragment – fragment of the dynamic SQL that joins the inserted and deleted tables on the primary key columns
  • PrimaryKeyFragment – fragment of SQL that allows for identification of the audited row
  • FriendlyName – label to display on a report, if there is a need.
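
To recap the mechanics from the first post, this is a minimal, self-contained way to test a single column’s bit in such a bitmask – the same byte-and-bit arithmetic the CTEs in the procedure below use (the sample values are mine):

declare @ColumnsUpdated varbinary(100), @n int
set @ColumnsUpdated = 0x0200 -- e.g. only column 2 of a 10-column table was updated
set @n = 2                   -- column_id to test

-- byte (@n - 1) / 8 + 1 holds the flag, at bit (@n - 1) % 8 within that byte
select case when convert(int, substring(@ColumnsUpdated, (@n - 1) / 8 + 1, 1))
            & power(2, (@n - 1) % 8) > 0
        then 'updated' else 'not updated' end as ColumnState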

The framework supports composite primary keys. The resulting generated primary key code is converted to varchar(1000). If you need nvarchar data in the key, you have to modify the code to take this into account and modify the PK column in Audit.tAuditDetails. If your primary key can’t fit in 1000 characters, you can also change the length of the PK column, but you should probably also rethink the design of the table – such long keys are usually a sign that something went wrong during the design stage, and you should consider shorter surrogate keys.

Let’s have a look at what Audit.ptrig_GenericAudit looks like:


	create procedure Audit.ptrig_GenericAudit
	(	@UserName sysname,
		@TableSchemaName sysname,
		@TableName sysname,
		@TableObjectId int, --id of the table in sys.objects
		@Operation char(1),
		@UTCUpdateDate datetime,
		@ColumnsUpdated varbinary(100), 
		@JoinClauseFragment nvarchar(3000), --fragment of SQL to append to the dynamic SQL to join inserted and deleted
		@PrimaryKeyFragment nvarchar(3000), --fragment that defines primary key. Values are retrieved in dynamic SQL.,
		@FriendlyName nvarchar(255) --user friendly name to appear on the screen
	)
	as
	begin
		/** 20100413 Piotr Rodak:	This procedure implements generic auditing code. It is called from triggers		***/
	/**						Procedure requires two temporary tables #ins and #del to exist. These have to be	***/
		/**						populated with inserted and deleted pseudotables data in triggers before the	***/
		/**						procedure is called																***/

declare @sql nvarchar(max),
			@selectList nvarchar(max), --script fragment with column names in the first part of the unpivot query
			@crossapplyList nvarchar(max), --fragment for the cross apply part of the unpivot query
			@ErrorMsg varchar(600), --error message that can be passed back from the internal dynamic SQL
			@params nvarchar(200) -- common parameters of the query
		declare @ChangeId bigint
		
		
		--initialize parameters definition	
		set @params = '@ChangeId bigint, @ErrorMsg varchar(600) output'

		--select @ColumnsUpdated
		begin try
			--get new id for the change
/**1**/		insert Audit.tAuditGroup(Operation, TableSchema, TableName, Label, UTCUpdateDate, UserName)
			values (@Operation, @TableSchemaName, @TableName, @FriendlyName, @UTCUpdateDate, @UserName)
			set @ChangeId = scope_identity()
	
		;with column_updated_bytesCTE --divide the bitmask into bytes; this query requires the numbers table Admin.tTally to exist in the database
			as
			(
				select number ByteNumber, convert(binary(1), 
				substring(@ColumnsUpdated, number, 1)) [ByteValue] 
				from Admin.tTally
				where number <= datalength(@ColumnsUpdated)
			),
			columnsCTE as --return columns belonging to table @TableObjectId, calculate appropriate bit masks
			(
				select column_id, [name] column_name, ByteNumber, ByteValue, 
						power(2, (a.column_id - 1) % 8) BitMask
				from sys.columns a inner join 
					column_updated_bytesCTE b 
						on ((a.column_id - 1) / 8) + 1 = b.ByteNumber
				where a.object_id = @TableObjectId
			)
			,
			updatedColumnIdsCTE as --get updated column ids, remove columns we don't want to audit (non core)
			(
				select c.column_id, convert(nvarchar(128), column_name) column_name, 
				case when ByteValue & BitMask > 0 then N'1' else N'0' end IsColumnChanged
					from columnsCTE c
					--where ByteValue & BitMask > 0 --commented out because even unchanged columns have to be audited
				where column_name not in 
					(select ColumnName from Audit.tIgnoreColumns where TableSchema is null 
						union all
					select ColumnName from Audit.tIgnoreColumns 
						where TableSchema = @TableSchemaName and TableName = @TableName )
			
			)
/**2**/			select 
				@selectList = coalesce(@selectList + N', ', N'') +
					N'convert(sql_variant, d.['  + column_name + ']) [D_' + column_name + N'], ' +
					N'convert(sql_variant, i.['  + column_name + ']) [I_' + column_name + N'], ' + 
					IsColumnChanged + N' [IsColumnChanged_' + column_name + N']' + nchar(10),
				
				@crossapplyList = coalesce(@crossapplyList + N' union all ' + nchar(10), '') + 
				N'select ''' + column_name + N''', [D_' + column_name + N']' + N', [I_' + column_name + N'], ' + N'[IsColumnChanged_' + column_name + N']'
			from updatedColumnIdsCTE 
			
			

			if(@@rowcount > 0)
			begin
				--print '@PrimaryKeyFragment= ' + @PrimaryKeyFragment
				
/**3**/				set @selectList = @selectList + 
					N'from #ins i full outer join #del d' + @JoinClauseFragment + N') q' + nchar(10) +
					N'cross apply ' 
				set @crossapplyList = N'( ' + nchar(10) + @crossApplyList + nchar(10) + N') unp(ColumnName, OldColumnValue, NewColumnValue, IsColumnChanged)'
				
				select @sql = 
					N'begin try ' + nchar(10) + 
					N'insert Audit.tAuditDetails (ChangeID, PK, FieldName, OldValue, NewValue, IsColumnChanged)' + nchar(10) +
					N'select @ChangeId, PK, ColumnName, OldColumnValue, NewColumnValue, IsColumnChanged' + nchar(10) +
				N'from ( select ''' + @PrimaryKeyFragment + N''' [PK], ColumnName, OldColumnValue, NewColumnValue, IsColumnChanged from ' + nchar(10) + 
					N'( select ' + nchar(10) +
					@selectList + nchar(10) +
					@crossapplyList + nchar(10) + N') outerQry' + nchar(10) +
					N'end try' + nchar(10) + 
					N'begin catch' + nchar(10) + 
						N'set @ErrorMsg = ''Audit error: '' + error_message() + '' ('' + convert(varchar, error_number()) + '')''' + nchar(10) +
						N'print ''dsql Error: '' + @ErrorMsg' + nchar(10) +
					N'end catch'
				--print @sql
				exec sp_executesql @sql, @params, @ChangeId, @ErrorMsg output
				if @ErrorMsg is not null --rethrow error if message not empty
					raiserror(@ErrorMsg, 16, 1)
			end

		end try
		begin catch
			declare @err nvarchar(1000)
		set @err = N'Procedure ' + error_procedure() + N': Error occurred ('+ convert(nvarchar, error_number()) + N'): ' + error_message()
			raiserror(@err, 16, 1) with nowait;
		end catch
	end

There are three important stages of execution in this procedure.

/**1**/ – The DML operation is recorded and its id is retrieved into the @ChangeId variable. This value is used subsequently in the ‘details’ table. After this insert, the bitmap processing already discussed in my previous post takes place.

/**2**/ – The COLUMNS_UPDATED bitmap calculations result in data I can use to generate the list of changed columns. This is the place where you can decide whether you want to save only the changed columns, or rather everything. In my case, I store everything, and I have the flag IsColumnChanged that indicates whether a column was updated during the DML operation. Please note that even if a column’s value hasn’t changed, but the column was listed in the SET clause of the UPDATE statement, this flag will be set to 1. This means that you may want to additionally filter the audit data to remove columns whose values didn’t change from the report.
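
For example, a reporting query along these lines (my sketch, not part of the framework) keeps only the rows whose values really changed:

-- rows flagged as updated whose values actually differ
select *
from Audit.vAudit
where IsColumnChanged = 1
    and (OldValue <> NewValue
        or (OldValue is null and NewValue is not null)
        or (OldValue is not null and NewValue is null))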

/**3**/ – In this part of the procedure the final SQL code is put together. As you may notice (you can uncomment the print @sql statement to see the generated code), I use the cross apply operator in the query. The reason is that my initial approach proved to be quite slow, and its cost depended on the number of processed columns. I began with the code posted by Nigel Rivett some time ago. The code worked, but I didn’t like the idea of having a loop in the trigger, and the bitmap calculations were based on the ordinal_position column. ordinal_position works only if you don’t remove any columns from the table (unless you only ever remove the last column). So I changed the loop code to set-based code, which did more or less the same job. That code, however, joins the inserted and deleted tables once for each of the columns. If you have 5 columns, it’s OK. If you have 50, it is less than OK. I ran into this performance issue and looked for a solution for a while. I realized that to improve performance I needed to join the inserted and deleted tables once per operation rather than once per column. Unpivot was the answer. This is not a very well known table operator, and that is a pity, because it can be really useful in many cases. I wonder why there is practically no documentation about it in Books Online.
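
To illustrate the difference, the set-based version without unpivot ends up generating one join per audited column, roughly of this shape (my simplified sketch, not the original code):

-- one full outer join of #ins and #del per column: N audited columns mean N passes
insert Audit.tAuditDetails (ChangeID, PK, FieldName, OldValue, NewValue, IsColumnChanged)
select @ChangeId, '[TerritoryID] = ' + convert(varchar(1000), coalesce(i.TerritoryID, d.TerritoryID)),
    'Name', convert(sql_variant, d.[Name]), convert(sql_variant, i.[Name]), 1
from #ins i full outer join #del d on i.TerritoryID = d.TerritoryID
union all
select @ChangeId, '[TerritoryID] = ' + convert(varchar(1000), coalesce(i.TerritoryID, d.TerritoryID)),
    'CountryRegionCode', convert(sql_variant, d.[CountryRegionCode]), convert(sql_variant, i.[CountryRegionCode]), 0
from #ins i full outer join #del d on i.TerritoryID = d.TerritoryID
-- ...repeated once for every remaining audited column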

The problem with the unpivot operator is that if the rotated columns contain null values, the corresponding rows are removed from the resulting rowset. There is an excellent discussion of this operator published some time ago by Brad Schultz: part 1 and part 2. It turns out that unpivot can be successfully replaced with cross apply, which gives you the ability to include nulls in the resulting rowset.
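
Here is a minimal, self-contained illustration of the difference (my example, independent of the framework):

declare @t table (id int, a int, b int)
insert @t values (1, 10, null)

-- unpivot silently drops the row for column b, because its value is null
select id, col, val
from @t
unpivot (val for col in (a, b)) u

-- the cross apply rewrite keeps it and returns (1, b, NULL) as well
select t.id, x.col, x.val
from @t t
cross apply (select 'a', a union all
             select 'b', b) x(col, val)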

I uncommented the print @sql statement in the procedure and ran the following update statement:

update Sales.SalesTerritory set [Name] = 'Northwest 1' where TerritoryId = 1

The generated code is as follows:

begin try 
insert Audit.tAuditDetails (ChangeID, PK, FieldName, OldValue, NewValue, IsColumnChanged)
select @ChangeId, PK, ColumnName, OldColumnValue, NewColumnValue, IsColumnChanged
from ( select '[TerritoryID] = '+convert(varchar(1000),coalesce(q.I_TerritoryID, q.D_TerritoryID))+'' [PK], ColumnName, OldColumnValue, NewColumnValue, IsColumnChanged from 
( select 
convert(sql_variant, d.[TerritoryID]) [D_TerritoryID], convert(sql_variant, i.[TerritoryID]) [I_TerritoryID], 0 [IsColumnChanged_TerritoryID]
, convert(sql_variant, d.[Name]) [D_Name], convert(sql_variant, i.[Name]) [I_Name], 1 [IsColumnChanged_Name]
, convert(sql_variant, d.[CountryRegionCode]) [D_CountryRegionCode], convert(sql_variant, i.[CountryRegionCode]) [I_CountryRegionCode], 0 [IsColumnChanged_CountryRegionCode]
, convert(sql_variant, d.[Group]) [D_Group], convert(sql_variant, i.[Group]) [I_Group], 0 [IsColumnChanged_Group]
, convert(sql_variant, d.[SalesYTD]) [D_SalesYTD], convert(sql_variant, i.[SalesYTD]) [I_SalesYTD], 0 [IsColumnChanged_SalesYTD]
, convert(sql_variant, d.[SalesLastYear]) [D_SalesLastYear], convert(sql_variant, i.[SalesLastYear]) [I_SalesLastYear], 0 [IsColumnChanged_SalesLastYear]
, convert(sql_variant, d.[CostYTD]) [D_CostYTD], convert(sql_variant, i.[CostYTD]) [I_CostYTD], 0 [IsColumnChanged_CostYTD]
, convert(sql_variant, d.[CostLastYear]) [D_CostLastYear], convert(sql_variant, i.[CostLastYear]) [I_CostLastYear], 0 [IsColumnChanged_CostLastYear]
, convert(sql_variant, d.[rowguid]) [D_rowguid], convert(sql_variant, i.[rowguid]) [I_rowguid], 0 [IsColumnChanged_rowguid]
, convert(sql_variant, d.[ModifiedDate]) [D_ModifiedDate], convert(sql_variant, i.[ModifiedDate]) [I_ModifiedDate], 0 [IsColumnChanged_ModifiedDate]
from #ins i full outer join #del d on i.TerritoryID = d.TerritoryID) q
cross apply 
( 
select 'TerritoryID', [D_TerritoryID], [I_TerritoryID], [IsColumnChanged_TerritoryID] union all 
select 'Name', [D_Name], [I_Name], [IsColumnChanged_Name] union all 
select 'CountryRegionCode', [D_CountryRegionCode], [I_CountryRegionCode], [IsColumnChanged_CountryRegionCode] union all 
select 'Group', [D_Group], [I_Group], [IsColumnChanged_Group] union all 
select 'SalesYTD', [D_SalesYTD], [I_SalesYTD], [IsColumnChanged_SalesYTD] union all 
select 'SalesLastYear', [D_SalesLastYear], [I_SalesLastYear], [IsColumnChanged_SalesLastYear] union all 
select 'CostYTD', [D_CostYTD], [I_CostYTD], [IsColumnChanged_CostYTD] union all 
select 'CostLastYear', [D_CostLastYear], [I_CostLastYear], [IsColumnChanged_CostLastYear] union all 
select 'rowguid', [D_rowguid], [I_rowguid], [IsColumnChanged_rowguid] union all 
select 'ModifiedDate', [D_ModifiedDate], [I_ModifiedDate], [IsColumnChanged_ModifiedDate]
) unp(ColumnName, OldColumnValue, NewColumnValue, IsColumnChanged)
) outerQry
end try
begin catch
set @ErrorMsg = 'Audit error: ' + error_message() + ' (' + convert(varchar, error_number()) + ')'
print 'dsql Error: ' + @ErrorMsg
end catch

As you can see, the generated code joins the deleted and inserted tables (copied into the respective temporary tables) only once per query, and the cross apply operator is used to unpivot the columns into name/value pairs. The result of the update is shown in the following table:

select * from Audit.vAudit

 

ChangeId | Operation | TableSchema | TableName | Label | PK | FieldName | OldValue | NewValue | IsColumnChanged | UTCUpdateDate | UserName
1 | U | Sales | SalesTerritory | Sales territory | [TerritoryID] = 1 | TerritoryID | 1 | 1 | 0 | 2010-05-10 22:22:38.247 | Amilo\Rogas
1 | U | Sales | SalesTerritory | Sales territory | [TerritoryID] = 1 | Name | Northwest | Northwest 1 | 1 | 2010-05-10 22:22:38.247 | Amilo\Rogas
1 | U | Sales | SalesTerritory | Sales territory | [TerritoryID] = 1 | CountryRegionCode | US | US | 0 | 2010-05-10 22:22:38.247 | Amilo\Rogas
1 | U | Sales | SalesTerritory | Sales territory | [TerritoryID] = 1 | Group | North America | North America | 0 | 2010-05-10 22:22:38.247 | Amilo\Rogas
1 | U | Sales | SalesTerritory | Sales territory | [TerritoryID] = 1 | SalesYTD | 5767341.9752 | 5767341.9752 | 0 | 2010-05-10 22:22:38.247 | Amilo\Rogas
1 | U | Sales | SalesTerritory | Sales territory | [TerritoryID] = 1 | SalesLastYear | 3298694.4938 | 3298694.4938 | 0 | 2010-05-10 22:22:38.247 | Amilo\Rogas
1 | U | Sales | SalesTerritory | Sales territory | [TerritoryID] = 1 | CostYTD | 0.00 | 0.00 | 0 | 2010-05-10 22:22:38.247 | Amilo\Rogas
1 | U | Sales | SalesTerritory | Sales territory | [TerritoryID] = 1 | CostLastYear | 0.00 | 0.00 | 0 | 2010-05-10 22:22:38.247 | Amilo\Rogas
1 | U | Sales | SalesTerritory | Sales territory | [TerritoryID] = 1 | rowguid | 43689A10-E30B-497F-B0DE-11DE20267FF7 | 43689A10-E30B-497F-B0DE-11DE20267FF7 | 0 | 2010-05-10 22:22:38.247 | Amilo\Rogas
1 | U | Sales | SalesTerritory | Sales territory | [TerritoryID] = 1 | ModifiedDate | 1998-06-01 00:00:00.000 | 1998-06-01 00:00:00.000 | 0 | 2010-05-10 22:22:38.247 | Amilo\Rogas

As you can see, all column values are stored in the audit table. You can easily change this if you choose to do so, but in my case, this was the requirement. To help identify changed columns, there is the IsColumnChanged column in the view. Mind, though, that if you run an update statement that touches a column within the SET clause, the column will be reported as changed even if there was no actual change of the value.

TODO – the framework doesn’t support LOB types, as there is no conversion from them to the sql_variant data type. If you need LOB columns to be audited, you can convert them to the corresponding “short” types, like varchar(8000) and nvarchar(4000), but you may lose data with such a conversion. The other option would be to create separate LOB audit columns to keep these values.
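
If lossy truncation is acceptable, the idea looks like this (my sketch; @Notes stands in for a hypothetical nvarchar(max) column):

declare @Notes nvarchar(max), @v sql_variant
set @Notes = replicate(convert(nvarchar(max), N'x'), 5000)

-- a LOB cannot be converted to sql_variant directly; truncating it to
-- nvarchar(4000) first makes it storable, at the cost of losing the tail
set @v = convert(sql_variant, convert(nvarchar(4000), @Notes))
select datalength(@v) / 2 as StoredCharacters -- 4000, not 5000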

 

I attach the script that installs the framework; you can download it using this link.

Be careful when running it twice, because it removes the audit tables and any information that is stored in them.

 

This concludes this post and the audit framework miniseries. Please feel free to add any comments you have below.

 
