This blog will no longer be updated.

New content is available on my new blog

COLUMNS_UPDATED() for audit triggers - Piotr Rodak

COLUMNS_UPDATED() for audit triggers

In SQL Server 2005, triggers are pretty much the only option if you want to audit changes to a table. There are many ways you can decide to store the change information. You may decide to store every changed row as a whole, either in a history table or as xml in audit table. The former case requires having a history table with exactly same schema as the audited table, the latter makes data retrieval and management of the table a bit tricky. Both approaches also suffer from the tendency to consume big amounts of space, especially for wide tables. There is usually no sense of storing unchanged data, in 2 copies for update operations (from deleted and inserted psuedotables). You can also choose the third approach, to store only changed data in the audit table. This post focuses on methods of identifying columns changed by the DML statements. Let’s start with creating a sample table.


if exists(select 1 from sys.tables where object_id = object_id('[dbo].[tManyColumns]')
drop table [dbo].[tManyColumns]

GO

create
table [dbo].[tManyColumns]
(
[c1] [int] NOT NULL,
[c2] [int] NOT NULL,
[c3] [int] NULL,
[c4] [int] NULL,
[c5] [int] NULL,
[c6] [int] NULL,
[c7] [int] NULL,
[c8] [int] NULL,
[c9] [int] NULL,
[c10] [int] NULL,
[c12] [int] NULL,
[c13] [int] NULL,
[c14] [int] NULL,
[c15] [int] NULL,
[c16] [int] NULL,
[c17] [int] NULL,
[c18] [int] NULL,
[c19] [int] NULL,
constraint [PK1] primary key clustered
(
[c1] ASC,
[c2] ASC
)
)
ON [PRIMARY]

go
--populate the table

SET NOCOUNT ON;
SET XACT_ABORT ON;
GO

BEGIN
TRANSACTION;
INSERT INTO [dbo].[tManyColumns]([c1], [c2], [c3], [c4], [c5], [c6], [c7],
[c8], [c9], [c10], [c12], [c13], [c14], [c15], [c16], [c17], [c18], [c19])
SELECT 1, 3, 32, 14, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
NULL, NULL, 88, NULL, NULL UNION ALL
SELECT
1, 4, 32, 14, NULL, NULL, NULL, NULL, NULL, 1, NULL, NULL, NULL,
NULL, NULL, 88, NULL, NULL UNION ALL
SELECT
2, 1, 32, 14, NULL, 1, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
NULL, 11, 88, NULL, NULL
COMMIT
;
RAISERROR (N'[dbo].[tManyColumns]: Insert Batch: 1.....Done!', 10, 1) WITH NOWAIT;

GO
--check contents of the table
select * from dbo.tManyColumns

go

Please note that script with INSERT statements was generated by excellent SSMS Tools Pack.

Triggers in SQL Server are fired once for data change operation (insert, update, delete), not for every row. This is important to know, many developers assume that trigger is called for every row within the updated set. Interestingly, if there are no rows updated, trigger is fired anyway (for example update testtbl where 1 = 0). How do you know a trigger was fired for particular operation? You have two choices here.
You can create a separate trigger for each of the operations:

create trigger trigDel on dbo.tManyColumns for delete 
as
print 'DELETE trigger'

GO

create
trigger trigUpd on dbo.tManyColumns for update
as
print 'UPDATE trigger'

GO

create
trigger trigIns on dbo.tManyColumns for insert
as
print 'INSERT trigger'

This approach while clearly separates particular actions, also has tendency to multiply practically the same code. So most implementations I have seen use other approach. You can test inserted and deleted pseudo-tables available in trigger for presence of data. In case of DELETE, there will be no rows in inserted table and for INSERT, deleted table will be empty. For UPDATE, both tables will be populated, inserted with new data and deleted with old data. I never realized this before, but you shouldn't assume that the trigger is fired for DELETE because there are no rows in inserted, because it may be fired for UPDATE where no rows match the criteria - and both inserted and deleted tables will be empty.
How to know which columns have been changed?
SQL Server provides two methods of finding out which columns have been updated during particular operation:
UPDATE( column ) - This function returns true if particular column is updated, false otherwise.
COLUMNS_UPDATED() - This function returns one or more bytes that contain bitmaps representing updated (and not updated) columns.

So, you can implement a trigger that will react to particular column update more or less like this:

create trigger trigCol1 on dbo.tManyColumns for insert, update, delete 
as
if UPDATE (c3)
print 'Hey, column c3 has been just updated!'


Of course, it's up to your imagination, (or your project requirements) what you can do with this information. You can audit change of this column, you can trigger some actions that have to happen when this column is updated, eventually you can calculate some other values (possibly in other tables) based on new value of the column. You can also rollback the transaction if the new value of column doesn't meet business or integrity requirements.

 

The drawback of this method is that you have to check condition for every column and you have to deal with column names and for a hundred columns in a table it is really not a feasible option. This is where COLUMNS_UPDATED functions comes in rescue.
The COLUMNS_UPDATED returns a bitmap where bits set to 1 indicate changed columns, while these set to 0 mean that the columns were not changed. It is worth to mention here that 'changed' doesn't mean that they values were changed, it means that columns where on the left side of assignment operator in update statement. So this statement
update testtbl set col2 = col2
will set the bit mask for the column col2 to 1 even though the actual column's value hasn't changed.

The COLUMNS_UPDATED function returns varbinary type string (is there better word?). Its length depends on the number of columns that are defined in the table. Each byte will contain mask for eight consecutive columns, and the last byte will keep mask for the remaining columns. For up to 8 columns there is going to be 1 byte, for 12 columns 2 bytes, for 17 columns - 3 bytes.
If you look at the above numbers, you can see, that a formula for the number of bytes is ((@n - 1) / 8 + 1), where @n is max column_id in the table.

select  
((7 - 1) / 8 + 1) [bytes for 7 columns],
((12 - 1) / 8 + 1) [bytes for 12 columns],
((17 - 1) / 8 + 1) [bytes for 17 columns]

You can get all columns from information_schema.columns view, but this view will not return column_id, but ordinal position for these columns. To get column_id values you can use sys.columns table:

select name, column_id from sys.columns where object_id = object_id('testtbl')

Alternatively you can use COLUMNPROPERTY function:

select ordinal_position, column_name, 
COLUMNPROPERTY(object_id(table_name), column_name, 'ColumnId') column_id
from information_schema.columns

The COLUMNS_UPDATED return value is curiously arranged in a rather counter-intuitive way, at least this was my first impression. The value is as I said divided into bytes, and each byte contains bit mask for eight consecutive columns. The catch is that bytes are ordered left to right:

1stbyte->2ndbyte->3rdbyte..., while in bytes, bits are ordered right to left, with least significant bit on the right (this is standard binary number representation).

You end up with something like this:
[{8}, {7}, {6}, {5}, {4}, {3}, {2}, {1}][{16}, {15}, {14}, {13}, {12}, {11}, {10}, {9}][{24}, {23}, {22}, {21}, {20}, {19}, {18}, {17}]
The square brackets represent bytes, numbers in curly braces represent column_id values.

What can you do with this bitmap? If you want to check if particular column was updated, first you have to select the byte relevant for this column and then check if the flag for the column is set to 1 or 0. This sounds complicated, but it isn’t.

You can use SUBSTRING function to select only one byte of the mask, like this:

select SUBSTRING(COLUMNS_UPDATED(), 2, 1) 

The above statement returns second byte in the mask. Of course, you wouldn’t want to work with hardcoded values, but the formula for the byte number of a particular column is same as in case of calculating length of the whole mask: just take column_id of the column, subtract 1, divide by 8 and add 1 to the result.

select SUBSTRING(COLUMNS_UPDATED(), ((@col_id - 1) / 8 + 1), 1)

Now, we have our byte, we need to extract the bit value for the column. To calculate position of the bit within selected byte, you have to subtract 1 from the column_id, modulo divide the result by 8 and add 1 to the result of the division.

@bit_position = (@col_id-1)%8 + 1

The modulo operation returns the reminder of the division. So for col_id 5 it will be 5, for 12 it will be 4, for 17 the operation will return 1:

select 
(5-1)%8 + 1 [5th bit in 1st byte],
(12-1)%8 + 1 [4th bit on 2nd byte],
(17-1)%8 + 1 [1st bit in 3rd byte]

Now, when we have position, we can calculate mask that we will use in turn to extract the bit indicating if the column was updated or not. There is another formula, and this is the last one in this post: to get the mask value, you have to raise 2 to the power equal to bit position minus 1. So the formula looks like this:

@mask = power(2, @bit_position – 1)

If we replace @bit_position with the formula for the bit position, we get

@mask = power(2, (@col_id-1)%8 + 1 – 1) = power(2, (@col_id-1)%8)

select 
(5-1)%8 + 1 [5th bit in 1st byte],
power(2, (5-1)%8) [mask for the bit],
(12-1)%8 + 1 [4th bit on 2nd byte],
power(2, (12-1)%8) [mask for the bit],
(17-1)%8 + 1 [1st bit in 3rd byte],
power(2, (17-1)%8) [mask for the bit]

Ok, let’s wrap it up. We know all the formulas required to get updated column information from the bitmap, how to use this in a trigger? Many developers use in this case a loop that calculates the byte, mask and retrieves name of the updated column. I am not going to do this.

Some time ago I wrote about numbers table, and you can find there links to clever and very useful implementations of this technique. You can also apply this technique to COLUMNS_UPDATED bitmap. Here’s one approach:


if exists(select 1 from sys.triggers where parent_id = object_id('[dbo].[tManyColumns]')
and name = 'trg1')
drop trigger trg1

go

create
trigger trg1 on [dbo].[tManyColumns]
for insert, update, delete
as
begin
declare @table_id int

select @table_id = parent_id from sys.triggers where object_id = @@procid

select columns_updated() [columns_updated string]

;with column_updated_bytesCTE --divide bitmask into bytes. this query requires number table Admin.tTally in the database
as
(
select number ByteNumber, convert(binary(1),
substring(columns_updated(), number, 1)) [ByteValue]
from Admin.tTally
where number <= datalength(columns_updated())
),
columnsCTE as --return columns belonging to table @TableObjectId, calculate appropriate bit masks
(
select column_id, [name] column_name, ByteNumber, ByteValue,
power(2, (((a.column_id - 1 ) % 8) + 1) - 1) BitMask
from sys.columns a inner join
column_updated_bytesCTE b
on ((a.column_id - 1) / 8) + 1 = b.ByteNumber
where a.object_id = @table_id
)
select *,
case when ByteValue & BitMask > 0 then 'yes' else 'no' end [Is column updated?] from columnsCTE
end

go

This small example returns list of columns belonging to the table and indicates which columns have been updated. It also returns the value of columns_updated bitmask, so you can check that particular bytes in it are equal to ByteValue returned by the last query in the trigger. Here’s how you can test the trigger:

update dbo.tManyColumns set c10 = 1 where c1 = 1 and c2 = 4

Here’s output you will see in the results pane:

columns_updated string
--------------------------
0x000200

(1 row(s) affected)

column_id   column_name     ByteNumber  ByteValue BitMask     Is column updated?
----------- --------------- ----------- --------- ----------- ------------------
1           c1              1           0x00      1           no
2           c2              1           0x00      2           no
3           c3              1           0x00      4           no
4           c4              1           0x00      8           no
5           c5              1           0x00      16          no
6           c6              1           0x00      32          no
7           c7              1           0x00      64          no
8           c8              1           0x00      128         no
9           c9              2           0x02      1           no
10          c10             2           0x02      2           yes
12          c12             2           0x02      8           no
13          c13             2           0x02      16          no
14          c14             2           0x02      32          no
15          c15             2           0x02      64          no
16          c16             2           0x02      128         no
17          c17             3           0x00      1           no
18          c18             3           0x00      2           no
19          c19             3           0x00      4           no

(18 row(s) affected)

As you see, column C10, which has column_id belonging to the second byte is correctly indicated as updated. The ByteValue for this byte is 0x02, which corresponds to second position of the column (from the right) within the byte. In the first resultset returned by the trigger you can see that columns_updated bitmap consists of three bytes, and second byte (from the left) is 0x02.

Given this information, it is pretty easy to generate dynamic code that will insert changed data to an audit table. However, this is topic for another post. Stay tuned :)

Published 28 April 2010 23:27 by Piotr Rodak

Comments

# Twitter Trackbacks for COLUMNS_UPDATED() for audit triggers - Piotr Rodak [sqlblogcasts.com] on Topsy.com

Pingback from  Twitter Trackbacks for                 COLUMNS_UPDATED() for audit triggers - Piotr Rodak         [sqlblogcasts.com]        on Topsy.com

# The penultimate audit trigger framework

So, it’s time to see what I came up with after some time of playing with COLUMNS_UPDATED() and bitmasks

11 May 2010 00:28 by Piotr Rodak