This blog will no longer be updated.

New content is available on my new blog

on update cascade... - Piotr Rodak

on update cascade...

When you define a foreign key constraint, you can also decide what SQL Server should do when primary key to which the constraint references to gets deleted or updated. While I always knew and imagined that ON DELETE CASCADE may be useful, I wondered, what scenarios would be suitable for ON UPDATE CASCADE. I still don't have this answer, but I came across some interesting behavior which kept me occupied for quite a bit more time that I had intended to.

A few days ago I decided to implement additional logging to some of our reporting helper tables.These tables have primary key defined on pair of columns and one of these columns is incremented by our ETL to indicate the version that should be returned by views that expose data.

create schema Reports authorization dbo

go
if
object_id('Reports.ClientVersionMap') is not null
drop table Reports.ClientVersionMap
go
create
table Reports.ClientVersionMap
(
ClientID char(5) not null,
Version int not null,
ChangeDate datetime
)

alter table Reports.ClientVersionMap add constraint PK_Reports_ClientVersionMap_ClientId_Version
primary key clustered (ClientId, Version)

go

I took pretty simple approach. For each table (there were three) I created separate corresponding table in History schema. The definitions of the tables were pretty much identical, I just added a datetime column to timestamp the change.


create schema History authorization dbo

go
if
object_id('History.ClientVersionMap') is not null
drop table History.ClientVersionMap
go
create
table History.ClientVersionMap
(
ClientID char(5) not null,
Version int not null,
ChangeDate datetime,
HistoryChange datetime constraint DEF_HistoryChange default(current_timestamp)
)

alter table History.ClientVersionMap add constraint FK_ClientId_Version
foreign key(ClientId, Version) references Reports.ClientVersionMap(ClientId, Version)
on delete cascade on update cascade

go
create
clustered index IX_HistoryChange on History.ClientVersionMap(HistoryChange)
go
create
nonclustered index IX_ClientID_Version on History.ClientVersionMap(ClientId, Version)
go

Then I created a trigger to populate the History data. The trigger is created for insert and update actions, and it always inserts row or rows from inserted to History table.


create trigger Reports.trg_SaveClientVersionMap on Reports.ClientVersionMap
for insert, update
as
begin
insert History.ClientVersionMap(ClientId, Version, ChangeDate, HistoryChange)
select ClientId, Version, ChangeDate, current_timestamp
from inserted
end

Ok, let's see how it works.

insert Reports.ClientVersionMap(ClientId, Version, ChangeDate)
values ('AAAA', 1, current_timestamp)

select * from History.ClientVersionMap
ClientID Version     ChangeDate              HistoryChange
-------- ----------- ----------------------- -----------------------
AAAA 1 2009-09-25 00:25:28.813 2009-09-25 00:25:28.987

Ok, this looks as expected. I added one row to the Reports table and I've got one in the History table.

Now, what happens when I update existing row? I want to have two rows in the History table.

 

update Reports.ClientVersionMap set Version = Version + 1, ChangeDate = current_timestamp where ClientId = 'AAAA'

select * from History.ClientVersionMap

And indeed,  I have two:

ClientID Version     ChangeDate              HistoryChange
-------- ----------- ----------------------- -----------------------
AAAA 2 2009-09-25 00:25:28.813 2009-09-25 00:25:28.987
AAAA 2 2009-09-25 00:32:03.670 2009-09-25 00:32:03.670

But what happened? I was pretty surprised by this result. I should have one row with Version value of 1 and second row with 2. I thought I had clean quick solution, and all of a sudden a problem. I had more code in my script, some procs, view definition changes and so on. It took me a while (a bunch of print statements too) to see the reason of this behavior: on update cascade in the definition of the foreign key constraint of the History.ClientVersioMap table.

So, it works like this. You have primary key on table A and foreign key on table B that references table A with option ON UPDATE CASCADE. Whenever you modify the primary key on table A, the foreign key value is modified accordingly to reflect the change. In this way you don't have to bother about referential integrity in a way you might be used to. At least I am used. In my opinion you need to analyze carefully if putting this option to foreign key is safe (in a way). If you have it, basically nothing can stop you then from updating primary key and having the change propagated through referencing tables, possibly millions of rows. This can be pretty expensive experience. Also, it may not be a good idea to have mutable primary key - change of primary key in some cases may lead to fragmentation of index the key it is defining.

Definitely this option can be useful in some cases, but it has to be used with care, and not unintentionally, as it happened to me.

 

 

 

 

Published 24 September 2009 22:12 by Piotr Rodak

Comments

No Comments