23 January 2010 10:36
tonyrogerson
How to create a One to One relationship in SQL Server using DRI, Triggers and Views
It's beyond me to create a one to one relationship in SQL Server if you are inserting directly into the related tables; in this article I discuss just how you go about actually implementing a workable one to one relationship. Please feel free to comment or send me email to tonyrogerson@sqlserverfaq.com to discuss.
First of all let’s define what we mean by a “one to one relationship”.
A relationship is the definition of two “relations” (tables) having some logical data relationship or dependency such as one to one, one to zero or many, one to many etc.
The one to one relationship means that there must be at least one row in the left table and one corresponding row in the right table, a good definition can be found here: http://folkworm.ceri.memphis.edu/ew/SCHEMA_DOC/comparison/erd.htm (Understanding Entity Relationship Diagrams – section “relationships”).
In SQL Server terms that would be modelled something like below...
create table Customers (
CustomerSurrogateID int not null identity
constraint pk_Customers primary key clustered,
CustomerID char(5) not null
constraint uk_Customer_CustomerID unique,
Name varchar(100)
)
create table CustomersDetail (
CustomerSurrogateID int not null
constraint pk_CustomersDetail primary key clustered
constraint fk_CustomersDetail_Customers foreign key references Customers( CustomerSurrogateID ),
InitialOrderDate datetime not null
)
go
If you tried to create a foreign key reference between Customers and CustomersDetail actually in the CREATE TABLE statement you will have noticed that it fails because you haven’t created the CustomersDetail table – this eludes to the problem we are facing when it comes to inserting the data.
-- Can only add the foreign key reference here...
alter table Customers add constraint fk_Customers_CustomersDetail foreign key ( CustomerSurrogateID ) references CustomersDetail( CustomerSurrogateID )
go
In the schema above I use a surrogate key on Customers, a surrogate key being an automatically generated value that is part of the logical model and an attribute on the table but has no meaning within the business, it can be exposed to the application developer because it is stable and never changes, I mention that because often there is confusion between a surrogate key and a tuple id (row id), the difference is that the surrogate key never changes whereas the tuple id will change as deletes and inserts occur in the table, anyway, you can’t get access to the tuple-id in SQL Server.
Ok, we have our schema so what happens when we try and insert into the two tables?
We start our transaction so we keep consistency on the inserts into the two tables:
begin tran
We now do the insert:
insert Customers ( CustomerID, Name ) values( 'abcde', 'Funky Parts Ltd' )
What happened? Well, we get the error message below:
Msg 547, Level 16, State 0, Line 1
The INSERT statement conflicted with the FOREIGN KEY constraint "fk_Customers_CustomersDetail". The conflict occurred in database "MScBI2010", table "dbo.CustomersDetail", column 'CustomerSurrogateID'.
The statement has been terminated.
Why? SQL Server is right to barf; its barf’d because when it checks for a corresponding row in the CustomersDetail table there isn’t one. Now think – is it a simple case of inserting the row in the CustomersDetail table first? Well no, that’s because it’s exactly the same circumstance – there would need to be a corresponding row in the Customers table – basically you are stuck and our use of the FOREIGN KEY constraint for RI has failed us, well – failed us in the practical sense – but it did do its job!
Moving on, why do we have the one to one relationship in the first place? Why not just merge the two sets of columns into a single table and be done with it? Logically the split may make sense, also from an implementation point of view you may want to do it to reduce locking contention thus increasing concurrency.
Worse still, what happens when we execute the above in a transaction? Would you expect the constraint violation to cause the transaction to rollback?
begin tran
insert Customers ( CustomerID, Name ) values( 'abcde', 'Funky Parts Ltd' )
select [count(*)] = ( select COUNT(*) from Customers )
select [scope_identity] = scope_identity() -- what is the surrogate value?
select [@@trancount] = @@TRANCOUNT -- did it rollback?
rollback
Msg 547, Level 16, State 0, Line 3
The INSERT statement conflicted with the FOREIGN KEY constraint "fk_Customers_CustomersDetail". The conflict occurred in database "MScBI2010", table "dbo.CustomersDetail", column 'CustomerSurrogateID'.
The statement has been terminated.
count(*)
-----------
0
scope_identity
---------------------------------------
NULL
@@trancount
-----------
1
Two things have happened; firstly the row did not get inserted – the clue there is the result “the statement has been terminated”, remember “statement”; but the @@trancount proves we are still in the original outer transaction.
Realistically (<sarcasm>and we all do this don’t we</>) we should use BEGIN TRY / CATCH to trap the error – example below:
begin tran
begin try
insert Customers ( CustomerID, Name ) values( 'abcde', 'Funky Parts Ltd' )
end try
begin catch
rollback tran
print 'rolled back'
end catch
if @@TRANCOUNT > 0
begin
commit tran
print 'committed'
end
Moving forward I see the two tables logically connected and as such I think we can model a view across the pair of them in the Physical Model, the view is below:
create view vw_CustomersCustomerDetails
with schemabinding
as
select cCustomerID = c.CustomerID, cCustomerSurrogateID = c.CustomerSurrogateID, cName = c.Name,
cdCUstomerSurrogateID = cd.CustomerSurrogateID,
cdDummyInsert = cd.DummyInsert,
cdInitialOrderDate = cd.InitialOrderDate,
cdSomeUniqueColReqd = cd.SomeUniqueColReqd
from dbo.Customers c
inner join dbo.CustomersDetail cd on cd.CustomerSurrogateID = c.CustomerSurrogateID
go
We pull all the columns from both tables and use an INNER JOIN to form the relationship. The naming of the columns needs thinking about and is a personal choice, from my experience I’d say that you ought to name them firstly prefixing with the source table then source column name for example Customers_CustomerID.
We can now insert into the view:
insert vw_CustomersCustomerDetails ( cCustomerID ) values ( 1 )
(1 row(s) affected)
Let’s look at our row:
select * from vw_CustomersCustomerDetails
cCustomerID cCustomerSurrogateID cName cdCUstomerSurrogateID cdDummyInsert cdInitialOrderDate cdSomeUniqueColReqd
----------- -------------------- ---------------------------------------------------------------------------------------------------- --------------------- ------------- ----------------------- -------------------
(0 row(s) affected)
Interestingly although we did the insert, got one row affected we do not have in rows returned.
Try the insert again...
insert vw_CustomersCustomerDetails ( cCustomerID ) values ( 1 )
Msg 2627, Level 14, State 1, Line 1
Violation of UNIQUE KEY constraint 'uk_Customer_CustomerID'. Cannot insert duplicate key in object 'dbo.Customers'.
The statement has been terminated.
Lol, we have 0 rows returned from querying vw_Customer sCustomerDetails yet we get a violation of the unique key on Customers.
Ok, something really daft is going on here and it’s breaking Codds Rule 6 (http://www.databaseanswers.org/codds_rules.htm); the insert should not work because the View is not updating both tables; this is simply a side effect of the way that the query optimiser expands the View SQL into the main body of the query which I’ve covered on my blog here: http://sqlblogcasts.com/blogs/tonyrogerson/archive/2008/01/03/views-they-offer-no-optimisation-benefits-they-are-simply-inline-macros-use-sparingly.aspx.
We can prevent the engine from doing this silliness by using what is called an INSTEAD OF trigger, there are two types of trigger an INSTEAD OF which is before the update/insert/delete takes place and puts you in control of what will happen because that is the nature of that feature – you are proving logic “instead of” the original dml statement and the after trigger which is post update/insert/delete.
create trigger trg_CustomersCustomerDetails_instof
on vw_CustomersCustomerDetails instead of insert
as
begin
-- First insert the Customer, the trigger on Customer
-- will take care of inserting the CustomerDetail dummy record.
insert Customers ( CustomerID, Name )
select cCustomerID, cName
from inserted
-- Now update the dummy record with the CustomersDetail
update CustomersDetail
set DummyInsert = NULL, -- we now have real data
InitialOrderDate = i.cdInitialOrderDate,
SomeUniqueColReqd = i.cdSomeUniqueColReqd
from inserted i
inner join Customers c on c.CustomerSurrogateID = i.cdCustomerSurrogateID
end
go
The above creates our INSTEAD OF trigger thus intercepting the INSERT on the view; we must do the “Insert” processing ourselves, the rows inserted into the view are contained within the system generated “inserted” table.
We are little out of sequence so let’s get back to the solution; we are basically going to use a dummy row in the CustomersDetail table, I can hear people drawing breath already in terms of RI – how can we enforce uniqueness etc. I’ve thought about that and will show you later on.
We need to modify our “right” table and make all the columns except our key Nullable. We also implement our CHECK constraints with slightly difference logic as you can see on the CustomersDetail table and the InitialOrderDate. The CHECK CONSTRAINT must be a table level CHECK CONSTRAINT otherwise bets are off eg.
create table Customers (
CustomerSurrogateID int not null identity
constraint pk_Customers primary key clustered,
CustomerID char(5) not null
constraint uk_Customer_CustomerID unique,
Name varchar(100)
)
create table CustomersDetail (
CustomerSurrogateID int not null
constraint pk_CustomersDetail primary key clustered
constraint fk_CustomersDetail_Customers foreign key references Customers( CustomerSurrogateID ),
DummyInsert char(1) null
constraint uk_CustomersDetail check( coalesce( DummyInsert, 'N' ) in ( 'Y', 'N' ) ),
InitialOrderDate datetime null,
SomeUniqueColReqd int null,
check( ISNULL( DummyInsert, 'N' ) = 'Y' or ( DummyInsert IS NULL and InitialOrderDate IS NOT NULL ) )
)
go
If you’ve blatted the tables remember to recreate the trigger:
create trigger trg_CustomersCustomerDetails_instof
on vw_CustomersCustomerDetails instead of insert
as
begin
-- First insert the Customer, the trigger on Customer
-- will take care of inserting the CustomerDetail dummy record.
insert Customers ( CustomerID, Name )
select cCustomerID, cName
from inserted
-- Now update the dummy record with the CustomersDetail
update CustomersDetail
set DummyInsert = NULL, -- we now have real data
InitialOrderDate = i.cdInitialOrderDate,
SomeUniqueColReqd = i.cdSomeUniqueColReqd
from inserted i
inner join Customers c on c.CustomerSurrogateID = i.cdCustomerSurrogateID
end
go
The above is not enough for our solution to work properly; inserting into the view is fine now but we need to have trigger logic on both the Customers and CustomersDetail tables.
create trigger trg_Customers_ins on Customers for insert
as
begin
-- Make sure there is a row in the table we have a one to one with.
-- It cannot contain data.
insert CustomersDetail ( CustomerSurrogateID, DummyInsert )
select CustomerSurrogateID, 'Y'
from inserted
end
go
create trigger trg_CustomersDetail_del on CustomersDetail for delete
as
begin
-- Need to remove the corresponding Customers row to keep consistency
delete Customers
where CustomerSurrogateID in ( select CustomerSurrogateID
from deleted )
end
go
The solution basically inserts a dummy row into the CustomersDetail table – horrible I know, but we incorporate a DummyInsert column that is set to Y whenever that row is a Dummy; that then allows us to still use CHECK constraints; UNIQUE constraints are implemented using a unique filtered index such as below:
create unique index fltd_CustomersDetail_InitialOrderDate
on CustomersDetail( SomeUniqueColReqd )
where DummyInsert IS NULL -- Ignore dummy inserts because of one to one relationship
go
In summary; implementing a true one to one relationship in SQL Server is far from easy but I’ve shown you “a way” of doing it. The method uses the dummy row approach which is the only way that would work, but ordinarily if you just use a trigger on the left table to insert the dummy row there is nothing forcing the application developer to remember to go on and update the columns in the right table – inserting through the view forces that; lastly – the use of the DummyInsert column enables you to properly enforce constraints on columns that really do allow Nulls. On the columns that are logically NOT NULL you can do a CHECK CONSTRAINT similar to CHECK ( ( DummyInsert IS NULL AND {your col} IS NOT NULL) OR DummyInsert IS NULL).
UPDATE: 2010-01-23T14:20:00
Been thinking further on this and there are couple of holes that need to be plugged, one of them is updating the surrogate key on the CustomerOrders table - that needs "something" on there to make it insert only and not updateable - any suggestions, hint: something in an update trigger :)
UPDATE: 2010-01-28T14:39:00
Talking in the private groups with Adam Machanic he rightly points out we need deferrable constraints that operate when the final transaction commits which is right - never thought about that, anyway checking the ISO standard its in there....
"The checking of a constraint depends on its constraint mode within the current SQL-transaction. If the constraint mode is immediate, then the constraint is effectively checked at the end of each SQL-statement. If the constraint mode is deferred, then the constraint is effectively checked when the constraint mode is changed to immediate either explicitly by execution of a <set constraints mode statement>, or implicitly at the end of the current SQL-transaction."
We don't have deferrable constraints in SQL Server unfortunetly.
Filed under: SQL Referential Integrity