January 2010 - Posts

It's beyond me to create a one to one relationship in SQL Server if you are inserting directly into the related tables; in this article I discuss just how you go about actually implementing a workable one to one relationship. Please feel free to comment or send me email to tonyrogerson@sqlserverfaq.com to discuss.

 

First of all let’s define what we mean by a “one to one relationship”.

A relationship is the definition of two “relations” (tables) having some logical data relationship or dependency such as one to one, one to zero or many, one to many etc.

The one to one relationship means that there must be at least one row in the left table and one corresponding row in the right table, a good definition can be found here: http://folkworm.ceri.memphis.edu/ew/SCHEMA_DOC/comparison/erd.htm (Understanding Entity Relationship Diagrams – section “relationships”).


In SQL Server terms that would be modelled something like below...


create
table Customers (


 

       CustomerSurrogateID  int    not null      identity     

              constraint pk_Customers primary key clustered,

      

       CustomerID           char(5)       not null

              constraint uk_Customer_CustomerID unique,

             

       Name   varchar(100)

       )

      

      

create table CustomersDetail (


 

       CustomerSurrogateID  int    not null

              constraint pk_CustomersDetail primary key clustered

              constraint fk_CustomersDetail_Customers foreign key references Customers( CustomerSurrogateID ),

             

       InitialOrderDate     datetime not null

       )

go


 

If you tried to create a foreign key reference between Customers and CustomersDetail actually in the CREATE TABLE statement you will have noticed that it fails because you haven’t created the CustomersDetail table – this eludes to the problem we are facing when it comes to inserting the data.


 

--     Can only add the foreign key reference here...

alter table Customers add constraint fk_Customers_CustomersDetail foreign key ( CustomerSurrogateID ) references CustomersDetail( CustomerSurrogateID )

go


In the schema above I use a surrogate key on Customers, a surrogate key being an automatically generated value that is part of the logical model and an attribute on the table but has no meaning within the business, it can be exposed to the application developer because it is stable and never changes, I mention that because often there is confusion between a surrogate key and a tuple id (row id), the difference is that the surrogate key never changes whereas the tuple id will change as deletes and inserts occur in the table, anyway, you can’t get access to the tuple-id in SQL Server.


Ok, we have our schema so what happens when we try and insert into the two tables?


We start our transaction so we keep consistency on the inserts into the two tables:



begin
tran


We now do the insert:


insert
Customers ( CustomerID, Name ) values( 'abcde', 'Funky Parts Ltd' )


What happened? Well, we get the error message below:


Msg 547, Level 16, State 0, Line 1

The INSERT statement conflicted with the FOREIGN KEY constraint "fk_Customers_CustomersDetail". The conflict occurred in database "MScBI2010", table "dbo.CustomersDetail", column 'CustomerSurrogateID'.

The statement has been terminated.


 

Why? SQL Server is right to barf; its barf’d because when it checks for a corresponding row in the CustomersDetail table there isn’t one. Now think – is it a simple case of inserting the row in the CustomersDetail table first? Well no, that’s because it’s exactly the same circumstance – there would need to be a corresponding row in the Customers table – basically you are stuck and our use of the FOREIGN KEY constraint for RI has failed us, well – failed us in the practical sense – but it did do its job!


Moving on, why do we have the one to one relationship in the first place? Why not just merge the two sets of columns into a single table and be done with it? Logically the split may make sense, also from an implementation point of view you may want to do it to reduce locking contention thus increasing concurrency.


Worse still, what happens when we execute the above in a transaction? Would you expect the constraint violation to cause the transaction to rollback?



begin
tran


 

insert Customers ( CustomerID, Name ) values( 'abcde', 'Funky Parts Ltd' )


 

select [count(*)]       = ( select COUNT(*) from Customers )

select [scope_identity] = scope_identity()             --     what is the surrogate value?

select [@@trancount]    = @@TRANCOUNT                  --     did it rollback?


 

rollback


 

Msg 547, Level 16, State 0, Line 3

The INSERT statement conflicted with the FOREIGN KEY constraint "fk_Customers_CustomersDetail". The conflict occurred in database "MScBI2010", table "dbo.CustomersDetail", column 'CustomerSurrogateID'.

The statement has been terminated.


 

count(*)

-----------

0


 

scope_identity

---------------------------------------

NULL


 

@@trancount

-----------

1


Two things have happened; firstly the row did not get inserted – the clue there is the result “the statement has been terminated”, remember “statement”; but the @@trancount proves we are still in the original outer transaction.


Realistically (<sarcasm>and we all do this don’t we</>) we should use BEGIN TRY / CATCH to trap the error – example below:



begin
tran


 

begin try


 

       insert Customers ( CustomerID, Name ) values( 'abcde', 'Funky Parts Ltd' )


 

end try

begin catch


 

       rollback tran

       print 'rolled back'


 

end catch


 

if @@TRANCOUNT > 0

begin

       commit tran

       print 'committed'

end



Moving forward I see the two tables logically connected and as such I think we can model a view across the pair of them in the Physical Model, the view is below:



create
view vw_CustomersCustomerDetails

       with schemabinding

as

       select cCustomerID = c.CustomerID, cCustomerSurrogateID = c.CustomerSurrogateID, cName = c.Name,

                 cdCUstomerSurrogateID = cd.CustomerSurrogateID,

                 cdDummyInsert = cd.DummyInsert,

                 cdInitialOrderDate = cd.InitialOrderDate,

                 cdSomeUniqueColReqd = cd.SomeUniqueColReqd

       from dbo.Customers c

              inner join dbo.CustomersDetail cd on cd.CustomerSurrogateID = c.CustomerSurrogateID

go


 

We pull all the columns from both tables and use an INNER JOIN to form the relationship. The naming of the columns needs thinking about and is a personal choice, from my experience I’d say that you ought to name them firstly prefixing with the source table then source column name for example Customers_CustomerID.


We can now insert into the view:



insert
vw_CustomersCustomerDetails ( cCustomerID ) values ( 1 )


 

(1 row(s) affected)


Let’s look at our row:

select * from vw_CustomersCustomerDetails


cCustomerID cCustomerSurrogateID cName                                                                                                cdCUstomerSurrogateID cdDummyInsert cdInitialOrderDate      cdSomeUniqueColReqd

----------- -------------------- ---------------------------------------------------------------------------------------------------- --------------------- ------------- ----------------------- -------------------


 

(0 row(s) affected)


 

Interestingly although we did the insert, got one row affected we do not have in rows returned.


Try the insert again...


insert
vw_CustomersCustomerDetails ( cCustomerID ) values ( 1 )


Msg 2627, Level 14, State 1, Line 1

Violation of UNIQUE KEY constraint 'uk_Customer_CustomerID'. Cannot insert duplicate key in object 'dbo.Customers'.

The statement has been terminated.


Lol, we have 0 rows returned from querying vw_Customer sCustomerDetails yet we get a violation of the unique key on Customers.


Ok, something really daft is going on here and it’s breaking Codds Rule 6 (
http://www.databaseanswers.org/codds_rules.htm); the insert should not work because the View is not updating both tables; this is simply a side effect of the way that the query optimiser expands the View SQL into the main body of the query which I’ve covered on my blog here: http://sqlblogcasts.com/blogs/tonyrogerson/archive/2008/01/03/views-they-offer-no-optimisation-benefits-they-are-simply-inline-macros-use-sparingly.aspx.


We can prevent the engine from doing this silliness by using what is called an INSTEAD OF trigger, there are two types of trigger an INSTEAD OF which is before the update/insert/delete takes place and puts you in control of what will happen because that is the nature of that feature – you are proving logic “instead of” the original dml statement and the after trigger which is post update/insert/delete.



create
trigger trg_CustomersCustomerDetails_instof

       on vw_CustomersCustomerDetails instead of insert

as

begin


 

       --     First insert the Customer, the trigger on Customer

       --     will take care of inserting the CustomerDetail dummy record.

       insert Customers ( CustomerID, Name )

              select cCustomerID, cName

              from inserted

             

       --     Now update the dummy record with the CustomersDetail

       update CustomersDetail

              set DummyInsert = NULL,           --     we now have real data

                     InitialOrderDate = i.cdInitialOrderDate,

                     SomeUniqueColReqd = i.cdSomeUniqueColReqd

       from inserted i

              inner join Customers c on c.CustomerSurrogateID = i.cdCustomerSurrogateID


 

end

go


The above creates our INSTEAD OF trigger thus intercepting the INSERT on the view; we must do the “Insert” processing ourselves, the rows inserted into the view are contained within the system generated “inserted” table.


We are little out of sequence so let’s get back to the solution; we are basically going to use a dummy row in the CustomersDetail table, I can hear people drawing breath already in terms of RI – how can we enforce uniqueness etc. I’ve thought about that and will show you later on.


We need to modify our “right” table and make all the columns except our key Nullable. We also implement our CHECK constraints with slightly difference logic as you can see on the CustomersDetail table and the InitialOrderDate. The CHECK CONSTRAINT must be a table level CHECK CONSTRAINT otherwise bets are off eg.



create
table Customers (


 

       CustomerSurrogateID  int    not null      identity     

              constraint pk_Customers primary key clustered,

      

       CustomerID           char(5)       not null

              constraint uk_Customer_CustomerID unique,

             

       Name   varchar(100)

       )

      

create table CustomersDetail (


 

       CustomerSurrogateID  int    not null

              constraint pk_CustomersDetail primary key clustered

              constraint fk_CustomersDetail_Customers foreign key references Customers( CustomerSurrogateID ),

      

       DummyInsert                char(1) null

              constraint uk_CustomersDetail check( coalesce( DummyInsert, 'N' ) in ( 'Y', 'N' ) ),

             

       InitialOrderDate     datetime null,

      

       SomeUniqueColReqd    int null,

      

              check( ISNULL( DummyInsert, 'N' ) = 'Y' or ( DummyInsert IS NULL and InitialOrderDate IS NOT NULL ) )


 

       )

go


 

If you’ve blatted the tables remember to recreate the trigger:



create
trigger trg_CustomersCustomerDetails_instof

       on vw_CustomersCustomerDetails instead of insert

as

begin


 

       --     First insert the Customer, the trigger on Customer

       --     will take care of inserting the CustomerDetail dummy record.

       insert Customers ( CustomerID, Name )

              select cCustomerID, cName

              from inserted

             

       --     Now update the dummy record with the CustomersDetail

       update CustomersDetail

              set DummyInsert = NULL,           --     we now have real data

                     InitialOrderDate = i.cdInitialOrderDate,

                     SomeUniqueColReqd = i.cdSomeUniqueColReqd

       from inserted i

              inner join Customers c on c.CustomerSurrogateID = i.cdCustomerSurrogateID


 

end

go


The above is not enough for our solution to work properly; inserting into the view is fine now but we need to have trigger logic on both the Customers and CustomersDetail tables.


 

create trigger trg_Customers_ins on Customers for insert

as

begin


 

       --     Make sure there is a row in the table we have a one to one with.

       --     It cannot contain data.

       insert CustomersDetail ( CustomerSurrogateID, DummyInsert )

              select CustomerSurrogateID, 'Y'

              from inserted


 

end

go


 

create trigger trg_CustomersDetail_del on CustomersDetail for delete

as

begin

       --     Need to remove the corresponding Customers row to keep consistency

      

       delete Customers

       where CustomerSurrogateID in ( select CustomerSurrogateID

                                      from deleted )


 

end

go


The solution basically inserts a dummy row into the CustomersDetail table – horrible I know, but we incorporate a DummyInsert column that is set to Y whenever that row is a Dummy; that then allows us to still use CHECK constraints; UNIQUE constraints are implemented using a unique filtered index such as below:



create
unique index fltd_CustomersDetail_InitialOrderDate

       on CustomersDetail( SomeUniqueColReqd )

       where DummyInsert IS NULL         --     Ignore dummy inserts because of one to one relationship

go


In summary; implementing a true one to one relationship in SQL Server is far from easy but I’ve shown you “a way” of doing it. The method uses the dummy row approach which is the only way that would work, but ordinarily if you just use a trigger on the left table to insert the dummy row there is nothing forcing the application developer to remember to go on and update the columns in the right table – inserting through the view forces that; lastly – the use of the DummyInsert column enables you to properly enforce constraints on columns that really do allow Nulls. On the columns that are logically NOT NULL you can do a CHECK CONSTRAINT similar to CHECK ( ( DummyInsert IS NULL AND {your col} IS NOT NULL) OR DummyInsert IS NULL).

UPDATE: 2010-01-23T14:20:00

Been thinking further on this and there are couple of holes that need to be plugged, one of them is updating the surrogate key on the CustomerOrders table - that needs "something" on there to make it insert only and not updateable - any suggestions, hint: something in an update trigger :)

UPDATE: 2010-01-28T14:39:00

Talking in the private groups with Adam Machanic he rightly points out we need deferrable constraints that operate when the final transaction commits which is right - never thought about that, anyway checking the ISO standard its in there....

"The checking of a constraint depends on its constraint mode within the current SQL-transaction. If the constraint mode is immediate, then the constraint is effectively checked at the end of each SQL-statement. If the constraint mode is deferred, then the constraint is effectively checked when the constraint mode is changed to immediate either explicitly by execution of a <set constraints mode statement>, or implicitly at the end of the current SQL-transaction."

We don't have deferrable constraints in SQL Server unfortunetly.

Allan Mitchell and I are starting our Business Intelligence Masters this week at the University of Dundee, as we are up here anyway it seemed a good idea to put on a user group meeting, so, details below - it's this Wednesday in the Queen Mothers Building from 6pm - 9pm.

You received this email because you are a member of the UK SQL Server user group, if you no longer want to receive these emails then just reply to me with the word remove in the subject - it's a manual task so please be patient if I missed you last time.

Please check the full and modified agenda here: http://sqlserverfaq.com?eid=211.

We are currently all talking about when and where to hold meetings, this will get formalised by the end of this month so watch this site :).

We are in the Queen Mothers Building; any problems my mobile is 0796 816 0362.

For directions on how to get their look at the map of campus found here: http://www.computing.dundee.ac.uk/travel.asp, it's literally a 10 minute walk from Dundee station.

Agenda

18:00 - 18:15 Meet & Greet

Meet up and socialise with your friends, meet new people, find out what other people are doing with SQL Server.

18:15 - 18:30 Round Table Discussion - Q & A from and to the audience

Bring your SQL problems and hopefully somebody within the room will be able to fix or advise you. There is always a wealth of experienced people at the user group meetings.

18:30 - 18:45 LiveLock: Always use IDENTITY for your keys

Two sides of the audience, one will argue FOR the statement and one will argue AGAINST.

18:45 - 19:30 SQL Server Tools

Tony Rogerson, SQL Server MVP

You've been tasked with looking after or developing against SQL Server - what next? How can I make sure my SQL is going to work in production and not grind the entire system to a halt?

I will take you through Management Studio and Profiler so you fully understand what is in there and how to use it; when showing query plans I will explain from an entry level how to read them and what some of the access paths and join types mean.

19:30 - 19:50 Break for Pizza

More time to network with your peers, meet new friends and get your questions answered.

19:50 - 21:00 Introduction to SQL Server Integration Services 2008

Allan Mitchell, SQL Server MVP

ETL? You've heard the word (or perhaps not); anyway - you want to know what this really powerful feature that comes with the SQL Server product set is.

Allan will show you some of the killer features in SSIS 2008, this will be an interactive session so bring your questions!

Many thanks,

Tony Rogerson, SQL Server MVP
http://sqlserverfaq.com
http://sqlblogcasts.com/blogs/tonyrogerson
http://twitter.com/tonyrogerson

Yes, copying between word and email always puts an extra line between code lines which makes the code hard to read.

The answer is simple; in Management Studio go into Find and Replace, in "Find Options" check "Use" and select "Regular Expressions".

Now, in the Find what: box enter the text ^\n

In the Replace with: box enter nothing, please don't take me literally on that and actually type the word nothing, I mean leave the text box empty.

Click Replace All.

Da dah!