30 March 2007 11:51 tonyrogerson

Blocking from Foreign Key Constraint look ups - Clustered (blocks) v Heap (doesn't block)

When using foreign key constraints watch out for blocking problems – remember for each reference clause and row you are inserting or updating (if the column you are updating is part of a foreign key reference) SQL Server will do a look up query to make sure the corresponding row exists in the referenced (parent) table.

An interesting behaviour came out of the wood work this week, I’m of the camp that every table should have a clustered index hence my surprise at a blocking problem I had with a client this week.

When the referenced table is a heap my query is fine and I get no blocking, change the reference column to be a clustered index and you get blocking, let’s have a look at the behaviour...

Heap (a table without a clustered index)

create table parent (

    id  int not null constraint sk_parent unique nonclustered,

 

    somedata varchar(20) not null

)

go

 

create table child (

 

    parent_id int not null references parent( id )

 

)

go

 

--  Make a row exist in the Parent...

INSERT parent ( id, somedata ) values( 1, 'NATKEY' )

 

Now to our test, run this on connection A..

BEGIN TRAN

 

UPDATE parent

    SET somedata = '12345'

where id = 1

 

Now, run this on connection B...

BEGIN TRAN

 

INSERT child ( parent_id ) VALUES( 1 )

 

All is fine and dandy, there is no blocking.

Clustered Index

create table parent (

    id  int not null constraint sk_parent unique CLUSTERED,

 

    somedata varchar(20) not null

)

go

 

create table child (

 

    parent_id int not null references parent( id )

 

)

go

 

--  Make a row exist in the Parent...

INSERT parent ( id, somedata ) values( 1, 'NATKEY' )

 

Now to our test, run this on connection A..

BEGIN TRAN

 

UPDATE parent

    SET somedata = '12345'

where id = 1

 

Now, run this on connection B...

BEGIN TRAN

 

INSERT child ( parent_id ) VALUES( 1 )

 

Interestingly connection B now blocks, but why? To answer that question we need to see what locks are being held and on what, we also need to understand more on how a heap and table with a clustered index is structured.

First though, a quick summary of Locking:

Key for Mode column

S  :: Shared Lock           I :: Intent          U :: Update Lock           X :: Exclusive

Key for TextData column

[PLANGUIDE]                Execution plan for the statement
1:41                              <file number>:<page number in database>
(010086470766) Hash of the key value in the index

Locks held for Heap method

Profiler Output for HEAP

 

The first thing we see from our profiler trace is that there are 3 data pages involved in this bit of work – pages 41, 80 and 120. Let’s take a look at what these pages are, to do this we use DBCC PAGE which is very well explained in Paul Randal’s Storage Engine blog here: http://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/10/625659.aspx.

First we enable output to our query window:

dbcc traceon ( 3604 )

Next we take a look at the first page, page 41:

dbcc page( 'concurrency', 1, 41, 3 )


Couple of things here, the output gives us a result set which shows the KeyHashValue – look familiar? It’s in the profiler trace TextData column too.

FileId PageId      Row    Level  id (key)    HEAP RID           KeyHashValue
------ ----------- ------ ------ ----------- ------------------ ----------------
1      41          0      0      1           0x7800000001000000 (010086470766)

PAGE: (1:41)

BUFFER:

BUF @0x0339A0FC

bpage = 0x9693A000                   bhash = 0x00000000                   bpageno = (1:41)
bdbid = 27                           breferences = 0                      bUse1 = 47401
bstat = 0x1c0000b                    blog = 0xbbbb79bb                    bnext = 0x00000000

PAGE HEADER:

Page @0x9693A000

m_pageId = (1:41)                    m_headerVersion = 1                  m_type = 2
m_typeFlagBits = 0x4                 m_level = 0                          m_flagBits = 0xc000
m_objId (AllocUnitId.idObj) = 103    m_indexId (AllocUnitId.idInd) = 256 
Metadata: AllocUnitId = 72057594044678144                                
Metadata: PartitionId = 72057594039697408                                 Metadata: IndexId = 2
Metadata: ObjectId = 405576483       m_prevPage = (0:0)                   m_nextPage = (0:0)
pminlen = 13                         m_slotCnt = 1                        m_freeCnt = 8081
m_freeData = 109                     m_reservedCnt = 0                    m_lsn = (25:116:36)
m_xactReserved = 0                   m_xdesId = (0:0)                     m_ghostRecCnt = 0
m_tornBits = 0                      

Allocation Status

GAM (1:2) = ALLOCATED                SGAM (1:3) = ALLOCATED              
PFS (1:1) = 0x60 MIXED_EXT ALLOCATED   0_PCT_FULL                         DIFF (1:6) = CHANGED
ML (1:7) = NOT MIN_LOGGED           

The two important bits of information that I draw your attention to is Metadata: IndexId and Metadata: ObjectId, IndexId of 2 is our non-clustered index; ObjectId: 405576483 is the table ‘parent’.

Let’s now have a look at page 120..

dbcc page( 'concurrency', 1, 120, 3 )

PAGE: (1:120)

 

 

BUFFER:

 

 

BUF @0x033B68FC

 

bpage = 0x9753A000                   bhash = 0x00000000                   bpageno = (1:120)

bdbid = 27                           breferences = 0                      bUse1 = 47665

bstat = 0x1c0000b                    blog = 0x9bbb79bb                    bnext = 0x00000000

 

PAGE HEADER:

 

 

Page @0x9753A000

 

m_pageId = (1:120)                   m_headerVersion = 1                  m_type = 1

m_typeFlagBits = 0x4                 m_level = 0                          m_flagBits = 0x8008

m_objId (AllocUnitId.idObj) = 102    m_indexId (AllocUnitId.idInd) = 256 

Metadata: AllocUnitId = 72057594044612608                                

Metadata: PartitionId = 72057594039631872                                 Metadata: IndexId = 0

Metadata: ObjectId = 405576483       m_prevPage = (0:0)                   m_nextPage = (0:0)

pminlen = 8                          m_slotCnt = 1                        m_freeCnt = 8073

m_freeData = 142                     m_reservedCnt = 0                    m_lsn = (25:133:57)

m_xactReserved = 0                   m_xdesId = (0:708)                   m_ghostRecCnt = 0

m_tornBits = 0                      

 

Allocation Status

 

GAM (1:2) = ALLOCATED                SGAM (1:3) = ALLOCATED              

PFS (1:1) = 0x61 MIXED_EXT ALLOCATED  50_PCT_FULL                         DIFF (1:6) = CHANGED

ML (1:7) = NOT MIN_LOGGED           

 

Slot 0 Offset 0x60 Length 21

 

Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP VARIABLE_COLUMNS

 

Memory Dump @0x0858C060

 

00000000:   30000800 01000000 0200fc01 0015004e †0..............N        

00000010:   41544b45 59††††††††††††††††††††††††††ATKEY                   

 

Slot 0 Column 0 Offset 0x4 Length 4

 

id = 1                              

 

Slot 0 Column 1 Offset 0xf Length 6

 

somedata = NATKEY    

As can be seen from IndexId and ObjectId, the page belongs to the ‘parent’ table and is a data page – it has an IndexId of 0 which means its part of a heap. If we did the same for page 80 we would find that page 80 is a data page for the table ‘child’, there is no corresponding index page because the table ‘child’ has no indexes defined.

Let’s summarise this, SPID 77 has an Update Lock on the Index page and corresponding key on that page, it has an Exclusive lock on the data page 120 and Record ID we are updating.

The important thing to take away is that the Update lock will not block a reader (readcommittedlock). Let’s prove this...

Making sure you have rolled back the transactions from the previous example (check by printing @@TRANCOUNT) and then cleared the profiler trace we can do a test..

In Connection A run this...

begin tran

 

select *

from parent with ( updlock )

In Connection B run this...

select *

from parent

The trace will look like this...

Profiler Trace showing behaviour of FK lookup 

You can see that on SPID 77 the Update lock is applied which prevents anybody else from updating this row, SPID 52 freely reads the row. Note – this simulates the behaviour against the index page data and not the data page, obviously if you update the data on the data page then the reader will block which is what we would see if using UPDATE instead of the SELECT ... WITH ( UPDLOCK ). As an aside to make SPID 52 block we would of used UPDLOCKX which is what happens when you use UPDATE, as another aside - ignore the hint READCOMMITTEDLOCK which is shown above, that lock is the default behaviour in SQL Server anyway.

Lost me yet? Done with the heap, so let’s look at what happens with a table with a clustered index instead.

Our next script deals with the clustered index, having rolled back the two transactions in the test query windows, run this...

drop table child

drop table parent

go

 

create table parent (

    id  int not null constraint sk_parent unique clustered,

 

    somedata varchar(20) not null

)

go

 

create table child (

 

    parent_id int not null references parent( id )

 

)

go

 

--  Make a row exist in the Parent...

INSERT parent ( id, somedata ) values( 1, 'NATKEY' )

In Connection A run this...

BEGIN TRAN

 

UPDATE parent

    SET somedata = '12345'

where id = 1

In Connection B run this...

BEGIN TRAN

 

INSERT child ( parent_id ) VALUES( 1 )

Connection B now blocks, but why? You are only INSERT’ing into the table ‘child’! The foreign key constraint is checking that the parent_id you are inserting into ‘child’ actually exists in the table ‘parent’. Not exactly obvious what is blocking unless you dig around (through profiler, IO STATS and the query plan).

Profiler Trace and FK behaviour for Clustered Index

Above is the new profiler trace, notice we have only 2 pages (109 and 120) in our locking condition.

Use DBCC PAGE again to determine what pages belong to what, you will see that (in my example) page 109 belongs to the table ‘parent’ and has an IndexId of 1, 120 has an IndexId of 0 and belongs to the table ‘child’.

We are actually blocking because of the exclusive row lock (7 – KEY) on page 109 the parent table – why?

The difference between a table with a clustered index on it and one without (a heap) is that the data row is part of the clustered index, the data row is at the bottom (leaf) level of the index hence the reason we only ‘use’ or ‘lock’ the single page, page 109 in the clustered index example above, whereas in a heap the data row and index are separate hence we have two pages – page 41 and 120.

The reason the heap does not block is because the data on page 41 (the index page) is not being updated (another good reason to use surrogate keys – but that’s another blog entry) so SQL Server can just use the Intent Update lock to state you can read it but don’t you dare update it (or delete).

That’s the behaviour, and it’s the default behaviour within SQL Server; I always recommend using a clustered index because of benefits elsewhere, so – how do we get round this blocking behaviour?

Couple of methods, the first is to disable (not remove) your foreign key constraints, this means that the lookup query is not performed, however, it does open your database to missing data; however – should that ever happen? If you thoroughly QA your application with the constraints on then any possible breach of foreign key will have been trapped and the argument could be successfully put to disable the constraints, well – on a closed system i.e. one that accepts no data imports (I suppose you could do checks there, also, there is no reason not to reenable constraints when loading data – it’s just the hit you take while SQL Server checks existing data, which can be got round).

I’m surprised setting the database options READ_COMMITTED_SNAPSHOT nor ALLOW_SNAPSHOT_ISOLATION do not help. I must admit, I did expect one of these options to ‘fix’ the behaviour and give the last good committed value – but no - the INSERT still blocks.

Ok, more on the READ_COMMITTED_SNAPSHOT and ALLOW_SNAPSHOT_ISOLATION that Erland Sommarskog just pointed out - these database options allow you to read old data, for instance READ_COMMITTED_SNAPSHOT would give the last committed value; for referential integrity that is no good - it could allow rows in that the other connection has just deleted! In the case of the Heap, if you Connection B had done a DELETE or UPDATE'd the referenced column then the Heap would block as well.

So, I guess the only way round this is to disable the constraints - an option if your system is suffering from blocking or excessive reads caused by FK lookups.

Also, check out Hugo Kornelis blog entry on the FK and SNAPSHOT ISOLATION -> http://sqlblog.com/blogs/hugo_kornelis/archive/2006/07/21/Snapshot-and-integrity-part-1.aspx

 

Filed under:

Comments

# re: Blocking from Foreign Key Constraint look ups - Clustered (blocks) v Heap (doesn't block)

30 March 2007 22:23 by Hugo Kornelis

Disabling constraints to reduce blocking can indeed be a good idea - but make sure to include the WITH CHECK option when re-enabling the constraint.

See http://sqlblog.com/blogs/hugo_kornelis/archive/2007/03/29/can-you-trust-your-constraints.aspx

# re: Blocking from Foreign Key Constraint look ups - Clustered (blocks) v Heap (doesn't block)

02 April 2007 14:57 by DamianMulvena

Thanks for explaining this Tony. A useful account of what goes on for this type of interaction.

Not sure I agree with the principle of dropping a constraint when it causes a performance problem. Would you make the same suggestion if it was a primary key constraint giving the problem? I've often found that wherever there is a "theoretical constraint" such as you suggest, that there is an integrity violation to accompany it.

From the awareness of the problem you have provided, a couple of other approaches could be taken (if this locking is a problem): Ensuring (through application design principles) that the application did not hold parent records in long transactions would be a useful response. Perhaps too you could vertically partition the parent, keeping stable identification data separate from volatile attributes of the parent.

# Hilfe: Sporadische Performance Probleme + Expertensuche | hilpers

Pingback from  Hilfe: Sporadische Performance Probleme + Expertensuche | hilpers

# Locking problem Master/Detail table | keyongtech

Pingback from  Locking problem Master/Detail table | keyongtech