10 November 2006 09:12
Timebomb - The Consistency problem with NOLOCK / READ UNCOMMITTED
Using NOLOCK or READ UNCOMMITTED isolation (that's what NOLOCK is) gives us a consistency problem. This entry came out of a conversation I had with Itzik Ben-Gan when he was over a couple of weeks ago; essentially, when you are using NOLOCK you can double-read rows or miss them completely.
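To spell out that equivalence, the two queries below read the same dirty data; a minimal sketch where SomeTable is just a placeholder name:

-- hint form: READ UNCOMMITTED for this one table reference only
select count(*)
from SomeTable WITH ( NOLOCK )

-- isolation level form: applies to every statement on the connection
set transaction isolation level read uncommitted

select count(*)
from SomeTable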
The set up script:-
create table TestConsistency (
    id int not null identity,
    filler char(8000) not null default( 'tony rogerson' ),
    somerandomness uniqueidentifier not null default( newid() )
)

set nocount on

declare @i int
set @i = 1

-- 100,000 rows, every column taking its default
while @i <= 100000
begin
    insert TestConsistency default values
    set @i = @i + 1
end

-- cluster on the random GUID so key updates move rows around the table
create unique clustered index clidx on TestConsistency( somerandomness )
Now that we have our table we can do some testing. Note that I've made the example extreme just to highlight that the behaviour is there; reproducing this type of thing on a real system is difficult, but I want to make you aware that it can happen.
Run the monitoring script below in Connection A. It repeatedly checks the reported row count and prints a message whenever there is a difference; we aren't inserting or deleting rows, so logically it should always report 100,000 rows in the table.
declare @rows_in_table int
declare @rows_now int
declare @shows tinyint
set @shows = 0

-- baseline count taken under normal locking
set @rows_in_table = ( select count(*)
                       from TestConsistency )

while 1 = 1
begin
    waitfor delay '00:00:00.300'

    -- dirty read, forced down the clustered index
    set @rows_now = ( select count(*)
                      from TestConsistency WITH ( NOLOCK, INDEX = clidx ) )

    if @rows_in_table <> @rows_now
    begin
        print 'COUNT(*) produced ' + cast( @rows_now as varchar(10) )
            + ' which is a difference of ' + cast( @rows_now - @rows_in_table as varchar(10) )
        set @shows = @shows + 1

        -- stop once ten discrepancies have been reported
        if @shows >= 10
            break
    end
end
In a separate connection we need to simulate update activity. This particular workload simulates the situation where you are updating columns that cause the row to move or cause a page split, hence I've used NEWID(), which is extreme. When a row moves to a point in the index ahead of the scan it gets read a second time; when it moves behind the scan position it is missed altogether.
-- keep moving rows by continually rewriting the clustering key
while 1 = 1
    update TestConsistency
        set somerandomness = newid()
Wait a while and you will get some results; on my box here I got the following output:-
COUNT(*) produced 99996 which is a difference of -4
COUNT(*) produced 99984 which is a difference of -16
COUNT(*) produced 99990 which is a difference of -10
COUNT(*) produced 99987 which is a difference of -13
COUNT(*) produced 99995 which is a difference of -5
COUNT(*) produced 99989 which is a difference of -11
COUNT(*) produced 100012 which is a difference of 12
COUNT(*) produced 99981 which is a difference of -19
COUNT(*) produced 99982 which is a difference of -18
COUNT(*) produced 99980 which is a difference of -20
Oh dear, look at that! We are missing rows or sometimes counting them more than once. Imagine if I was aggregating a trade value or something; rows could be counted twice, three times or any number of times over, or skipped entirely. Not good at all.
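The same mechanism skews any aggregate, not just COUNT(*). As a sketch using the demo table above (the cast to bigint is only there because the sum of ids 1 to 100,000 overflows int):

-- a double-read row inflates the total, a skipped row deflates it,
-- exactly as with the COUNT(*) above
select sum( cast( id as bigint ) ) as total
from TestConsistency WITH ( NOLOCK, INDEX = clidx )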
When you use NOLOCK or READ UNCOMMITTED you are telling SQL Server you are not interested in consistency and just want the data. I know most people use NOLOCK to get round the writer-blocking-reader problem; hell, I've told people to use it myself in the past, and most of the time you will be alright, but that behaviour is there waiting to cause you a problem.
In SQL Server 2005 we have READ_COMMITTED_SNAPSHOT, which gives you the last committed value, so you get a consistent result from your query (no double or missing counts) and writers don't block readers.
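For example (a sketch; MyDatabase is a placeholder name, and the ALTER needs the database to itself, so run it when no other connections are active):

-- switch the database to row versioning for READ COMMITTED reads
alter database MyDatabase
    set read_committed_snapshot on

-- ordinary READ COMMITTED queries now read the last committed version
-- rather than blocking behind writers; no NOLOCK hint needed
select count(*)
from TestConsistency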
In SQL Server 2000, well, there are no other options except tuning the queries and making sure you aren't reading lots of pages to get the query result.
Further reading: http://www.sqlmag.com/Article/ArticleID/92888/sql_server_92888.html
Filed under: SQL Server
Note (17th Nov 2006): I've done a more comprehensive entry on the problems with READ COMMITTED isolation, which can also give missing or double-read rows - http://sqlblogcasts.com/blogs/tonyrogerson/archive/2006/11/16/1345.aspx