This blog will no longer be updated.

New content is available on my new blog

January 2010 - Posts - Piotr Rodak

January 2010 - Posts

relative query cost not accurate when using set rowcount

When tuning performance of a query, it is quite common to compare different variants of the query in the same batch and compare the execution plans to see if changes that you made actually work for better or not. Often you would compare just relative cost of execution plan of a query within the batch. I lived for a long time with innocent assumption that the relative cost is accurate and reliable. As Gail Shaw showed in her blog, this assumption doesn't hold if you use scalar functions within a query. I just found out that this is true also for certain type of queries that you would use in specific scenarios.


Let's assume that you have a huge table that is heavily used by OLTP system. Yet you have to delete significant amount of data from it with as small impact for the user as possible. If you just execute delete from table with some key value, it may happen that the amount of affected rows will trigger lock escalation to table level. This will obviously prevent users from inserting or even reading data from the table.
The remedy for such scenarion is to delete rows in batches.
Let's see how we can do it. Here is the table we are going to use.

create table t1(a int identity(1, 1) primary key clustered, b char(10) default('a'))


I will populate it with 100 000 rows:

set rowcount 100000

insert t1 (b)

select 'a'

from master.sys.objects a, master.sys.objects b, master.sys.objects c

set rowcount 0

select count(*) [before batch 1] from t1



Fist approach to delete rows in batches from this table can be as follows:

--batch 1

declare @batchsize int

set @batchsize = 100

set rowcount @batchsize

while 1=1

begin

delete t1 where a < 50000

if @@rowcount = 0

break;

end

set rowcount 0



As you see, we can modify the size of the batch here to adjust it to the size of the table and required degree of concurrency.
The other option is to use rowcount setting, similar to the approach we used to populate the table:

declare @batchsize int

set @batchsize = 100

while @batchsize <> 0

begin

delete top (@batchsize) from t1 where a < 50000

set @batchsize = @@rowcount

end



The query and the loop looks a bit simpler. Which approach is better?  

To be honest both techniques give similar results, at least for the sample data. I didn't have time to run tests for bigger amounts of data, but if you find this interesting, I attach the script so you can check it out for yourself.

More interesting are the execution plans. I removed while loops from above queries and executed them with "Include Actual Execution Plan" option on.

declare @batchsize int

set @batchsize = 100


--query 1

set rowcount @batchsize

delete t1 where a < 50000

set rowcount 0

--query 2

delete top (@batchsize) from t1 where a < 50000


The set rowcount query shows plainly delete from the clustered index. The select top() query shows plan that is more complicated, involving clustered index seek and Top operator. Yet SQL Server shows that the plan with select top() is much more efficient!


 

Why? I checked IO statistics and both queries show the same number of reads:

Table 't1'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 't1'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

If you look though at the estimated and actual numbers of rows that optimizer processed, you will see immediate difference

 

 

 


The optimizer doesn't know that set rowcount @batchsize is in effect and estimates that all rows will have to be processed. The select top() query is parsed by optimizer and it knows that we want only 100 rows, so it estimates that the cost of IO and memory will be much less than in case of set rowcount query. Yet actual numbers of rows are the same, which may explain why my tests show no particular difference between both approaches.

This leads me to conclusion that the relative difference between both plans, as you see on the first picture, is mostly caused by inaccurate estimation of amount of data the engine will have to read and modify. And if this is the case, I think this is worth to keep in mind that it may be that stale statistics may affect your plan comparisons too. You can recognize stale statistics amongst the others by significant differences between estimated and actual numbers of rows in execution plans.

I attach the scripts that I used for tests here: testbatchdelete.zip

like in the old times

My little brother got recently an old 50mm Pentax-M 1:2 lens. While this lens is not known for the best image quality amongst Pentax prime 50mm lenses, it gave me first opportunity to take pictures 'the old way' on my K10D. This is quite nostalgic feeling to have to focus and set aperture manually like in the old '80s with famous soviet camera Zenith. My brother has also some old lenses with M42 mount. I bought adapter allowing me to connect them to my camera and indeed, it works. I find though that 50mm is the most interesting.

This is one of pictures i took with it. It is quite hard to get focus properly, at least for me, but when I succeed, the satisfaction is much more rewarding than when I use 'intelligent' lens.

By the way, winter in Poland looks beautiful, provided you have heating (it can be -20 C here or less) :).

 

Posted by Piotr Rodak | with no comments
Filed under:

updating column with a random value

I have been working recently on load testing of our ETL. One of the factors that determines amount of transfered data is a lookup table that is joined to the source. I populated this table with aproppriate values, and then realized that I also need some values in second column, and they were null. Let's say the table looks like that:

create table t101

(

a int,

b char(10)

)

go

insert t101(a, b)

select 1, ''

union all select 2, ''

union all select 3, ''

union all select 4, ''

union all select 5, ''

union all select 6, ''

union all select 7, ''

union all select 8, ''

go

Luckily I had another table that contained the values I needed. The problem was only, there were much less rows in the second table than in the first one. There is also no relation between both tables.

 

create table t102

(

someval char(10)

)

 

go

insert t102(someval)

select 'val 01'

union all select 'val 02'

union all select 'val 03'

union all select 'val 04'

union all select 'val 05'

union all select 'val 06'

union all select 'val 07'

union all select 'val 08'

union all select 'val 09'

go

Happily I wrote a simple update:

update t101 set b = (select top 1 someval from t102 order by newid())

I ran the query and then selected from the t101:

select * from t101

And the results are..

a    b
1    val 06   
2    val 06   
3    val 06   
4    val 06   
5    val 06   
6    val 06   
7    val 06   
8    val 06     

I was surprised, to say the least. Since I had something other to do, I left this as it was, letting my 'background threads' do the thinking. Of course, I figured out what was going on when I was walking home, wading in snow slush and struggling with wind.

The problem with above subquery is that it is not correlated. This means, its execution doesn't depend on the row the query produces output for. So the optimizer chooses to execute the query only once and apply the result to every row of the outer query. It is a random result, but only one for the whole set.

Ok, so what to do then? No, don't use cursor.

declare c cursor for select a from t101
declare @a int
open c
fetch next from c into @a
while @@fetch_status = 0
begin
        update t101 set b = (select top 1 someval from t102 order by newid())
        where a = @a
        fetch next from c into @a
end
close c
deallocate c

Seriously, don't use cursor.

You have to find a way to correlate the query even if there is no relationship between the data. I thought about two ways, but I am sure there may be more.

First option is to return value which would be a result of concatenation of value from table 2 and value from current row of table 1.:

update t101 set b = (select top 1 someval + case when b is null then '' else '' end from t102 order by newid())

As you see, I don't add effectively anything to someval. This query works, but is kind of too elaborate and complicated to write. So there is another option for you - add a filter that will always return true.

update t101 set b = (select top 1 someval from t102 where a = a order by newid())

 

As you see, as long as a value is not null, the subquery will return a result. And it will be a new result for each processed row.

I attach the code so you can play with this idea a bit more, maybe you will find simpler ways of generating random data than these above - if you do, please let me know.

 Testing update random value.zip

 

 

 

 

updlock, holdlock and deadlocks

We have a process that manages sliding window over a number of tables in our reporting database. This process is triggered by first call to our reporting ETL. There is table that contains two dates that are important for the partitioning and business requirements. The table is build as the following example:

if object_id('dbo.tDateTest') is not null

drop table dbo.tDateTest

go

create table dbo.tDateTest

(

FromDate datetime not null,

ToDate datetime not null

)


The table contains only one row:

insert dbo.tDateTest(FromDate, ToDate) values('20090102', '20090101')


 
It is important that the code is called only once and other processes do not overlap, as this would lead to incorrect partition range and missing data in the interface views.

Just a few days we had a production issue, and then another one, next day - deadlocks. That surprised me to say the least, because I thought I had secured myself against this problem. The code looked like this:

declare @newFrom datetime, @newTo datetime, @oldFrom datetime

set @newFrom = '20090115'

set @newTo = '20090114'


begin tran

--######### original approach

select @oldFrom = FromDate from dbo.tDateTest with(updlock, holdlock)


while @oldFrom < @newFrom

begin

set @oldFrom = dateadd(day, 1, @oldFrom)

update dbo.tDateTest set FromDate = @newFrom, ToDate = @newTo

--some other processing here..

end

commit

select * from dbo.tDateTest


Query 1, Original approach.

As you see I added the updlock and holdlock hints to the select statement. This technique effectively changes isolation level of the transaction to serializable, by acquiring update lock at the time select statement is executed and holding it to the end of the transaction.
Well, it turned out that when multiple processes execute the above code, there may be a situation when a deadlock occurs.
At the beginning, I had trouble to actually reproduce the behavior. Fiddling with waitfor statement and multiple windows proved to be rather cumbersome. Then I remembered that Adam Machanic's little utility SQLQueryStress can do exactly what I want - call some code many times from different threads. And indeed, when I set up the utility and pasted above code (picture below), I encountered deadlocks exactly as in the production.



I turned on trace flag 1222 which writes detailed deadlock information to the error log to see what actually is going on.
dbcc traceon(1222, -1)
The -1 parameter indicates that you want to set the status flag globally for all connections to the server. If it is not provided, the flag applies only to the current connection.

After I had enabled the flag, I pasted query 1 to the SQLQueryStress util. I configured the tool to run the code 40 times in two threads. During the run errors were indicated and when I clicked the ellipsis (...) button I could verify the errors were indeed deadlocks. So I opened the SQL Server error log and got the following output from it:

deadlock-list
deadlock victim=process1cdf68c58
process-list
process id=process79a72e8 taskpriority=0 logused=0 waitresource=OBJECT: 15:1622453004:0  waittime=1812 ownerId=4097956687 transactionname=user_transaction lasttranstarted=2009-12-21T13:56:52.703 XDES=0x8031d0c0 lockMode=X schedulerid=1 kpid=1856 status=suspended spid=117 sbid=0 ecid=0 priority=0 transcount=1 lastbatchstarted=2009-12-21T13:56:52.703 lastbatchcompleted=2009-12-21T13:56:52.563 clientapp=.Net SqlClient Data Provider hostname=MYHOST hostpid=16244 loginname=testlogin isolationlevel=read committed (2) xactid=4097956687 currentdb=15 lockTimeout=4294967295 clientoption1=671096864 clientoption2=128056
executionStack
frame procname=adhoc line=7 stmtstart=336 stmtend=480 sqlhandle=0x020000000144d3314c9019b950c94aa9630febde7d049222
select @oldFrom = FromDate from dbo.tDateTest with(updlock<c/> holdlock)
inputbuf
declare @newFrom datetime<c/> @newTo datetime<c/> @oldFrom datetime
set @newFrom = '20090115'
set @newTo = '20090114'
begin tran
--#########        original approach
select @oldFrom = FromDate from dbo.tDateTest with(updlock<c/> holdlock)
while @oldFrom < @newFrom
begin
set @oldFrom = dateadd(day<c/> 1<c/> @oldFrom)
update dbo.tDateTest set FromDate = @newFrom<c/> ToDate = @newTo
#NAME?
end
commit
select * from dbo.tDateTest
process id=process1cdf68c58 taskpriority=0 logused=0 waitresource=OBJECT: 15:1622453004:0  waittime=1828 ownerId=4097956685 transactionname=user_transaction lasttranstarted=2009-12-21T13:56:52.703 XDES=0x35527f6d0 lockMode=X schedulerid=3 kpid=3324 status=suspended spid=118 sbid=0 ecid=0 priority=0 transcount=1 lastbatchstarted=2009-12-21T13:56:52.703 lastbatchcompleted=2009-12-21T13:56:52.533 clientapp=.Net SqlClient Data Provider hostname=MYHOST hostpid=16244 loginname=testlogin isolationlevel=read committed (2) xactid=4097956685 currentdb=15 lockTimeout=4294967295 clientoption1=671096864 clientoption2=128056
executionStack
frame procname=adhoc line=7 stmtstart=336 stmtend=480 sqlhandle=0x020000000144d3314c9019b950c94aa9630febde7d049222
select @oldFrom = FromDate from dbo.tDateTest with(updlock<c/> holdlock)
inputbuf
declare @newFrom datetime<c/> @newTo datetime<c/> @oldFrom datetime
set @newFrom = '20090115'
set @newTo = '20090114'
begin tran
--#########        original approach
select @oldFrom = FromDate from dbo.tDateTest with(updlock<c/> holdlock)
while @oldFrom < @newFrom
begin
set @oldFrom = dateadd(day<c/> 1<c/> @oldFrom)
update dbo.tDateTest set FromDate = @newFrom<c/> ToDate = @newTo
#NAME?
end
commit
select * from dbo.tDateTest
resource-list
objectlock lockPartition=0 objid=1622453004 subresource=FULL dbid=15 objectname=Database_Perf.dbo.tDateTest id=lockea708480 mode=IX associatedObjectId=1622453004
owner-list
owner id=process79a72e8 mode=IX
owner id=process1cdf68c58 mode=IX
waiter-list
waiter id=process1cdf68c58 mode=X requestType=convert
waiter id=process79a72e8 mode=X requestType=convert



There a couple of facts to be noticed here. As you see in the second line od the output, process process1cdf68c58 has been chosen as victim of the deadlock. The lines beginning with >process id< contain also information about what object was deadlocked. This is the waitresource part: =OBJECT: 15:1622453004:0. The identifier tells you that the deadlocked resource is an object (table), in database with db_id() 15, then there is id of the object within this database and eventually id of the index on the table. The index id 0 indicates that this is heap - either there there is no clustered index defined or it was not chosen to run the query.
By the end of the output in the resource-list section you can also see the name of the table and the types of locks that were in place before the deadlock occured. As you see, both processes (owner id lines) had intent exclusive (IX) lock on the table and both of them decided to convert the lockx to exclusive (X)) and since these locks (IX and X) are not compatible, deadlock occured and one of the processes was killed. There is a MSDN article here with overview of the locks compatibility.

To say the least, I was pretty surprised. I never thought this would occur, although when I think of this, the deadlocking is pretty reasonable. I just made (uneducated) assumption, that when there is only one row in a table, the enginge will get straight to this row and (updlock, holdlock) will work as planned. I was wrong as it turns out.

Anyway, I needed to fix the problem.

1. The first idea was to replace the updlock with tablockx and this resolved the deadlocking issue.
Note that the TABLOCKX hint causes the engine to take an exclusive lock on the whole table. Use it with caution, as if you have really concurrent environment, this hint can affect performance seriously. In my case, this is not an issue, so I am OK with this approach. Here is the query that I used to resolve the problem:

declare @newFrom datetime, @newTo datetime, @oldFrom datetime

set @newFrom = '20090115'

set @newTo = '20090114'


begin tran

--############ tablockx approach

select @oldFrom = FromDate from dbo.tDateTest with(tablockx, holdlock)


while @oldFrom < @newFrom

begin

set @oldFrom = dateadd(day, 1, @oldFrom)

update dbo.tDateTest set FromDate = @newFrom, ToDate = @newTo

--some other processing here..

end

commit

select * from dbo.tDateTest



This would indicate that updlock has different path of execution than tablockx and there is transition state from intent lock to exclusive lock that is prone to deadlock issues if exclusive table lock is not explicitly requested. Please note that while I accepted this approach as the solution, it may not be the best idea if you have heavily accessed table with many rows.


Since I was at it, I decided to try several various approaches and see if they work. Here are some approaches I tried:
2. I thought that maybe deadlocking is caused by query without where clause narrowing the set to one row (yeah, there is one row there anyway, but maybe optimizer chooses different optimization technique?). I changed the select statement to the following:

select @oldFrom = FromDate from dbo.tDateTest with(updlock, holdlock)

where FromDate = '20090102'



Obviously, because there is no index on the table, the select statement produces table scan in execution plan, so there is not a big change to the plan produced by query without where clause. The deadlocks are still there, too.

3. Next idea I had was to create a nonclustered index on the table and use it to select rows from it.

--add nonclustered index

if object_id('dbo.tDateTest') is not null

drop table dbo.tDateTest

go

create table dbo.tDateTest

(

FromDate datetime not null,

ToDate datetime not null

)

go

create index ix1 on dbo.tDateTest(FromDate)

go

insert dbo.tDateTest(FromDate, ToDate) values('20090102', '20090101')

go



Now, this query

select @oldFrom = FromDate from dbo.tDateTest with(updlock, holdlock)



uses index scan because optimizer chooses to use index instead of table to retrieve all required data. What is interesting - there are no deadlocks here. So it looks like index scan is somewhat safer in this case than table scan, in terms of deadlocking.

4. I added new variable to the script and changed the query so it has to use in some way a table.
The query looks now like this:

select @oldFrom = FromDate, @oldTo = ToDate from dbo.tDateTest with(updlock, holdlock)


Because there is no where clause, the optimizer chose to use table scan and bypass the index altogether. This takes us to the square one and the deadlocking issue.

5. I wondered what would happen if I enforced use of the index in the query:

select @oldFrom = FromDate, @oldTo = ToDate from dbo.tDateTest with(updlock, holdlock, index=ix1)



This query produces plan that contains index scan and RID lookup on the table. And there are no deadlocks interestingly.

6. So, next approach is to use index and search for the only row in the table using WHERE clause.

select @oldFrom = FromDate, @oldTo = ToDate from dbo.tDateTest with(updlock, holdlock, index=ix1) where FromDate = '20090102'



Since the optimizer goes directly to the searched row, the locking seems to be less prone to deadlocking.

7. The last change I checked was to modify the table and add surrogate clustered primary key:

--change definition of the table - add clustered key

if object_id('dbo.tDateTest') is not null

drop table dbo.tDateTest

go

create table dbo.tDateTest

(

PK int identity(1, 1) primary key clustered,

FromDate datetime not null,

ToDate datetime not null

)

go

insert dbo.tDateTest(FromDate, ToDate) values('20090102', '20090101')


go



Now, the query was modified to seek for the row with PK = 1,

select @oldFrom = FromDate from dbo.tDateTest with(updlock, holdlock)

where PK = 1



Since this is a Clustered Index seek in execution plan, there is no deadlocking as well.


In summary, this is another example that it is better to have a clustered index than not, even on small single-row tables. It is surprising that table-scan is more susceptible to deadlocks than index scan as idea number 5 showed.
I suppose these findings may be inaccurate if there are more rows in the table or in the index. I haven't tested such scenario yet, but definitely SQLStressQuery utility is coming to my toolbelt.

It is worth to mention, that there is another way of serializing execution of your code. There is a, somewhat not well known procedure sp_getapplock which, together with sp_releaseapplock allows for creating critical section around your code withouth applying locks to the table. I checked this approach as well and obviously it prevents deadlocks from occur. This is pretty interesting option in certain situations in my opinion.


The code I used for testing is available here: testing updlock, holdlock hint.zip, and the SQLStressQuery utility is here, please have your go if you find it interesting and have some time to spend.