17 May 2008 15:38
tonyrogerson
Non Recursive Common Table Expressions - Performance Sucks 1 - CTE Self-Join / CTE Sub Query inline expansion
Unless you are using recursion then the Common Table Expression sucks and you shouldn’t use it; the big problem is that the SQL in the anchor is repeated rather than spooled into a work table.
In the last article I went through the basics of the Common Table Expression and CTE Recursion and hinted that there may be performance problems, in this article I delve a bit deeper – specifically we see how self-join’s on a non-recursive CTE are very bad for performance.
Looking at our example query below we are using the Common Table Expression with a self-join and additionally a sub-query on the select; don’t look too much at what I’m doing in the query because it pretty much doesn’t make any sense – the important aspects to concentrate on are the self-join and sub-query.
Example CTE:
with ctePerf ( objname, colid, colname, max_length )
as (
select objname = o.name,
colid = c.column_id,
colname = c.name,
c.max_length
from #sysobjs o
inner join #syscols c on c.object_id=o.object_id
where o.type='U'
and o.name= 'accounts'
)
select p.objname, p.colname, p.colid,
size_sofar = SUM( p2.max_length ),
size_left = ( SELECT SUM( p3.max_length )
FROM ctePerf p3
WHERE p3.objname=p.objname
and p3.colid > p.colid
)
from ctePerf p
left outer join ctePerf p2 on p2.objname=p.objname
and p2.colid <=p.colid
group by p.objname, p.colname, p.colid
order by objname, colid
Let’s look at the query plan for the query we use as the CTE anchor:
CTE “anchor” query and plan:
select objname = o.name,
colid = c.column_id,
colname = c.name,
c.max_length
from #sysobjs o
inner join #syscols c on c.object_id=o.object_id
where o.type='U'
and o.name= 'accounts'
|--Hash Match(Inner Join, HASH:([o ].[object_id])=([c ].[object_id]))
|--Table Scan(OBJECT:([tempdb].[dbo].[#sysobjs] AS [o ]), WHERE:([tempdb].[dbo].[#sysobjs].[type] as [o ].[type]='U' AND [tempdb].[dbo].[#sysobjs].[name] as [o ].[name]=N'accounts'))
|--Table Scan(OBJECT:([tempdb].[dbo].[#syscols] AS [c ]))
Check the query plan after incorporating the query into the CTE (below), before looking at the query plan note we are referencing the CTE anchor 3 times (shown in bold – aliases p, p2 and p3):
Query and Execution Plan:
with ctePerf ( objname, colid, colname, max_length )
as (
select objname = o.name,
colid = c.column_id,
colname = c.name,
c.max_length
from #sysobjs o
inner join #syscols c on c.object_id=o.object_id
where o.type='U'
and o.name= 'accounts'
)
select p.objname, p.colname, p.colid,
size_sofar = SUM( p2.max_length ),
size_left = ( SELECT SUM( p3.max_length )
FROM ctePerf p3
WHERE p3.objname=p.objname
and p3.colid > p.colid
)
from ctePerf p
left outer join ctePerf p2 on p2.objname=p.objname
and p2.colid <=p.colid
group by p.objname, p.colname, p.colid
order by objname, colid
|--Compute Scalar(DEFINE:([Expr1022]=[Expr1022]))
|--Nested Loops(Inner Join, OUTER REFERENCES:([o ].[name], [c ].[column_id]))
|--Compute Scalar(DEFINE:([Expr1015]=CASE WHEN [Expr1035]=(0) THEN NULL ELSE [Expr1036] END))
| |--Stream Aggregate(GROUP BY:([c ].[column_id], [c ].[name]) DEFINE:([Expr1035]=COUNT_BIG([tempdb].[dbo].[#syscols].[max_length] as [c ].[max_length]), [Expr1036]=SUM([tempdb].[dbo].[#syscols].[max_length] as [c ].[max_length]), [o ].[name]=ANY
| |--Nested Loops(Left Outer Join, WHERE:([tempdb].[dbo].[#syscols].[column_id] as [c ].[column_id]<=[tempdb].[dbo].[#syscols].[column_id] as [c ].[column_id]))
| |--Sort(ORDER BY:([c ].[column_id] ASC, [c ].[name] ASC))
| | |--Hash Match(Inner Join, HASH:([o ].[object_id])=([c ].[object_id]))
| | |--Table Scan(OBJECT:([tempdb].[dbo].[#sysobjs] AS [o ]), WHERE:([tempdb].[dbo].[#sysobjs].[type] as [o ].[type]='U' AND [tempdb].[dbo].[#sysobjs].[name] as [o ].[name]=N'accounts'))
| | |--Table Scan(OBJECT:([tempdb].[dbo].[#syscols] AS [c ]))
| |--Table Spool
| |--Hash Match(Inner Join, HASH:([o ].[object_id])=([c ].[object_id]))
| |--Table Scan(OBJECT:([tempdb].[dbo].[#sysobjs] AS [o ]), WHERE:([tempdb].[dbo].[#sysobjs].[type] as [o ].[type]='U' AND [tempdb].[dbo].[#sysobjs].[name] as [o ].[name]=N'accounts'))
| |--Compute Scalar(DEFINE:([c ].[max_length]=[tempdb].[dbo].[#syscols].[max_length] as [c ].[max_length]))
| |--Table Scan(OBJECT:([tempdb].[dbo].[#syscols] AS [c ]))
|--Index Spool(SEEK:([o ].[name]=[tempdb].[dbo].[#sysobjs].[name] as [o ].[name] AND [c ].[column_id]=[tempdb].[dbo].[#syscols].[column_id] as [c ].[column_id]))
|--Compute Scalar(DEFINE:([Expr1022]=CASE WHEN [Expr1037]=(0) THEN NULL ELSE [Expr1038] END))
|--Stream Aggregate(DEFINE:([Expr1037]=Count(*), [Expr1038]=SUM([tempdb].[dbo].[#syscols].[max_length] as [c ].[max_length])))
|--Hash Match(Inner Join, HASH:([o ].[object_id])=([c ].[object_id]))
|--Table Scan(OBJECT:([tempdb].[dbo].[#sysobjs] AS [o ]), WHERE:([tempdb].[dbo].[#sysobjs].[name] as [o ].[name]=[tempdb].[dbo].[#sysobjs].[name] as [o ].[name] AND [tempdb].[dbo].[#sysobjs].[type] as [o ].[type]='U' AND [tempdb
|--Table Scan(OBJECT:([tempdb].[dbo].[#syscols] AS [c ]), WHERE:([tempdb].[dbo].[#syscols].[column_id] as [c ].[column_id]>[tempdb].[dbo].[#syscols].[column_id] as [c ].[column_id]))
If you compare the above query plan to the first, the specific plan for just the anchor you can see the same ‘anchor’ query has pretty much been expanded inline into three parts of the query plan – take a look back at the query and note there are 3 references to the anchor!
The output below is a subset of the information available when you use SET STATISTICS PROFILE ON, I’ve not bothered posting the full STATS PROFILE instead I’ve truncated the output for readability – I’m more concerned with the Rows and Executes columns rather than the query tree.
Table 'Worktable'. Scan count 11, logical reads 56, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#syscols'. Scan count 7, logical reads 35, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#sysobjs'. Scan count 7, logical reads 21, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Rows Executions
5 1 with ctePerf ( objname, colid, olid
0 0 |--Compute Scalar(DEFINE:([Expr1022]=[Expr1022]))
5 1 |--Nested Loops(Inner Join, OUTER REFERENCES:([o ].[name], [c ].[column_id]))
0 0 |--Compute Scalar(DEFINE:([Expr1015]=CASE WHEN [Expr1035]=(0) THEN NUL
5 1 | |--Stream Aggregate(GROUP BY:([c ].[column_id], [c ].[name]) DEFINE
15 1 | |--Nested Loops(Left Outer Join, WHERE:([tempdb].[dbo].[#sys
5 1 | |--Sort(ORDER BY:([c ].[column_id] ASC, [c ].[name] ASC))
5 1 | | |--Hash Match(Inner Join, HASH:([o ].[object_id])=(
1 1 | | |--Table Scan(OBJECT:([tempdb].[dbo].[#sysobjs] AS [o ])
453 1 | | |--Table Scan(OBJECT:([tempdb].[dbo].[#syscols] AS [c ]))
25 5 | |--Table Spool
5 1 | |--Hash Match(Inner Join, HASH:([o ].[object_id])=([c ].[object_id]))
1 1 | |--Table Scan(OBJECT:([tempdb].[dbo].[#sysobjs] AS [o ]),
0 0 | |--Compute Scalar(DEFINE:([c ].[max_length]=[tempdb].[dbo].
453 1 | |--Table Scan(OBJECT:([tempdb].[dbo].[#syscols] AS [c ]))
5 5 |--Index Spool(SEEK:([o ].[name]=[tempdb].[dbo].[#sysobjs].[name] as [o ].[name] AND
0 0 |--Compute Scalar(DEFINE:([Expr1022]=CASE WHEN [Expr1037]=(0) THEN NULL ELSE [Expr1038] END))
5 5 |--Stream Aggregate(DEFINE:([Expr1037]=Count(*), [Expr1038]=SUM([tempdb].[dbo].[#syscols]
10 5 |--Hash Match(Inner Join, HASH:([o ].[object_id])=([c ].[object_id]))
5 5 |--Table Scan(OBJECT:([tempdb].[dbo].[#sysobjs] AS [o ]), WHERE:([tempdb].[dbo].[#sysobjs]
1496 5 |--Table Scan(OBJECT:([tempdb].[dbo].[#syscols] AS [c ]), WHERE:([tempdb].[dbo].[#syscols]
Notice the number of rows returned for each of the highlighted steps – it’s running the complete query every time; the 1496 is higher because of what we are doing in the sub-query.
Let’s use a temporary table for the anchor instead; first pump the query into a temporary table then use that in the CTE...
select objname = o.name,
colid = c.column_id,
colname = c.name,
c.max_length
into #anchor
from #sysobjs o
inner join #syscols c on c.object_id=o.object_id
where o.type='U'
and o.name= 'accounts'
;with ctePerf ( objname, colid, colname, max_length )
as (
select objname, colid, colname, max_length
from #anchor
)
select p.objname, p.colname, p.colid,
size_sofar = SUM( p2.max_length ),
size_left = ( SELECT SUM( p3.max_length )
FROM ctePerf p3
WHERE p3.objname=p.objname
and p3.colid > p.colid
)
from ctePerf p
left outer join ctePerf p2 on p2.objname=p.objname
and p2.colid <=p.colid
group by p.objname, p.colname, p.colid
order by objname, colid
go
drop table #anchor
Execution plans...
select objname = o.name,
colid = c.column_id,
colname = c.name,
c.max_length
into #anchor
from #sysobjs o
inner join #syscols c on c.object_id=o.object_id
where o.type='U'
and o.name= 'accounts'
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#syscols'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#sysobjs'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Rows Executions
5 1 select objname = o.name, colid = c.column_id, colname = c.name,
5 1 |--Table Insert(OBJECT:([#anchor]), SET:([#anchor].[objname] = [tempdb].[d
5 1 |--Top(ROWCOUNT est 0)
5 1 |--Hash Match(Inner Join, HASH:([o ].[object_id])=([c ].[object_id]))
1 1 |--Table Scan(OBJECT:([tempdb].[dbo].[#sysobjs] AS [o ]), WHERE:([tempdb].
453 1 |--Table Scan(OBJECT:([tempdb].[dbo].[#syscols] AS [c ]))
;with ctePerf ( objname, colid, colname, max_length )
as (
select objname, colid, colname, max_length
from #anchor
)
select p.objname, p.colname, p.colid,
size_sofar = SUM( p2.max_length ),
size_left = ( SELECT SUM( p3.max_length )
FROM ctePerf p3
WHERE p3.objname=p.objname
and p3.colid > p.colid
)
from ctePerf p
left outer join ctePerf p2 on p2.objname=p.objname
and p2.colid <=p.colid
group by p.objname, p.colname, p.colid
order by objname, colid
Table '#anchor'. Scan count 7, logical reads 11, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Rows Executions
5 1 with ctePerf ( objname, colid, colname, max_length
0 0 |--Compute Scalar(DEFINE:([Expr1016]=[Expr1016]))
5 1 |--Nested Loops(Inner Join, OUTER REFERENCES:([tempdb].[dbo].[#a
0 0 |--Compute Scalar(DEFINE:([Expr1011]=CASE WHEN [Expr1029]=(
5 1 | |--Stream Aggregate(GROUP BY:([tempdb].[dbo].[#anchor]
15 1 | |--Nested Loops(Left Outer Join, OUTER REFERENCES
5 1 | |--Sort(ORDER BY:([tempdb].[dbo].[#anchor].[
5 1 | | |--Table Scan(OBJECT:([tempdb].[dbo].[#
0 0 | |--Compute Scalar(DEFINE:([tempdb].[dbo].[#a
15 5 | |--Table Scan(OBJECT:([tempdb].[dbo].[#
0 0 |--Compute Scalar(DEFINE:([Expr1016]=CASE WHEN [Expr1031]=(
5 5 |--Stream Aggregate(DEFINE:([Expr1031]=Count(*), [Expr
10 5 |--Table Scan(OBJECT:([tempdb].[dbo].[#anchor]),
Comparing the results from the above you can clearly see that the IO is a lot less (19 logical reads using the # table against 112 just using the CTE) – that is also shown by the Rows column in the STATISTICS PROFILE output.
But looking at an example on more realistic volumes you can clearly see the difference. The example below is on a transaction table and gives running totals for a subset of accounts.
with Trans ( row_no, id, salesperson_id, tran_date, clear_date, amount, transaction_types_id, account_id )
as (
select row_no = id, id, salesperson_id, tran_date, clear_date, amount, transaction_types_id, account_id
from Transactions
where account_id between 1000 and 1020
)
select t1.*,
running_amount = coalesce(
( select sum( amount )