Unless you are using recursion then the Common Table Expression sucks and you shouldn’t use it; the big problem is that the SQL in the anchor is repeated rather than spooled into a work table.
In the last article I went through the basics of the Common Table Expression and CTE Recursion and hinted that there may be performance problems, in this article I delve a bit deeper – specifically we see how self-join’s on a non-recursive CTE are very bad for performance.
Looking at our example query below we are using the Common Table Expression with a self-join and additionally a sub-query on the select; don’t look too much at what I’m doing in the query because it pretty much doesn’t make any sense – the important aspects to concentrate on are the self-join and sub-query.
Example CTE:
with ctePerf ( objname, colid, colname, max_length )
as (
select objname = o.name,
colid = c.column_id,
colname = c.name,
c.max_length
from #sysobjs o
inner join #syscols c on c.object_id=o.object_id
where o.type='U'
and o.name= 'accounts'
)
select p.objname, p.colname, p.colid,
size_sofar = SUM( p2.max_length ),
size_left = ( SELECT SUM( p3.max_length )
FROM ctePerf p3
WHERE p3.objname=p.objname
and p3.colid > p.colid
)
from ctePerf p
left outer join ctePerf p2 on p2.objname=p.objname
and p2.colid <=p.colid
group by p.objname, p.colname, p.colid
order by objname, colid
Let’s look at the query plan for the query we use as the CTE anchor:
CTE “anchor” query and plan:
select objname = o.name,
colid = c.column_id,
colname = c.name,
c.max_length
from #sysobjs o
inner join #syscols c on c.object_id=o.object_id
where o.type='U'
and o.name= 'accounts'
|--Hash Match(Inner Join, HASH:([o ].[object_id])=([c ].[object_id]))
|--Table Scan(OBJECT:([tempdb].[dbo].[#sysobjs] AS [o ]), WHERE:([tempdb].[dbo].[#sysobjs].[type] as [o ].[type]='U' AND [tempdb].[dbo].[#sysobjs].[name] as [o ].[name]=N'accounts'))
|--Table Scan(OBJECT:([tempdb].[dbo].[#syscols] AS [c ]))
Check the query plan after incorporating the query into the CTE (below), before looking at the query plan note we are referencing the CTE anchor 3 times (shown in bold – aliases p, p2 and p3):
Query and Execution Plan:
with ctePerf ( objname, colid, colname, max_length )
as (
select objname = o.name,
colid = c.column_id,
colname = c.name,
c.max_length
from #sysobjs o
inner join #syscols c on c.object_id=o.object_id
where o.type='U'
and o.name= 'accounts'
)
select p.objname, p.colname, p.colid,
size_sofar = SUM( p2.max_length ),
size_left = ( SELECT SUM( p3.max_length )
FROM ctePerf p3
WHERE p3.objname=p.objname
and p3.colid > p.colid
)
from