17 May 2008 15:38 tonyrogerson

Non Recursive Common Table Expressions - Performance Sucks 1 - CTE Self-Join / CTE Sub Query inline expansion

Unless you are using recursion then the Common Table Expression sucks and you shouldn’t use it; the big problem is that the SQL in the anchor is repeated rather than spooled into a work table.

In the last article I went through the basics of the Common Table Expression and CTE Recursion and hinted that there may be performance problems, in this article I delve a bit deeper – specifically we see how self-join’s on a non-recursive CTE are very bad for performance.

Looking at our example query below we are using the Common Table Expression with a self-join and additionally a sub-query on the select; don’t look too much at what I’m doing in the query because it pretty much doesn’t make any sense – the important aspects to concentrate on are the self-join and sub-query.

Example CTE:

with ctePerf ( objname, colid, colname, max_length )

as (

 

      select objname = o.name,

               colid   = c.column_id,

               colname = c.name,

               c.max_length

      from #sysobjs o

            inner join #syscols c on c.object_id=o.object_id

      where o.type='U'

        and o.name= 'accounts'

  )

select p.objname, p.colname, p.colid,

         size_sofar = SUM( p2.max_length ),

         size_left  = ( SELECT SUM( p3.max_length )

                        FROM ctePerf p3

                        WHERE p3.objname=p.objname

                          and p3.colid > p.colid

                                    )

from ctePerf p

      left outer join ctePerf p2 on p2.objname=p.objname

                                            and p2.colid <=p.colid

group by p.objname, p.colname, p.colid

order by objname, colid

Let’s look at the query plan for the query we use as the CTE anchor:

CTE “anchor” query and plan:

select objname = o.name,

       colid   = c.column_id,

       colname = c.name,

       c.max_length

from #sysobjs o

      inner join #syscols c on c.object_id=o.object_id

where o.type='U'

  and o.name= 'accounts'

  |--Hash Match(Inner Join, HASH:([o ].[object_id])=([c ].[object_id]))

       |--Table Scan(OBJECT:([tempdb].[dbo].[#sysobjs] AS [o ]), WHERE:([tempdb].[dbo].[#sysobjs].[type] as [o ].[type]='U' AND [tempdb].[dbo].[#sysobjs].[name] as [o ].[name]=N'accounts'))

       |--Table Scan(OBJECT:([tempdb].[dbo].[#syscols] AS [c ]))

Check the query plan after incorporating the query into the CTE (below), before looking at the query plan note we are referencing the CTE anchor 3 times (shown in bold – aliases p, p2 and p3):

Query and Execution Plan:

with ctePerf ( objname, colid, colname, max_length )

as (

 

      select objname = o.name,

               colid   = c.column_id,

               colname = c.name,

               c.max_length

      from #sysobjs o

            inner join #syscols c on c.object_id=o.object_id

      where o.type='U'

        and o.name= 'accounts'

  )

select p.objname, p.colname, p.colid,

         size_sofar = SUM( p2.max_length ),

         size_left  = ( SELECT SUM( p3.max_length )

                        FROM ctePerf p3

                        WHERE p3.objname=p.objname

                          and p3.colid > p.colid

                                    )

from ctePerf p

      left outer join ctePerf p2 on p2.objname=p.objname

                                            and p2.colid <=p.colid

group by p.objname, p.colname, p.colid

order by objname, colid

  |--Compute Scalar(DEFINE:([Expr1022]=[Expr1022]))

       |--Nested Loops(Inner Join, OUTER REFERENCES:([o ].[name], [c ].[column_id]))

            |--Compute Scalar(DEFINE:([Expr1015]=CASE WHEN [Expr1035]=(0) THEN NULL ELSE [Expr1036] END))

            |    |--Stream Aggregate(GROUP BY:([c ].[column_id], [c ].[name]) DEFINE:([Expr1035]=COUNT_BIG([tempdb].[dbo].[#syscols].[max_length] as [c ].[max_length]), [Expr1036]=SUM([tempdb].[dbo].[#syscols].[max_length] as [c ].[max_length]), [o ].[name]=ANY

            |         |--Nested Loops(Left Outer Join, WHERE:([tempdb].[dbo].[#syscols].[column_id] as [c ].[column_id]<=[tempdb].[dbo].[#syscols].[column_id] as [c ].[column_id]))

            |              |--Sort(ORDER BY:([c ].[column_id] ASC, [c ].[name] ASC))

            |              |    |--Hash Match(Inner Join, HASH:([o ].[object_id])=([c ].[object_id]))

            |              |         |--Table Scan(OBJECT:([tempdb].[dbo].[#sysobjs] AS [o ]), WHERE:([tempdb].[dbo].[#sysobjs].[type] as [o ].[type]='U' AND [tempdb].[dbo].[#sysobjs].[name] as [o ].[name]=N'accounts'))

            |              |         |--Table Scan(OBJECT:([tempdb].[dbo].[#syscols] AS [c ]))

            |              |--Table Spool

            |                   |--Hash Match(Inner Join, HASH:([o ].[object_id])=([c ].[object_id]))

            |                        |--Table Scan(OBJECT:([tempdb].[dbo].[#sysobjs] AS [o ]), WHERE:([tempdb].[dbo].[#sysobjs].[type] as [o ].[type]='U' AND [tempdb].[dbo].[#sysobjs].[name] as [o ].[name]=N'accounts'))

            |                        |--Compute Scalar(DEFINE:([c ].[max_length]=[tempdb].[dbo].[#syscols].[max_length] as [c ].[max_length]))

            |                             |--Table Scan(OBJECT:([tempdb].[dbo].[#syscols] AS [c ]))

            |--Index Spool(SEEK:([o ].[name]=[tempdb].[dbo].[#sysobjs].[name] as [o ].[name] AND [c ].[column_id]=[tempdb].[dbo].[#syscols].[column_id] as [c ].[column_id]))

                 |--Compute Scalar(DEFINE:([Expr1022]=CASE WHEN [Expr1037]=(0) THEN NULL ELSE [Expr1038] END))

                      |--Stream Aggregate(DEFINE:([Expr1037]=Count(*), [Expr1038]=SUM([tempdb].[dbo].[#syscols].[max_length] as [c ].[max_length])))

                           |--Hash Match(Inner Join, HASH:([o ].[object_id])=([c ].[object_id]))

                                |--Table Scan(OBJECT:([tempdb].[dbo].[#sysobjs] AS [o ]), WHERE:([tempdb].[dbo].[#sysobjs].[name] as [o ].[name]=[tempdb].[dbo].[#sysobjs].[name] as [o ].[name] AND [tempdb].[dbo].[#sysobjs].[type] as [o ].[type]='U' AND [tempdb

                                |--Table Scan(OBJECT:([tempdb].[dbo].[#syscols] AS [c ]), WHERE:([tempdb].[dbo].[#syscols].[column_id] as [c ].[column_id]>[tempdb].[dbo].[#syscols].[column_id] as [c ].[column_id]))


If you compare the above query plan to the first, the specific plan for just the anchor you can see the same ‘anchor’ query has pretty much been expanded inline into three parts of the query plan – take a look back at the query and note there are 3 references to the anchor!

The output below is a subset of the information available when you use SET STATISTICS PROFILE ON, I’ve not bothered posting the full STATS PROFILE instead I’ve truncated the output for readability – I’m more concerned with the Rows and Executes columns rather than the query tree.

Table 'Worktable'. Scan count 11, logical reads 56, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table '#syscols'. Scan count 7, logical reads 35, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table '#sysobjs'. Scan count 7, logical reads 21, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Rows  Executions

5      1      with ctePerf ( objname, colid, olid

0      0        |--Compute Scalar(DEFINE:([Expr1022]=[Expr1022]))

5      1             |--Nested Loops(Inner Join, OUTER REFERENCES:([o ].[name], [c ].[column_id]))

0      0                  |--Compute Scalar(DEFINE:([Expr1015]=CASE WHEN [Expr1035]=(0) THEN NUL

5      1                  |    |--Stream Aggregate(GROUP BY:([c ].[column_id], [c ].[name]) DEFINE

15     1                  |         |--Nested Loops(Left Outer Join, WHERE:([tempdb].[dbo].[#sys

5      1                  |              |--Sort(ORDER BY:([c ].[column_id] ASC, [c ].[name] ASC))

5      1                  |              |    |--Hash Match(Inner Join, HASH:([o ].[object_id])=(

1      1                  |              |         |--Table Scan(OBJECT:([tempdb].[dbo].[#sysobjs] AS [o ])

453    1                  |              |         |--Table Scan(OBJECT:([tempdb].[dbo].[#syscols] AS [c ]))

25     5                  |              |--Table Spool

5      1                  |                   |--Hash Match(Inner Join, HASH:([o ].[object_id])=([c ].[object_id]))

1      1                  |                        |--Table Scan(OBJECT:([tempdb].[dbo].[#sysobjs] AS [o ]),

0      0                  |                        |--Compute Scalar(DEFINE:([c ].[max_length]=[tempdb].[dbo].

453    1                  |                             |--Table Scan(OBJECT:([tempdb].[dbo].[#syscols] AS [c ]))

5      5                  |--Index Spool(SEEK:([o ].[name]=[tempdb].[dbo].[#sysobjs].[name] as [o ].[name] AND

0      0                       |--Compute Scalar(DEFINE:([Expr1022]=CASE WHEN [Expr1037]=(0) THEN NULL ELSE [Expr1038] END))

5      5                            |--Stream Aggregate(DEFINE:([Expr1037]=Count(*), [Expr1038]=SUM([tempdb].[dbo].[#syscols]

10     5                                 |--Hash Match(Inner Join, HASH:([o ].[object_id])=([c ].[object_id]))

5      5                                      |--Table Scan(OBJECT:([tempdb].[dbo].[#sysobjs] AS [o ]), WHERE:([tempdb].[dbo].[#sysobjs]

1496   5                                      |--Table Scan(OBJECT:([tempdb].[dbo].[#syscols] AS [c ]), WHERE:([tempdb].[dbo].[#syscols]

Notice the number of rows returned for each of the highlighted steps – it’s running the complete query every time; the 1496 is higher because of what we are doing in the sub-query.

Let’s use a temporary table for the anchor instead; first pump the query into a temporary table then use that in the CTE...

select objname = o.name,

         colid   = c.column_id,

         colname = c.name,

         c.max_length

into #anchor

from #sysobjs o

      inner join #syscols c on c.object_id=o.object_id

where o.type='U'

  and o.name= 'accounts'

 

;with ctePerf ( objname, colid, colname, max_length )

as (

      select objname, colid, colname, max_length

      from #anchor

  )

select p.objname, p.colname, p.colid,

         size_sofar = SUM( p2.max_length ),

         size_left  = ( SELECT SUM( p3.max_length )

                                FROM ctePerf p3

                                WHERE p3.objname=p.objname

                                    and p3.colid > p.colid

                                    )

from ctePerf p

      left outer join ctePerf p2 on p2.objname=p.objname

                                            and p2.colid <=p.colid

group by p.objname, p.colname, p.colid

order by objname, colid

go

 

 

drop table #anchor

 

Execution plans...

select objname = o.name,

         colid   = c.column_id,

         colname = c.name,

         c.max_length

into #anchor

from #sysobjs o

      inner join #syscols c on c.object_id=o.object_id

where o.type='U'

  and o.name= 'accounts'

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table '#syscols'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table '#sysobjs'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 

Rows  Executions

5      1      select objname = o.name,       colid   = c.column_id,      colname = c.name,

5      1        |--Table Insert(OBJECT:([#anchor]), SET:([#anchor].[objname] = [tempdb].[d

5      1             |--Top(ROWCOUNT est 0)

5      1                  |--Hash Match(Inner Join, HASH:([o ].[object_id])=([c ].[object_id]))

1      1                       |--Table Scan(OBJECT:([tempdb].[dbo].[#sysobjs] AS [o ]), WHERE:([tempdb].

453    1                       |--Table Scan(OBJECT:([tempdb].[dbo].[#syscols] AS [c ]))

 

;with ctePerf ( objname, colid, colname, max_length )

as (

      select objname, colid, colname, max_length

      from #anchor

  )

select p.objname, p.colname, p.colid,

         size_sofar = SUM( p2.max_length ),

         size_left  = ( SELECT SUM( p3.max_length )

                                FROM ctePerf p3

                                WHERE p3.objname=p.objname

                                    and p3.colid > p.colid

                                    )

from ctePerf p

      left outer join ctePerf p2 on p2.objname=p.objname

                                            and p2.colid <=p.colid

group by p.objname, p.colname, p.colid

order by objname, colid

 

Table '#anchor'. Scan count 7, logical reads 11, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


Rows  Executions

5     1       with ctePerf ( objname, colid, colname, max_length

0     0         |--Compute Scalar(DEFINE:([Expr1016]=[Expr1016]))

5     1              |--Nested Loops(Inner Join, OUTER REFERENCES:([tempdb].[dbo].[#a

0     0                   |--Compute Scalar(DEFINE:([Expr1011]=CASE WHEN [Expr1029]=(

5     1                   |    |--Stream Aggregate(GROUP BY:([tempdb].[dbo].[#anchor]

15    1                   |         |--Nested Loops(Left Outer Join, OUTER REFERENCES

5     1                   |              |--Sort(ORDER BY:([tempdb].[dbo].[#anchor].[

5     1                   |              |    |--Table Scan(OBJECT:([tempdb].[dbo].[#

0     0                   |              |--Compute Scalar(DEFINE:([tempdb].[dbo].[#a

15    5                   |                   |--Table Scan(OBJECT:([tempdb].[dbo].[#

0     0                   |--Compute Scalar(DEFINE:([Expr1016]=CASE WHEN [Expr1031]=(

5     5                        |--Stream Aggregate(DEFINE:([Expr1031]=Count(*), [Expr

10    5                             |--Table Scan(OBJECT:([tempdb].[dbo].[#anchor]),    

Comparing the results from the above you can clearly see that the IO is a lot less (19 logical reads using the # table against 112 just using the CTE) – that is also shown by the Rows column in the STATISTICS PROFILE output.

But looking at an example on more realistic volumes you can clearly see the difference. The example below is on a transaction table and gives running totals for a subset of accounts.

with Trans ( row_no, id, salesperson_id, tran_date, clear_date, amount, transaction_types_id, account_id )

as (

    select row_no = id, id, salesperson_id, tran_date, clear_date, amount, transaction_types_id, account_id

    from Transactions

    where account_id between 1000 and 1020

    )

select t1.*,

       running_amount = coalesce(

                         (  select sum( amount )

                            from Trans t2

                            where t2.row_no < t1.row_no

                              and t2.account_id = t1.account_id ), 0 )

                        + amount

from Trans t1

order by account_id, row_no


Table 'transactions'. Scan count 1156, logical reads 54084, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 

Rows  Executions

588    1      with Trans ( row_no, id, salesperson_id, tran_date, clear_date, amount, transaction_types_i

0      0        |--Compute Scalar(DEFINE:([Expr1013]=CASE WHEN [Expr1006] IS NOT NULL THEN [Expr1011] ELS

588    1             |--Nested Loops(Inner Join, PASSTHRU:(IsFalseOrNull [Expr1006] IS NOT NULL), OUTER R

588    1                  |--Nested Loops(Inner Join, OUTER REFERENCES:([SQLBits20080301].[dbo].[transact

588    1                  |    |--Nested Loops(Inner Join, OUTER REFERENCES:([SQLBits20080301].[dbo].[tra

588    1                  |    |    |--Index Seek(OBJECT:([SQLBits20080301].[dbo].[transactions].[ncidx_a

588    588                |    |    |--Clustered Index Seek(OBJECT:([SQLBits20080301].[dbo].[transactions

0      0                  |    |--Compute Scalar(DEFINE:([Expr1006]=CASE WHEN [Expr1027]=(0) THEN NULL EL

588    588                |         |--Stream Aggregate(DEFINE:([Expr1027]=Count(*), [Expr1028]=SUM([SQLB

8132   588                |              |--Nested Loops(Inner Join, OUTER REFERENCES:([SQLBits20080301].

8132   588                |                   |--Index Seek(OBJECT:([SQLBits20080301].[dbo].[transactions

8132   8132               |                   |--Clustered Index Seek(OBJECT:([SQLBits20080301].[dbo].[tr

0      0                  |--Compute Scalar(DEFINE:([Expr1011]=CASE WHEN [Expr1029]=(0) THEN NULL ELSE [E

567    567                     |--Stream Aggregate(DEFINE:([Expr1029]=Count(*), [Expr1030]=SUM([SQLBits20

8132   567                          |--Nested Loops(Inner Join, OUTER REFERENCES:([SQLBits20080301].[dbo]

8132   567                               |--Index Seek(OBJECT:([SQLBits20080301].[dbo].[transactions].[nc

8132   8132                              |--Clustered Index Seek(OBJECT:([SQLBits20080301].[dbo].[transact

Same query just the anchor is sourced from a temporary table...

select row_no = id, id, salesperson_id, tran_date, clear_date, amount, transaction_types_id, account_id

into #t

from Transactions

where account_id between 1000 and 1020;

 

create unique clustered index clidx on #t ( account_id, row_no )

Table 'transactions'. Scan count 1, logical reads 1813, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 

(588 row(s) affected)

 

Table '#t'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Rows  Executions

588    1      select row_no = id, id, salesperson_id, tran_date, clear_date, amount, transaction_types_id, account_id

588    1        |--Table Insert(OBJECT:([#t]), SET:([#t].[row_no] = [SQLBits20080301].[dbo].[transactions].[id],[#t].

588    1             |--Top(ROWCOUNT est 0)

588    1                  |--Nested Loops(Inner Join, OUTER REFERENCES:([SQLBits20080301].[dbo].[transactions].[id], [Exp

588    1                       |--Index Seek(OBJECT:([SQLBits20080301].[dbo].[transactions].[ncidx_account_id]), SEEK:([S

588    588                     |--Clustered Index Seek(OBJECT:([SQLBits20080301].[dbo].[transactions].[pk_transactions]),

;with Trans ( row_no, id, salesperson_id, tran_date, clear_date, amount, transaction_types_id, account_id )

as (

    select *

    from #t

    )

select t1.*,

       running_amount = coalesce(

                         (  select sum( amount )

                            from Trans t2

                            where t2.row_no < t1.row_no

                              and t2.account_id = t1.account_id ), 0 )

                        + amount

from Trans t1

order by account_id, row_no

go


Table '#t'. Scan count 1156, logical reads 2400, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


Rows  Executions

588    1      with Trans ( row_no, id, salesperson_id, tran_date, clear_date, amount, transaction_types_id, account_

0      0        |--Compute Scalar(DEFINE:([Expr1013]=CASE WHEN [Expr1006] IS NOT NULL THEN [Expr1011] ELSE (0.00) EN

588    1             |--Nested Loops(Inner Join, PASSTHRU:(IsFalseOrNull [Expr1006] IS NOT NULL), OUTER REFERENCES:(

588    1                  |--Nested Loops(Inner Join, OUTER REFERENCES:([tempdb].[dbo].[#t].[row_no], [tempdb].[dbo]

588    1                  |    |--Clustered Index Scan(OBJECT:([tempdb].[dbo].[#t]), ORDERED FORWARD)

0      0                  |    |--Compute Scalar(DEFINE:([Expr1006]=CASE WHEN [Expr1026]=(0) THEN NULL ELSE [Expr1027] END))

588    588                |         |--Stream Aggregate(DEFINE:([Expr1026]=Count(*), [Expr1027]=SUM([tempdb].[dbo].[

8132   588                |              |--Clustered Index Seek(OBJECT:([tempdb].[dbo].[#t]), SEEK:([tempdb].[dbo].

0      0                  |--Compute Scalar(DEFINE:([Expr1011]=CASE WHEN [Expr1028]=(0) THEN NULL ELSE [Expr1029] END))

567    567                     |--Stream Aggregate(DEFINE:([Expr1028]=Count(*), [Expr1029]=SUM([tempdb].[dbo].[#t].[

8132   567                          |--Clustered Index Seek(OBJECT:([tempdb].[dbo].[#t]), SEEK:([tempdb].[dbo].[#t].

As you can see the pure CTE method costs 54,084 logical reads whereas using the # table method costs just 4,218 and look at the differences in the Rows and Executions between the CTE and the # with CTE method, I do actually use the ROW_NUMBER function in my real query example but that highlights yet another performance problem but I’ll save that for another post.

There is another problem here as well that will work against you in the performance stakes – statistics! The CTE is expanding the anchor SQL into the other bits of the query like an inline macro and as such SQL Server doesn’t have the statistics on the subset of rows the anchor produces. By using the temporary you’ve sucked the results out and have the ability to create an index(s) on that # table which the optimiser will then use when it recompiles the CTE statement.

I think my only advice here is – unless your application has a high concurrent number of users and you are suffering COMPILE locks because of recompiles then don’t use Common Table Expressions unless you need recursion; the problem described above does not exist on the CTE when you use recursion because of the way recursion works. If you are simply using the # table in the anchor as I describe above then don’t use the CTE at all – just use the # table in the first place!

 

Filed under: , ,

Comments

# Common Table Expressions (CTE's) - How it works; How Recursion Works; Using with Adjacency List

Recursion in Common Table Expressions (CTE’s), how does it work? How can I use it with the adjacency

# Bug in inline expansion of non-deterministic functions in derived tables and CTE's causes incorrect results

Using non-deterministic functions in CTE&#39;s gives incorrect results, this follows on from two things

# Question on building a table | keyongtech

Pingback from  Question on building a table | keyongtech

# CTE Question | keyongtech

23 March 2009 16:07 by CTE Question | keyongtech

Pingback from  CTE Question | keyongtech