SQL and the like

Dave Ballantyne's blog. Freelance SQL Server database designer and developer at Clear Sky SQL

November 2010 - Posts

Denali Paging–Key seek lookups

In my previous post “Denali Paging – is it win.win ?” I demonstrated the use of using the Paging functionality within Denali.  On reflection,  I think i may of been a little unfair and should of continued always planned to continue my investigations to the next step.

In Pauls article, he uses a combination of ctes to first scan the ordered keys which is then filtered using TOP and rownumber and then uses those keys to seek the data.  So what happens if we replace the scanning portion of the code with the denali paging functionality.

Heres the original procedure,  we are going to replace the functionality of the Keys and SelectedKeys ctes :

CREATE  PROCEDURE dbo.FetchPageKeySeek
       
@PageSize   BIGINT,
       
@PageNumber BIGINT
AS
BEGIN
       
-- Key-Seek algorithm
       
WITH    Keys
       
AS      (
               
-- Step 1 : Number the rows from the non-clustered index
                -- Maximum number of rows = @PageNumber * @PageSize
               
SELECT  TOP (@PageNumber * @PageSize)
                       
rn = ROW_NUMBER() OVER (ORDER BY P1.post_id ASC),
                       
P1.post_id
               
FROM    dbo.Post P1
               
ORDER   BY
                       
P1.post_id ASC
               
),
               
SelectedKeys
       
AS      (
               
-- Step 2 : Get the primary keys for the rows on the page we want
                -- Maximum number of rows from this stage = @PageSize
               
SELECT  TOP (@PageSize)
                       
SK.rn,
                       
SK.post_id
               
FROM    Keys SK
               
WHERE   SK.rn > ((@PageNumber - 1) * @PageSize)
               
ORDER   BY
                       
SK.post_id ASC
               
)
       
SELECT  -- Step 3 : Retrieve the off-index data
                -- We will only have @PageSize rows by this stage
               
SK.rn,
               
P2.post_id,
               
P2.thread_id,
               
P2.member_id,
               
P2.create_dt,
               
P2.title,
               
P2.body
       
FROM    SelectedKeys SK
       
JOIN    dbo.Post P2
               
ON  P2.post_id = SK.post_id
       
ORDER   BY
               
SK.post_id ASC;
END;

and here is the replacement procedure using paging:

CREATE  PROCEDURE dbo.FetchOffsetPageKeySeek
       
@PageSize   BIGINT,
       
@PageNumber BIGINT
AS
BEGIN
       
-- Key-Seek algorithm
       
WITH    SelectedKeys
       
AS      (
               
SELECT  post_id
               
FROM    dbo.Post P1
               
ORDER   BY post_id ASC
               
OFFSET  @PageSize * (@PageNumber-1) ROWS
               
FETCH NEXT @PageSize ROWS ONLY
               
)
       
SELECT  P2.post_id,
               
P2.thread_id,
               
P2.member_id,
               
P2.create_dt,
               
P2.title,
               
P2.body
       
FROM    SelectedKeys SK
       
JOIN    dbo.Post P2
               
ON  P2.post_id = SK.post_id
       
ORDER   BY
               
SK.post_id ASC;
END;

Notice how all i have done is replace the functionality with the Keys and SelectedKeys CTEs with the paging functionality.

So , what is the comparative performance now ?.

offsetfetch

Exactly the same amount of IO and memory usage , but its now pretty obvious that in terms of CPU and overall duration we are onto a winner.

 

 

Denali Paging–Is it win/win ?

UPDATE : 2012-07-26 : My latest commentry of the paging subject can be found here :  http://sqlblogcasts.com/blogs/sqlandthelike/archive/2012/04/26/offset-without-offset.aspx

 

I'm not one for automatically poo-pooing new ideas, new practices and new technology.  Rather I like to take a measured view and try to evaluate based upon merits.

One of the new features in Denali is paging which is now supported by the FETCH option of the ORDER BY clause.  On the face of it, it does seem like a really neat addition to the product set,  but as ever we do need to find out what the real costs associated with a query are.

So, using Paul Whites (blog|twitter) rather excellent series on paging as a comparison base ive created a simple stored proc using FETCH.

DROP PROCEDURE dbo.DenaliPageFetch
GO
CREATE PROCEDURE dbo.DenaliPageFetch   @PageSize   BIGINT,
                                      
@PageNumber BIGINT

AS

SELECT 
P2.post_id,
       
P2.thread_id,
       
P2.member_id,
       
P2.create_dt,
       
P2.title,
       
P2.body
FROM    dbo.Post P2
ORDER   BY p2.post_id ASC
OFFSET  @PageSize * (@PageNumber-1) ROWS
FETCH NEXT @PageSize ROWS ONLY

and then i’ve compared the performance of Paul’s FetchPageKeySeek procedure to my new DenaliPageFetch.  Starting of at a low range , 1st page of 50 rows, the denali option does indeed give better performance , but by jumping to the 200th page of 50 rows we see a rather different picture emerge.

image

I think those figures speak for themselves,  so does the execution plan give a reason for this high cost ?

image

 

Yup, that’s right is has scanned through the top 10000 rows.  Admittedly this is only CTP1 and maybe Microsoft will invest some more time and effort into this area , but right now im not going to.