
SQL Server Blogs

Voices from the UK SQL Server Community

Jason Massie's SQL blog

January 2008 - Posts

  • 2008 to RTM next week

    Windows Server 2008, that is.

    "

    Windows Server 2008 (and Vista Service Pack 1) are slated to release to manufacturing (RTM) on Feb. 6, according to several sources. That gives the company plenty of time to churn out disks for distribution at the event, which Microsoft executives have characterised as the company's "biggest enterprise launch ever."

The early February RTM means that the long-awaited server operating system will be available for the big event. Visual Studio 2008 is already out. Microsoft last week said SQL Server 2008 has slipped into the third quarter. Previously, the company said the database would be available in the second quarter. Microsoft has also promised to deliver new database releases within 36 to 48 months going forward. SQL Server 2005 shipped in November 2005."

    The full story can be found here.

     


    * cross posted from http://statisticsio.com *

  • Never use table variables?

MS pushed table variables too hard back in 2000, and developers went a little crazy with them. However, we found out that they were not the greatest thing since sliced bread, especially when the result set is more than a few records and the query is even mildly complex.

The other case we hear for table variables is avoiding recompilations. That was true in SQL Server 2000; it has changed somewhat in SQL 2005, though you might not realize it from reading some web sites out there. On top of that, I cannot reproduce recompiles until much higher thresholds than the documentation says we should see. In most scenarios that is a good thing, IMO.
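To make that concrete, here is a minimal sketch of the statistics difference (the #t/@t names are just for illustration, and it assumes the AdventureWorks database used elsewhere on this page). A temp table gets auto-created column statistics; a table variable never does, so the optimizer falls back to a fixed guess, typically 1 row:

    create table #t (a int)
    declare @t table (a int)

    -- load both with the same data
    insert into #t select ProductID from sales.SalesOrderDetail
    insert into @t select ProductID from sales.SalesOrderDetail

    -- compare the estimated row counts in the two query plans:
    select a from #t where a < 800 -- estimate comes from statistics on #t
    select a from @t where a < 800 -- estimate is a guess, typically 1 row

    drop table #t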

Based on this blog post, which is part of a great procedure cache series, we should see a recompile after 6 row modifications when the table starts empty, after 500 modifications when the table's cardinality is between 6 and 500, and after 500 + 0.20 * n modifications when the cardinality n is greater than 500.

    "

After 6 modifications to an empty temporary table, any stored procedure referencing that temporary table will need to be recompiled because the temporary table's statistics need to be refreshed.

    The recompilation threshold for a table partly determines the frequency with which queries that refer to the table recompile. Recompilation threshold depends on the table type (permanent vs temporary), and the cardinality (number of rows in the table) when a query plan is compiled. The recompilation thresholds for all of the tables referenced in a batch are stored with the query plans of that batch.

    Recompilation threshold is calculated as follows for temporary tables: n is the cardinality of the temporary table when the query plan is compiled.

    If n < 6, Recompilation threshold = 6.

    If 6 <= n <= 500, Recompilation threshold = 500.

    If n > 500, Recompilation threshold = 500 + 0.20 * n.

    "

That blog post mirrors the numbers in this must-read white paper. Both the blog and the white paper use this example.

    create procedure RowCountDemo 
    as
    begin
        create table #t1 (a int, b int)

        declare @i int
    set @i = 0
    while (@i < 20)
        begin
           insert into #t1 values (@i, 2*@i - 50)

           select a
           from #t1
           where a < 10 or ((b > 20 or a >=100) and (a < 10000))
           group by a
     
           set @i = @i + 1
        end
    end

     

Now here is the interesting part... I cannot get it to recompile. I have tried SQL 2005 RTM, SP2, and SP2 + build 3054. The initial run shows up as a recompile in a trace, but subsequent runs do not. Not at @i = 100, 500, or 1000. At precisely @i = 1108, recompilation happens every time.

    alter procedure RowCountDemo
    as
    begin
        create table #t1 (a int, b int)
     
        declare @i int
    set @i = 0
    while (@i < 1108)
        begin
           insert into #t1 values (@i, 2*@i - 50)
     
           select a
           from #t1 
           where a < 10 or ((b > 20 or a >=100) and (a < 10000))
           group by a
     
           set @i = @i + 1
        end
    end 
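If you want to test this on your own server without setting up a trace, here is a minimal sketch (SQL 2005 and later). sys.dm_exec_query_stats exposes plan_generation_num, which increments each time the cached statement is recompiled:

    exec RowCountDemo

    select st.text, qs.plan_generation_num, qs.execution_count
    from sys.dm_exec_query_stats as qs
    cross apply sys.dm_exec_sql_text(qs.sql_handle) as st
    where st.text like '%RowCountDemo%'
    and st.text not like '%dm_exec_query_stats%' -- filter out this query itself

Run the proc a few times and watch whether plan_generation_num moves.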

    Now there may be something totally flawed in my understanding. I am sure you guys will point that out if it is the case :) But the white paper states:

    "Recall that the recompilation threshold for a temporary table is 6 when the table is empty when the threshold is calculated. When RowCountDemo is executed, a "statistics changed"-related recompilation can be observed after #t1 contains exactly 6 rows. By changing the upper bound of the "while" loop, more recompilations can be observed."

If temp tables really do not cause recompilations at 6 rows, I really cannot think of a good reason to use table variables except for small sets, and then only out of preference.


     * cross posted from http://statisticsio.com *

  • SQL Server 2008 RTM delayed until Q3

    "

    Microsoft is excited to deliver a feature complete CTP during the Heroes Happen Here launch wave and a release candidate (RC) in Q2 calendar year 2008, with final Release to manufacturing (RTM) of SQL Server 2008 expected in Q3. Our goal is to deliver the highest quality product possible and we simply want to use the time to meet the high bar that you, our customers, expect.

    "

    Read more

* cross posted from http://statisticsio.com *

  • The problem with local variables

Have you ever been writing a query and just could not get it to use the right index? This could be one of the reasons why. Let's use this query with local variables as our example.

    declare @Start datetime
    declare @End datetime
    select @Start = '2004-08-01 00:00:00.000'
    select @End = '2004-07-28 00:00:00.000'
    select ProductID from sales.SalesOrderDetail
    where ModifiedDate >= @End and ModifiedDate <= @Start
     

[Execution plan: clustered index scan]

But we have an index on ModifiedDate. There are many reasons why SQL Server might not use this index, but for this post we will assume we have eliminated them. Finally, we hard-code the dates and we get this plan.

[Execution plan: index seek with bookmark lookup]

So why is it doing this? The reason is that the query optimizer cannot accurately use the statistics to estimate how many rows will be returned when local variables are involved. Let's look at how we can tell there is a problem with the cardinality estimates. In the query with local variables, the optimizer thinks we are getting 10918.5 rows, so we get the index scan. In the query with hard-coded literals, the estimated and actual row counts match.

[Execution plan: estimated vs. actual row counts]
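If you want to reproduce the comparison without the graphical plan, here is a quick sketch using SET STATISTICS PROFILE; compare the EstimateRows and Rows columns in the output:

    set statistics profile on
    go
    declare @Start datetime
    declare @End datetime
    select @Start = '2004-08-01 00:00:00.000'
    select @End = '2004-07-28 00:00:00.000'
    select ProductID from sales.SalesOrderDetail
    where ModifiedDate >= @End and ModifiedDate <= @Start
    go
    set statistics profile off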

How can we fix this? There are several ways. This is another situation that makes the case for stored procedures or parameterized queries.

    create proc pDemo01
    @Start datetime,
    @End datetime
    as
    select ProductID from sales.SalesOrderDetail
    where ModifiedDate >= @End and ModifiedDate <= @Start


    exec pDemo01 '2004-08-01 00:00:00.000', '2004-07-28 00:00:00.000'

The stored proc generates the proper plan. However, you will run into the same problem if you modify the parameter within the stored proc, such as select @Start = @Start - 90. In that case, you should use sp_executeSQL (a sketch follows the index below). What if you cannot use a stored proc because it is a 3rd party app or for some other reason? A covering index is probably the solution. Once we create this index, it will always be used:

    create index ix01 on sales.SalesOrderDetail(ModifiedDate) include (ProductID)
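And back to the sp_executeSQL option mentioned above, here is a minimal sketch (the 90-day adjustment is just for illustration). Because the values travel as real parameters, the optimizer can sniff them even after they have been modified:

    declare @Start datetime, @End datetime
    select @Start = '2004-08-01 00:00:00.000'
    select @End = dateadd(day, -90, @Start) -- the modified value

    exec sp_executesql
        N'select ProductID from sales.SalesOrderDetail
          where ModifiedDate >= @End and ModifiedDate <= @Start',
        N'@Start datetime, @End datetime',
        @Start = @Start, @End = @End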

We could use a plan guide, or an index hint with FORCESEEK (SQL 2008), but performance will be really bad when we really do need to get 10k rows. The same problem can happen with stored procs, but that is another post.

    To get deeper into this subject, check out this.

* cross posted from http://statisticsio.com *
  • SQL Server 2008 Plan Guides from Cache

Uh oh... I can see some junior developers going crazy with this. One of the things that kept plan guides from being overused was the fact that they are kind of hard :) Well, Microsoft built their empire making hard stuff easy. They do it again with sp_create_plan_guide_from_cache.

    Let's look at this BOL sample.

     

    USE AdventureWorks;
    GO
    SELECT WorkOrderID, p.Name, OrderQty, DueDate
    FROM Production.WorkOrder AS w
    JOIN Production.Product AS p ON w.ProductID = p.ProductID
    WHERE p.ProductSubcategoryID > 4
    ORDER BY p.Name, DueDate;
    GO
    -- Inspect the query plan by using dynamic management views.
    SELECT * FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS st
    CROSS APPLY sys.dm_exec_text_query_plan(qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset) AS qp
    WHERE text LIKE N'SELECT WorkOrderID, p.Name, OrderQty, DueDate%';
    GO
    -- Create a plan guide for the query by specifying the query plan in the plan cache.
    DECLARE @plan_handle varbinary(64);
    DECLARE @offset int;
    SELECT @plan_handle = plan_handle, @offset = qs.statement_start_offset
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS st
    CROSS APPLY sys.dm_exec_text_query_plan(qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset) AS qp
    WHERE text LIKE N'SELECT WorkOrderID, p.Name, OrderQty, DueDate%';

    EXECUTE sp_create_plan_guide_from_cache
        @name = N'Guide1',
        @plan_handle = @plan_handle,
        @statement_start_offset = @offset;
    GO
    -- Verify that the plan guide is created.
    SELECT * FROM sys.plan_guides
    WHERE scope_batch LIKE N'SELECT WorkOrderID, p.Name, OrderQty, DueDate%';
    GO

    -- Let's verify it actually worked:
    -- click the XML link, save it as a .sqlplan, reopen it in SSMS, and hit F4.
    set statistics xml on
    go
    SELECT WorkOrderID, p.Name, OrderQty, DueDate
    FROM Production.WorkOrder AS w
    JOIN Production.Product AS p ON w.ProductID = p.ProductID
    WHERE p.ProductSubcategoryID > 4
    ORDER BY p.Name, DueDate;
    GO

So when would you use this? Hopefully hardly ever. However, it could solve some really tough problems.

Let's say you have a 3rd party application that generates ad hoc dynamic SQL. You cannot modify the code or schema, and index changes are not supported. Sometimes parameter sniffing causes unpredictable performance. Sound like a nightmare? Welcome to most CRM apps.

Other scenarios that come to mind are when best practices are not or cannot be followed. Let's say you just cannot update stats often enough, with a large enough sample, on a very very VERY large table to get a consistently optimal plan. Use a plan guide!

    Here are some other times that the optimizer might have trouble and a plan guide may be a good option.

    • Use of local variables
• Modifying stored proc parameters
    • Ascending keys
    • Complex queries with table variables

There are usually better solutions than plan guides, so save them for the times when best practices are not an option. sp_create_plan_guide_from_cache makes using plan guides so much easier. Put it in your toolbox!
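One housekeeping note: a plan guide pins a plan until you remove it, so keep track of the guides you create. For the Guide1 created in the BOL sample above, disabling or dropping it looks like this:

    -- disable the guide but keep its definition around
    EXECUTE sp_control_plan_guide N'DISABLE', N'Guide1';

    -- or remove it entirely
    EXECUTE sp_control_plan_guide N'DROP', N'Guide1';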

     

* cross posted from http://statisticsio.com *
