
SQL Server Blogs

Voices from the UK SQL Server Community

Jason Massie's SQL blog

  • SQL 2008 CTP6

  • SQL Server LinkedIn group

    I have created a LinkedIn SQL Server group. Join up.

    SQL Server Professionals

  • Creating a custom data collection in SQL Server 2008

    In the first post, we took a quick look at how data collection works. In this post, we will see how to create a custom data collection.

    To create a custom collection, you must use the stored procedures. In Bill Ramos's webcast, he hints it may stay like this, i.e. no GUI. I tend to doubt it, because Microsoft built their empire making hard stuff easy.

    In this example, we will look at collecting three key indicators of a CPU bottleneck. These items are actually included in the "Server Activity" system data collection. However, that set cannot be modified, and it collects everything but the kitchen sink. Multiply that by 20 or 100 servers and you are looking at some huge storage requirements. I think smaller data collections like this will be common.

    Let's get to business...

    use msdb;

    --Let's create the collection set
    Declare @collection_set_id_1 int
    Declare @collection_set_uid_2 uniqueidentifier

    EXEC [dbo].[sp_syscollector_create_collection_set]
          @name=N'CPU indicators',
          @collection_mode=0, --Let's start in cached mode.
          @description=N'Collects CPU KPIs from perfmon and DMVs',
          @target=N'', --Undocumented
          @logging_level=0, --0 through 2 are valid
          @days_until_expiration=5, --Let's just keep data 5 days. We will rollup for reporting.
          @proxy_name=N'', --Use if you want it to run under something other than the SQL Agent svc account.
          @schedule_name=N'CollectorSchedule_Every_5min', --Built in schedule
          @collection_set_id=@collection_set_id_1 OUTPUT,
          @collection_set_uid=@collection_set_uid_2 OUTPUT

    Select @collection_set_id_1, @collection_set_uid_2

    --Let's get the needed perfmon counters
    Declare @collector_type_uid_7 uniqueidentifier
    Select @collector_type_uid_7 = collector_type_uid From [dbo].[syscollector_collector_types] Where name = N'Performance Counters Collector Type';

    Declare @collection_item_id_8 int
    EXEC [dbo].[sp_syscollector_create_collection_item]
          @name=N'Perfmon CPU counters',
          @parameters=N'
    <PerformanceCountersCollector>
      <PerformanceCounters Objects="SYSTEM" Counters="Processor Queue Length" Instances="*" />
      <PerformanceCounters Objects="Processor" Counters="% Processor Time" Instances="*" />
    </PerformanceCountersCollector>',
          @collection_item_id=@collection_item_id_8 OUTPUT,
          @frequency=5,
          @collection_set_id=@collection_set_id_1, --Output from sp_syscollector_create_collection_set
          @collector_type_uid=@collector_type_uid_7

    Select @collection_item_id_8

    --Let's get the DMV data
    Declare @collector_type_uid_3 uniqueidentifier
    Select @collector_type_uid_3 = collector_type_uid From [dbo].[syscollector_collector_types] Where name = N'Generic T-SQL Query Collector Type';

    Declare @collection_item_id_4 int
    EXEC [dbo].[sp_syscollector_create_collection_item]
          @name=N'CPU pressure check',
          @parameters=N'
    <TSQLQueryCollector>
      <Query>
        <Value>select SUM(runnable_tasks_count) from sys.dm_os_schedulers where scheduler_id > 2</Value>
        <OutputTable>CPU_Pressure_OS_Schedulers</OutputTable>
      </Query>
      <Databases UseSystemDatabases="true" UseUserDatabases="true" />
    </TSQLQueryCollector>',
          @collection_item_id=@collection_item_id_4 OUTPUT,
          @frequency=5,
          @collection_set_id=@collection_set_id_1, --Output from sp_syscollector_create_collection_set
          @collector_type_uid=@collector_type_uid_3

    Select @collection_item_id_4

    --Let's start the collection
    exec sp_syscollector_start_collection_set @collection_set_id = @collection_set_id_1 --Output from sp_syscollector_create_collection_set

     

    We can now see in SSMS that the collection set has been created. This is what it looks like:

    [screenshot: the new collection set in SSMS]

    That's it. In the next part, we will look at reporting off of this collection through T-SQL and SSRS.
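    If you want to peek at the raw data before the reporting post, the collected values end up in the management data warehouse. Here is a minimal sketch, assuming the warehouse database is named MDW (adjust the name; the schema and table names follow the standard data collector layout, so treat this as a sketch rather than gospel):

    use MDW; -- assumption: the management data warehouse is named MDW

    -- The Generic T-SQL Query collector writes to the custom_snapshots schema,
    -- in a table named after the <OutputTable> element of the collection item.
    select top 10 * from custom_snapshots.CPU_Pressure_OS_Schedulers;

    -- The Performance Counters collector uploads to snapshots.performance_counters.
    select top 10 * from snapshots.performance_counters;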

  • SQL Server 2008 performance data collection in 5 minutes

    This will be a quick and dirty post on data collection in SQL Server 2008. The next two posts will cover creating custom data collection sets and integrating Reporting Services. For a deeper understanding of data collections, check out Bill Ramos's webcast. He talks about it in this post.

    Data collection is basically a separate application that collects data about your SQL Server 2008 server, stores it in SQL Server and provides reporting. It is like some of those 3rd-party applications from vendors like Quest and Idera. It is fairly simple in design, which lets you get up to speed and extend it easily. That is where it may excel over the 3rd-party products.

    The moving parts include an executable called dcexec.exe, a warehouse database, SSIS, SQL Agent jobs and your data sources. The sources can be perfmon counters, DMVs, SQL traces and even application data. It was designed to minimize the stress on the source system, but the warehouse database should reside on a different system.

    This feature comes with 3 (as of right now) system collections; a quick query to list them follows the bullets below.

    • Disk Usage - This one basically logs the info from sp_spaceused for each database. They say that disk-space-related performance counters may end up in here as well.
    • Query Statistics - This collects "notable" queries from the procedure cache.
    • Server Activity - This collects data from system and OS performance counters as well as the SQLOS DMVs.
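    Here is a sketch of how to see those collection sets, and whether they are running, straight from the syscollector catalog views in msdb (column names are as I recall them from the CTP, so treat this as a sketch):

    use msdb;

    -- The configured collection sets and their current state.
    select name, collection_mode, days_until_expiration, is_running
    from dbo.syscollector_collection_sets;

    -- The items (perfmon counters, T-SQL queries, traces) inside each set.
    select collection_set_id, name, frequency
    from dbo.syscollector_collection_items;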

    For a deeper discussion of this topic, please check out the web cast. The next post will demo setting up a custom collection. Part 3 will be on reporting off of the data collection.

     


     

    * Cross posted from http://statisticsio.com *

  • Migrating your SQL Cluster to a new SAN

    There is a nice utility that does a lot of the heavy lifting for you when migrating your cluster to a new SAN: ClusterRecovery.exe. It is designed for replacing a failed disk, but when you migrate to a new SAN you have to go through the same steps. It basically moves all of the metadata, like dependencies. The utility can be found in the Win2k3 resource kit.

    The basic steps are:

    1. Present storage.

    2. Partition\format\letter. Pick any free letter; we will swap letters later. Let's choose h:\.

    3. Stop SQL and move data from old SAN to new SAN.

    4. Add the h:\ drive to  the SQL cluster group.

    5. Run the Cluster Recovery tool and connect to the cluster. Choose the old and new drives.

    [screenshot: Cluster Recovery tool]

    6. Open Disk Management and change the original g:\ drive to a different letter.

    7. Change the new drive's letter to the original drive's letter (h:\ to g:\ in our example).

    8. Bring the drives and services online.

    9. Remove the old drive resource with the "(lost)" suffix.

     

    That is about it.

    * Cross posted from http://statisticsio.com *

  • 2008 to RTM next week

    Windows Server 2008, that is.

    "

    Windows Server 2008 (and Vista Service Pack 1) are slated to release to manufacturing (RTM) on Feb. 6, according to several sources. That gives the company plenty of time to churn out disks for distribution at the event, which Microsoft executives have characterised as the company's "biggest enterprise launch ever."

    The early February RTM means that the long-awaited server operating system will be available for the big event. Visual Studio 2008 is already out. Microsoft last week said SQL Server 2008 has slipped into the third quarter. Previously, the company said the database would be available in the second quarter. Microsoft has also promised to deliver its new database release in 36 to 48 months going forward. SQL Server 2005 shipped in November 2005."

    The full story can be found here.

     


     

     

    * cross posted from http://statisticsio.com *

  • Never use table variables?

    MS pushed table variables too hard back in 2000, and developers went a little crazy with them. However, we found out that they were not the greatest thing since sliced bread, especially when the result set is more than a few records and the query is even mildly complex.

    The other case we hear for table variables is to avoid recompilations. This was true in SQL Server 2000. This has changed somewhat in SQL 2005, but you might not realize it by reading some web sites out there. On top of that, I cannot reproduce recompiles until much higher thresholds than we should see per the documentation. This is a good thing in most scenarios, IMO.

    Based on this blog post, which is part of a great procedure cache series, we should see a recompile after 6 changes (for a table compiled while empty), after 500 changes (for a table with up to 500 rows), and after 500 + 0.20 * n changes (for larger tables), where n is the cardinality of the table when the plan was compiled.

    "

    After 6 modifications to an empty temporary table any stored procedure referencing that temporary table will need to be recompiled because the temporary table statistics needs to be refreshed.

    The recompilation threshold for a table partly determines the frequency with which queries that refer to the table recompile. Recompilation threshold depends on the table type (permanent vs temporary), and the cardinality (number of rows in the table) when a query plan is compiled. The recompilation thresholds for all of the tables referenced in a batch are stored with the query plans of that batch.

    Recompilation threshold is calculated as follows for temporary tables: n is the cardinality of the temporary table when the query plan is compiled.

    If n < 6, Recompilation threshold = 6.

    If 6 <= n <= 500, Recompilation threshold = 500.

    If n > 500, Recompilation threshold = 500 + 0.20 * n.

    "

    That blog post mirrors the numbers in this must-read white paper. Both the blog and the white paper use this example.

    create procedure RowCountDemo
    as
    begin
        create table #t1 (a int, b int)

        declare @i int
        set @i = 0
        while (@i < 20)
        begin
           insert into #t1 values (@i, 2*@i - 50)

           select a
           from #t1
           where a < 10 or ((b > 20 or a >= 100) and (a < 10000))
           group by a

           set @i = @i + 1
        end
    end

     

    Now here is the interesting part... I cannot get it to recompile. I have tried on SQL 2005 RTM, sp2 and sp2 + 3054. The initial run shows up as a recompile in a trace but subsequent runs do not. Not at @i = 100, 500 or 1000. At precisely @i = 1108, recompilation happens every time.

    alter procedure RowCountDemo
    as
    begin
        create table #t1 (a int, b int)

        declare @i int
        set @i = 0
        while (@i < 1108)
        begin
           insert into #t1 values (@i, 2*@i - 50)

           select a
           from #t1
           where a < 10 or ((b > 20 or a >= 100) and (a < 10000))
           group by a

           set @i = @i + 1
        end
    end
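    As a cross-check on the trace, the server-wide recompilation counter can be sampled before and after running the proc. A rough sketch (the counter is cumulative, so compare the two samples):

    -- Cumulative count of statement recompilations since the instance started.
    -- Sample it, exec RowCountDemo, then sample it again and diff the values.
    select counter_name, cntr_value
    from sys.dm_os_performance_counters
    where counter_name like 'SQL Re-Compilations%';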

    Now there may be something totally flawed in my understanding. I am sure you guys will point that out if it is the case :) But the white paper states:

    "Recall that the recompilation threshold for a temporary table is 6 when the table is empty when the threshold is calculated. When RowCountDemo is executed, a "statistics changed"-related recompilation can be observed after #t1 contains exactly 6 rows. By changing the upper bound of the "while" loop, more recompilations can be observed."

    If temp tables really do not cause recompilations at 6 rows, I really cannot think of a good reason to use table variables except for small sets and then only out of preference.


     * cross posted from http://statisticsio.com *

  • SQL Server 2008 RTM delayed until Q3

    "

    Microsoft is excited to deliver a feature complete CTP during the Heroes Happen Here launch wave and a release candidate (RC) in Q2 calendar year 2008, with final Release to manufacturing (RTM) of SQL Server 2008 expected in Q3. Our goal is to deliver the highest quality product possible and we simply want to use the time to meet the high bar that you, our customers, expect.

    "

    Read more

     *cross posted from http://statisticsio.com *

     

     

  • The problem with local variables

    Have you ever been writing a query and just could not get it to use the right index? This could be one of the reasons why. Let's use this query with local variables as our example.

    declare @Start datetime
    declare @End datetime
    select @Start = '2004-08-01 00:00:00.000'
    select @End = '2004-07-28 00:00:00.000'
    select ProductID from sales.SalesOrderDetail
    where ModifiedDate >= @End and ModifiedDate <= @Start
     

    [execution plan screenshot: clustered index scan]

    But we have an index on ModifiedDate. There are many reasons why SQL would not use this index but, for this post, we will assume we have eliminated them. Finally, we hard-code the dates and we get this plan.

    [execution plan screenshot: index seek with bookmark lookup]

    So why is it doing this? The reason is that the query optimizer cannot accurately use the statistics to estimate how many rows are returned when local variables are used. Let's look at how we can tell there is a problem with the cardinality estimates. In the query with the local variables, the optimizer thinks we are getting 10918.5 rows, so we do the index scan. In the query with hard-coded literals, the estimated rows and actual rows are the same and accurate.

    [screenshot: estimated vs. actual rows, showing the cardinality misestimate]
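    If you want to reproduce the estimated-versus-actual comparison yourself, one option (my addition, not something the original post shows) is SET STATISTICS PROFILE, which returns the actual Rows next to the optimizer's EstimateRows for each operator:

    set statistics profile on
    go
    declare @Start datetime
    declare @End datetime
    select @Start = '2004-08-01 00:00:00.000'
    select @End = '2004-07-28 00:00:00.000'
    -- Same query as above; compare the Rows and EstimateRows columns in the output.
    select ProductID from sales.SalesOrderDetail
    where ModifiedDate >= @End and ModifiedDate <= @Start
    go
    set statistics profile off
    go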

    How can we fix this? There are several ways. This is another situation that makes a case for stored procedures or parameterized queries.

    create proc pDemo01
    @Start datetime,
    @End datetime
    as
    select ProductID from sales.SalesOrderDetail
    where ModifiedDate >= @End and ModifiedDate <= @Start


    exec pDemo01 '2004-08-01 00:00:00.000', '2004-07-28 00:00:00.000'

    The stored proc generates the proper plan. However, you will run into the same problem if you modify the parameter within the stored proc, like select @start = @start - 90. In that case, you should use sp_executesql (a sketch follows the index example below). What if you cannot use a stored proc because it is a 3rd party app or some other reason? A covering index is probably the solution. Once we create this index, it will always be used:

    create index ix01 on sales.SalesOrderDetail(ModifiedDate) include (ProductID)
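    Going back to the sp_executesql option mentioned above, here is a minimal sketch (my own illustration, not from the original post) that keeps the dates as real parameters, so the optimizer can still sniff the values even after they are adjusted:

    declare @Start datetime
    declare @End datetime
    select @Start = '2004-08-01 00:00:00.000'
    select @End = '2004-07-28 00:00:00.000'
    -- Adjust the value here if needed; the parameterized query still sees a
    -- sniffable value at execution time, unlike a modified proc parameter.
    select @End = dateadd(day, -90, @End)

    exec sp_executesql
        N'select ProductID from sales.SalesOrderDetail
          where ModifiedDate >= @End and ModifiedDate <= @Start',
        N'@Start datetime, @End datetime',
        @Start = @Start, @End = @End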

    We could use a plan guide or an index hint with FORCESEEK (SQL 2008), but performance will be really bad when we really do need to get 10k rows. The same problem can happen with stored procs, but that is another post.

    To get deeper into this subject, check out this.

     *Cross posted from http://statisticsio.com *
     

     

     
     
     
  • SQL Server 2008 Plan Guides from Cache

    Uh oh... I can see some junior developers going crazy with this. One of the things that kept plan guides from being overused was the fact that they are kind of hard :) Well, Microsoft built their empire making hard stuff easy. They do it again with sp_create_plan_guide_from_cache.

    Let's look at this BOL sample.

     

    USE AdventureWorks;
    GO
    SELECT WorkOrderID, p.Name, OrderQty, DueDate
    FROM Production.WorkOrder AS w
    JOIN Production.Product AS p ON w.ProductID = p.ProductID
    WHERE p.ProductSubcategoryID > 4
    ORDER BY p.Name, DueDate;
    GO
    -- Inspect the query plan by using dynamic management views.
    SELECT * FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(sql_handle)
    CROSS APPLY sys.dm_exec_text_query_plan(qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset) AS qp
    WHERE text LIKE N'SELECT WorkOrderID, p.Name, OrderQty, DueDate%';
    GO
    -- Create a plan guide for the query by specifying the query plan in the plan cache.
    DECLARE @plan_handle varbinary(64);
    DECLARE @offset int;
    SELECT @plan_handle = plan_handle, @offset = qs.statement_start_offset
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS st
    CROSS APPLY sys.dm_exec_text_query_plan(qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset) AS qp
    WHERE text LIKE N'SELECT WorkOrderID, p.Name, OrderQty, DueDate%';

    EXECUTE sp_create_plan_guide_from_cache
        @name = N'Guide1',
        @plan_handle = @plan_handle,
        @statement_start_offset = @offset;
    GO
    -- Verify that the plan guide is created.
    SELECT * FROM sys.plan_guides
    WHERE scope_batch LIKE N'SELECT WorkOrderID, p.Name, OrderQty, DueDate%';
    GO

    --Let's verify it actually worked.
    --Click the xml link
    --Save as a .sqlplan, reopen in SSMS and then hit F4
    set statistics xml on
    go
    SELECT WorkOrderID, p.Name, OrderQty, DueDate
    FROM Production.WorkOrder AS w
    JOIN Production.Product AS p ON w.ProductID = p.ProductID
    WHERE p.ProductSubcategoryID > 4
    ORDER BY p.Name, DueDate;
    GO

    So when would you use this? Hopefully hardly ever. However, it could solve some really tough problems.

    Let's say you have a 3rd party application that generates adhoc dynamic SQL. You cannot modify the code or schema. Index changes are not supported. Sometimes parameter sniffing causes unpredictable performance. Sound like a nightmare? Welcome to most CRM apps.

    Other scenarios that come to mind are when best practices are not or cannot be followed. Let's say you just cannot update stats often enough with a large enough sample on a very very VERY large table to get a consistently optimal plan. Use a plan guide!

    Here are some other times that the optimizer might have trouble and a plan guide may be a good option.

    • Use of local variables
    • Modifying stored proc parameters.
    • Ascending keys
    • Complex queries with table variables

    There are usually better solutions than plan guides, so save them for times when best practices are not an option. sp_create_plan_guide_from_cache makes using plan guides so much easier. Put it in your toolbox!
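    And if a guide created from cache starts doing more harm than good, it can be disabled or dropped with sp_control_plan_guide. A quick sketch using the Guide1 name from the BOL example above:

    -- Disable the guide but keep its definition in sys.plan_guides.
    EXECUTE sp_control_plan_guide N'DISABLE', N'Guide1';

    -- Or remove it entirely.
    EXECUTE sp_control_plan_guide N'DROP', N'Guide1';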

     

    **Cross Posted from http://statisticsio.com **
