SQL Server Blogs

Voices from the UK SQL Server Community

My two cents

Tips, opinions, tests (and misspellings) from a Brazilian DBA in UK
  • How atomic is a SELECT INTO?

    Last week I ran into an interesting situation that prompted me to challenge a long-standing assumption. I had always thought that a SELECT INTO was an atomic statement, i.e. it would either complete successfully or the table would not be created.

    So I was very surprised when, after a “select into” query was chosen as a deadlock victim, the next execution (as the app would handle the deadlock and retry) failed with:

    Msg 2714, Level 16, State 6, Line 1
    There is already an object named '#test' in the database.

    The only hypothesis we could come up with was that the “create table” part of the statement was committed independently from the actual “insert”. We can confirm that by capturing the “Transaction Log” event in Profiler (filtered to our SPID). The result is that when we run:

    SELECT * INTO #results FROM master.sys.objects

    we get the following output on Profiler:

    [Profiler screenshot: the Transaction Log events for the SELECT INTO, showing two separate transactions]

    It is easy to see the two independent transactions. Although this behaviour was a surprise to me, it is very easy to work around if you feel the need (as we did in this case). You can either split it into independent “CREATE TABLE / INSERT SELECT” statements (see the sketch below) or you can enclose the SELECT INTO in an explicit transaction:

    SET XACT_ABORT ON
    BEGIN TRANSACTION
    SELECT * INTO #results FROM master.sys.objects
    COMMIT

    [Profiler screenshot: the Transaction Log events for the SELECT INTO wrapped in an explicit transaction]
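
    For completeness, here is a minimal sketch of the other workaround, splitting the statement into an explicit “CREATE TABLE / INSERT SELECT” (the column list is just an illustrative subset of sys.objects, not the full table):

    -- With separate statements, a retry after a deadlock can simply drop (or reuse) the leftover table
    IF OBJECT_ID('tempdb..#results') IS NOT NULL
        DROP TABLE #results

    CREATE TABLE #results
    (
        name      sysname,
        object_id int,
        type_desc nvarchar(60)
    )

    INSERT INTO #results (name, object_id, type_desc)
    SELECT name, object_id, type_desc
    FROM master.sys.objects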

  • Coming back from (blog) retirement

    So, it has been more than 3 years without a single blog post. I wish I had a decent excuse for it, but in the end, I guess it boils down to laziness and procrastination. :-)

    Even though I learned a lot in that period (and added a few tricks to my bag), I couldn’t find the will to sit down and write. I hope all my readers (yes mom and dad, I’m talking to you) have not been disappointed.

    I will try really hard not to let routine take over. I don’t expect to be the most active blogger in the community, but hopefully a couple of posts per month is a good target to aim for.

    Posted May 22 2012, 10:30 PM by leo.pasta
  • Getting all the Clustered keys in the database

    Jeremiah Peschka (http://twitter.com/peschkaj) asked on Twitter if anyone had a script to list all the clustered keys in the database. This is my take on it; I hope it helps someone:

    SELECT i.name + '(' +
           STUFF((SELECT ',' + c.name + CASE WHEN is_descending_key = 0 THEN ' ASC '
                                             ELSE ' DESC ' END
                  FROM sys.index_columns ic
                       JOIN sys.columns c ON ic.OBJECT_ID = c.OBJECT_ID
                                         AND ic.column_id = c.column_id
                  WHERE i.OBJECT_ID = ic.OBJECT_ID
                    AND i.index_id = ic.index_id
                    AND key_ordinal > 0
                  ORDER BY key_ordinal
                  FOR XML PATH('') -- This is a trick that generates comma-separated lists (but beginning with a spurious comma)
                 ), 1, 1, '') + ')' +
           -- Included columns, if any
           ISNULL(STUFF((SELECT ',' + c.name
                         FROM sys.index_columns ic
                              JOIN sys.columns c ON ic.OBJECT_ID = c.OBJECT_ID
                                                AND ic.column_id = c.column_id
                         WHERE i.OBJECT_ID = ic.OBJECT_ID
                           AND i.index_id = ic.index_id
                           AND is_included_column = 1
                         ORDER BY key_ordinal
                         FOR XML PATH('')
                        ), 1, 1, ' INCLUDE (') + ')', '') +
           ' ON [' +
           (SELECT CASE
                     WHEN i.data_space_id < 65000 THEN -- regular filegroup
                          (SELECT d.name
                           FROM sys.data_spaces d
                           WHERE i.data_space_id = d.data_space_id)
                     ELSE -- partition scheme: report its first destination filegroup
                          (SELECT fg.name
                           FROM sys.data_spaces d
                                JOIN sys.destination_data_spaces dd ON d.data_space_id = dd.partition_scheme_id
                                                                   AND dd.destination_id = 1
                                JOIN sys.data_spaces fg ON dd.data_space_id = fg.data_space_id
                           WHERE i.data_space_id = d.data_space_id)
                   END
           ) + ']'
    FROM sys.indexes i
    WHERE i.type_desc = 'CLUSTERED'
      AND OBJECTPROPERTY(i.OBJECT_ID, 'isUserTable') = 1
  • Reducing Transaction Log fragmentation

    I am taking Kimberly Tripp and Paul Randal's course on SQL Server Performance & Availability this week in Hatfield and we had an interesting discussion on Transaction Log internal fragmentation. Kimberly summarized the issue, together with other Transaction Log best practices, here (check item 8). Based on that discussion, and as I had been bitten by this issue before, I thought it might be interesting to write a procedure to help verify and fix it.

    The procedure is called sp_DBA_DefragLog and will report your current fragmentation and fix it, recreating your transaction log in reasonably sized increments (as creating huge transaction logs in one go can have its issues as well). The procedure has the following syntax:

    sp_DBA_DefragLog { @Action=['Report'|'Simulate'|'Fix'] } {, @TargetMb = value}

    Where @Action='Simulate' prints the commands the 'Fix' would execute. @TargetMB specifies the desired size for the Transaction Log at the end of the process. If none is specified, the current size is kept.

    It is important to note that, for simplicity, I decided to leave to the admin the task of clearing the transaction log if the database is in FULL or BULK-LOGGED recovery. In that case you should probably run your periodic log backup before running the procedure with the 'Fix' option.
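
    To illustrate the intended workflow (the database name and backup path below are just placeholders, and 4096 MB is an arbitrary example target):

    -- One row per VLF; a very large number of rows indicates a fragmented log
    DBCC LOGINFO

    -- Report current fragmentation only, no changes
    EXEC sp_DBA_DefragLog @Action = 'Report'

    -- In FULL or BULK-LOGGED recovery, clear the log first with your normal log backup
    BACKUP LOG MyDb TO DISK = 'D:\Backups\MyDb_log.trn'

    -- Preview the commands, then apply them, targeting a 4 GB log
    EXEC sp_DBA_DefragLog @Action = 'Simulate', @TargetMb = 4096
    EXEC sp_DBA_DefragLog @Action = 'Fix', @TargetMb = 4096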

    I  don't have instances of SQL2005 and SQL2000 to test right now, but I expect it to run on 2005 (some adjustments might be necessary). SQL 2000 will need some more work, which I plan to do when I'm back at work, as I don't have SQL2000 on my notebook anymore.

    BTW, if you can, I definitely recommend taking Kimberly and Paul's course. There were a lot of very interesting discussions that, if I were less lazy, would have resulted in at least half a dozen blog posts.

  • Which queries are missing indexes?

    One of the things I really enjoy when doing performance tuning on 2005 (I still work on a mix of several SQL 2000 and some SQL 2005 instances) is the sys.dm_db_missing_index* DMVs. As the query processor evaluates queries, it detects whether a specific query could benefit from an index and how much it expects that index would reduce the cost (in terms of IO), exposing this information as views that we can query. I won't delve into their structure, but you can use my procedure sp_dba_missingindex as an example (based on Bart Duncan's work). It will return the suggested CREATE INDEX statements along with the expected improvement information, like this:

    USE AdventureWorks
    exec sp_DBA_MissingIndexes

    GO

    create_index_statement
    improvement_measure                     unique_compiles      user_seeks           user_scans           avg_user_impact        avg_system_impact      last_user_seek          last_user_scan


    CREATE INDEX missing_index_4_3 ON [AdventureWorks].[Sales].[SalesOrderDetail] ([LineTotal]) INCLUDE ([SalesOrderID], [SalesOrderDetailID], [OrderQty])
    564.7                                   1                    5                    0                    99.66                  0                      2008-05-03 21:55:21.573 NULL
    CREATE INDEX missing_index_2_1 ON [AdventureWorks].[Sales].[SalesOrderHeader]([PurchaseOrderNumber])
    283.1                                   1                    5                    0                    99.32                  0                      2008-05-03 21:55:21.557 NULL
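
    For reference, here is a minimal sketch of a direct query against those DMVs (this is not the actual sp_DBA_MissingIndexes code, just the kind of join it is built on):

    -- Suggested indexes with their usage statistics, roughly ordered by expected benefit
    SELECT d.statement AS table_name,
           d.equality_columns,
           d.inequality_columns,
           d.included_columns,
           s.unique_compiles, s.user_seeks, s.user_scans,
           s.avg_user_impact, s.last_user_seek
    FROM sys.dm_db_missing_index_details d
         JOIN sys.dm_db_missing_index_groups g ON d.index_handle = g.index_handle
         JOIN sys.dm_db_missing_index_group_stats s ON g.index_group_handle = s.group_handle
    ORDER BY s.avg_user_impact * s.user_seeks DESC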

     

    Nevertheless, I don't like to put an index in my schema without knowing exactly which query it is supposed to improve. How would I know if it actually helped? How can I test it? Am I sure it won't make matters worse?

    So, inspired by Greg Linwood's procedure to find which queries are doing large index scans, I thought that I could tweak that idea to answer those questions. The trick is that the SQL Server query processor includes the missing index information in the showplan output as well, so I can parse the showplan XML for occurrences of missing index entries on the table I want. This way I can see the actual query and judge whether that is indeed the correct index or if adjustments are necessary (the code is available to download here):

    USE AdventureWorks
    EXEC sp_DBA_QueriesMissingIndexes 'SalesOrderHeader',1

    EXEC sp_DBA_QueriesMissingIndexes 'SalesOrderDetail',1
    GO

    Statement
    Stored Procedure
    query_plan
    execution_count      min_logical_reads    max_logical_reads    avg_logical_reads    min_logical_writes   max_logical_writes   avg_logical_writes   min_worker_time      max_worker_time      avg_worker_time      min_elapsed_time     max_elapsed_time     avg_elapsed_time


    )select CustomerID , SalesOrderNumber , SubTotal from [Sales] . [SalesOrderHeader] where ShipMethodID > @0 ......
    NULL
    <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.0".........
    1                    703                  703                  703                  0                    0                    0                    26669                26669                26669                176983               176983               176983

    (1 row(s) affected)

     

    Statement
    Stored Procedure
    query_plan
    execution_count      min_logical_reads    max_logical_reads    avg_logical_reads    min_logical_writes   max_logical_writes   avg_logical_writes   min_worker_time      max_worker_time      avg_worker_time      min_elapsed_time     max_elapsed_time     avg_elapsed_time


    )select SalesOrderId , SalesOrderDetailId , OrderQty from Sales . SalesOrderDetail where LineTotal = @0 .......
    NULL
    <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.0"........
    1                    1241                 1241                 1241                 0                    0                    0                    240383               240383               240383               573199               573199               573199

    (1 row(s) affected)

     

    The obvious limitation of that procedure is that it can only pinpoint queries that still have their plan stored in the SQL Server cache. I found that, especially for low-cost queries, SQL tends to throw the plan away as soon as it finishes executing it. Setting "PARAMETERIZATION FORCED" for the database helps to keep query plans in the cache for longer. In "real life" scenarios it seems to find the "villains" pretty often.
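
    As a minimal sketch of the underlying idea (this is not the actual sp_DBA_QueriesMissingIndexes code, and the table name is just the example used above):

    -- Search cached plans for missing-index entries that reference a given table
    ;WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
    SELECT st.text AS statement_text,
           qp.query_plan,
           qs.execution_count,
           qs.total_logical_reads / qs.execution_count AS avg_logical_reads
    FROM sys.dm_exec_query_stats qs
         CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
         CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
    WHERE qp.query_plan.exist('//MissingIndex[@Table = "[SalesOrderHeader]"]') = 1
    ORDER BY qs.total_logical_reads DESC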

    I am not finished with it yet, as it is a nice excuse to learn XQuery in SQL Server and there are some quirks in the logic to extract the statement text (the stray bracket at the beginning) that I want to fix. But I think it can be useful as it is, so I am making it public.

    I also wouldn't mind getting some feedback on them. :-)

  • SQLBits

    Yesterday I attended the SQL Server focused conference called SQLBits, which this year took place in Birmingham. First of all, many thanks to Tony Rogerson, Simon Sabin and all organisers. You did a hell of a good job!!!

    As a suggestion for the next ones, I would find it useful to have an indication of whether a session is intended for beginners or will be more in-depth. I got a couple of sessions "wrong" because I misjudged them from the session title and description.

    In this post and in the following ones I will post a summary of what I managed to grasp from the presentations I attended:

    1 - Administering SQL Server with PowerShell

    Richard Siddaway gave a quick but instructive overview of PowerShell with enough samples for us to get the "feel" of the language. He was also very honest and pointed out a lot of quirks and pitfalls that he (and the community) have experienced.

    He also showed us the version of PowerShell included in SQL Server 2008. The integration of SQL Server into the shell is well done; navigating through the object hierarchy using "DOS" commands like "cd" and "dir" is very cool :-). It also includes support in SQL Agent for PowerShell tasks.

    But I am not happy with the fact that the SQL Server team needed to make their own "SQL" PowerShell version. I mean, do we really need the hassle of testing our scripts and functions against several (potentially incompatible) variations of the shell? There are times when Microsoft REALLY gives me the impression that their dev teams don't talk to each other.

    I looked at PowerShell last year but it did not impress me at the time. I mean, it is more elegant than VBScript and I suppose it is a lot more powerful, but until it is more widely available I am not tempted to use it (the idea of having to convince security and admin guys to allow me to install it on every server does not entice me at all). But now that it is included in Windows 2008, the time for me to revisit it might be coming.

    The only thing that keeps bothering me is performance. It was painfully slow during the demo. I know Richard had lots of VMs and other stuff running at the same time, but it didn't look good. The fact that one of the recommended books shows a turtle on the cover did not cheer me up either. :-)


    2 - What are you waiting for?

    Christian Bolton's session was very interesting. I already use wait statistics as one of my tools to diagnose performance problems, and it has proved itself to be very effective time and time again.

    Nevertheless, as my bread and butter is still SQL 2000, I had not delved too much into the DMVs related to waits. Basically you will look at:

    sys.dm_os_waiting_tasks - exposes data about all tasks that are currently waiting for some resource. Much better than the SQL Server 2000 sysprocesses lastwaittype and waittime columns.

    sys.dm_os_wait_stats - the evolution of DBCC WAITSTATS. As its predecessor, it accumulates data since the last restart. You can clear it by running DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR), although I'd rather write procedures to calculate and show a delta (a minimal sketch of that idea is below).
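
    The delta approach boils down to snapshotting the DMV, waiting for the interval you care about, and diffing the counters. This sketch is my own illustration, not Christian's material:

    -- Snapshot the cumulative wait statistics
    IF OBJECT_ID('tempdb..#wait_snapshot') IS NOT NULL
        DROP TABLE #wait_snapshot
    SELECT wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
    INTO #wait_snapshot
    FROM sys.dm_os_wait_stats

    WAITFOR DELAY '00:01:00' -- measurement interval

    -- Show only what accumulated during the interval
    SELECT w.wait_type,
           w.waiting_tasks_count - s.waiting_tasks_count AS waiting_tasks_delta,
           w.wait_time_ms        - s.wait_time_ms        AS wait_time_ms_delta,
           w.signal_wait_time_ms - s.signal_wait_time_ms AS signal_wait_ms_delta
    FROM sys.dm_os_wait_stats w
         JOIN #wait_snapshot s ON w.wait_type = s.wait_type
    WHERE w.wait_time_ms - s.wait_time_ms > 0
    ORDER BY wait_time_ms_delta DESC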

    A couple of tools you might use to look at these statistics were also briefly presented. I already use the Performance Dashboard, though I have not installed DMVStats yet, and Christian inspired me to give it a go. Maybe this could turn into a new blog post later?

    He talked about SQLNexus but did not show it working. I confess I think it is a bit overkill and too complex for my taste. Now that we have unfortunately lost Ken Henderson, who was deeply involved in it, I decided to wait a bit and see how it develops before trying it again.

  • Data purity

    Reviewing the DBCC CHECKDB syntax for SQL Server 2005, I found a [DATA_PURITY] option added to it.

    This option enables the following checks on each column value of these datatypes:

    Unicode character - The data length should be a multiple of 2.

    Datetime - The days field should be between Jan 1 1753 and Dec 31 9999. The time field must be earlier than '11:59:59:999PM' .

    Real and Float - Check for existence of invalid floating point values like SNAN, QNAN, NINF, ND, PD, PINF.

    For a database created on SQL Server 2005, these checks are always performed, but upgraded databases won't run them until you explicitly request them with that option. If it reports no errors, all subsequent CHECKDBs on that database will perform those checks automatically.
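
    For an upgraded database, a single explicit run is enough (the database name below is just a placeholder):

    -- Run the column-value checks once on a database upgraded from an earlier version
    DBCC CHECKDB ('MyUpgradedDb') WITH DATA_PURITY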

    Add this in your migration checklist.

    KB923247 has all the details about it. The guys at mssqltips.com also have a good article about it.

  • Losing your Transaction log


    Some days ago, my boss came by my desk asking if I could take a look at a database that was not opening. That DB was in a virtual machine used for our sales presentations.

    For some reason, several files got corrupted in that VM, and among them was our beloved transaction log file.

    It was a perfect opportunity to sharpen my database recovery skills; it had been quite some time since I had played with those tools and, to be honest, I had forgotten almost everything.

    So, as I will probably forget about it soon (thankfully, SQL Server does not crash that often), and as it has been quite some time since I posted anything here, I think it will be good to blog about it.

    The error recorded in the log was:

    Error: 9004
    An error occurred while processing the log for database AnyDB.

    I love clear messages like that. They help a lot in identifying what caused the problem, right? There was no other meaningful error in the SQL Server errorlog nor in the Event Viewer.

    A quick look at Google confirmed my suspicion:

    Time to restore from our backups.

    To be honest, this is what I would do for any critical database, but where is the fun in that? :-)

    So I went for the neither recommended nor supported procedure in SQL Server 2000, which is:

    1 - Enable updates to system tables

    EXEC sp_configure 'allow updates',1

    RECONFIGURE WITH OVERRIDE

     

    2 - Save the current status for your database

    SELECT dbid, name, status

    FROM sysdatabases

    WHERE name = 'test'

     

    3 - Set it to “Emergency” mode (Status = 32768)

    UPDATE sysdatabases

    SET status = 32768

    WHERE name = 'test'

     

    4 - Recreate an empty log

    DBCC REBUILD_LOG (test,'<PATH>\test.ldf')

     

    5 - Check if your database is physically ok

    ALTER DATABASE test SET SINGLE_USER

    DBCC CHECKDB (test)

     

    6 - Change your DB status to something “neutral”

    UPDATE sysdatabases

    SET status = 28

    WHERE name = 'test'

     

    7 - Disable updates to system tables

    EXEC sp_configure 'allow updates',0

    RECONFIGURE WITH OVERRIDE

     

    8 - Restart SQL Server

     

    In SQL Server 2005, the procedure is much easier, and though obviously still not recommended, it appears to use only supported commands:

    ALTER DATABASE test SET SINGLE_USER, EMERGENCY

    DBCC CHECKDB (test, REPAIR_ALLOW_DATA_LOSS)

    ALTER DATABASE test SET MULTI_USER, ONLINE

    Nice, huh? Obviously, both procedures will leave your database in a not-so-good shape, because losing your transaction log means losing the ability to undo uncommitted work and redo committed work that had not yet been written to your data file. So you should no longer trust the integrity of your data.

    Even so, I believe there are times when this is worthwhile. If you don't have a good backup and/or you are dealing with development databases, this might be a useful last resort.

    For more details, see Paul Randal's blog (always worth reading).

  • The sound of silence

    This one isn't related to SQL Server. But do you know when sometimes you get stuck in an almost infinite loop of dialog boxes and for some reason you don't want to kill the application?

    "Sorry <beep> guys, this <beep> should <beep> end in <beep> any mome<beep>nt now <beep>."

    Well it happened to me these days and I decided that enough was enough and I would request the help of Saint Google!

    It turns out that, at least on XP and Vista, there is a nice registry key at [HKEY_CURRENT_USER\Control Panel\Sound] called, have a guess: "beep". Change it to "no" and log off / log on.

    No more awkward noises since then....

    Posted Sep 06 2007, 11:14 PM by leo.pasta
  • Page splits

    These days, while investigating a fairly simple insert which was taking longer than I would consider reasonable for that particular piece of code, I explored several possible causes. One thing that caught my attention was that the "Page Splits/sec" perfmon counter was constantly above 0, so I began to wonder which index was causing most of them. I ended up reading an interesting blog entry from Greg Linwood which inspired me to write a small stored procedure to monitor it over a longer period:

    USE master
    IF OBJECT_ID('dbo.sp_dba_CheckPageSplits') IS NOT NULL
      DROP PROC dbo.sp_dba_CheckPageSplits
    GO
    CREATE PROCEDURE dbo.sp_dba_CheckPageSplits (@MaxLSN varchar(255)= NULL OUTPUT)
    AS
    /*
    Author: Leonardo Pasta
    Date: 13/08/2007
    Summary: List the objects that are suffering most from PAGE SPLITS.
             This procedure is compatible with SQL 2000. To use it on SQL2005, replace "[Object Name], [Index Name]" with "[AllocUnitName]"
    */
      SET NOCOUNT ON
      IF @MaxLSN IS NULL
        SET @MaxLSN = ''

      RAISERROR('Verifying logs greater than %s',1,1,@MaxLSN) WITH NOWAIT
      SELECT TOP 15 [Object Name], [Index Name], Context, count([Current LSN]) [Splits]
      FROM ::fn_dblog(default,default)
      WHERE Operation = N'LOP_DELETE_SPLIT'
        AND [Current LSN] > @MaxLSN
      GROUP BY [Object Name], [Index Name], Context
      ORDER BY [Splits] DESC

      SELECT @MaxLSN=MAX([Current LSN])
      FROM ::fn_dblog(default,default)
    GO

    Then I ran something like the infinite loop below to get a quick & dirty report of the worst offending indexes:

    DECLARE @MaxLSN varchar(255)
    WHILE 1=1
    BEGIN
      EXEC dbo.sp_dba_CheckPageSplits @MaxLSN OUTPUT
      WAITFOR DELAY '00:00:30'
    END

    To be honest, I don't believe that my IO problems will be explained by page splits, but I think this will be a valuable tool for fine-tuning our index strategies in the long run.
