March 2008 - Posts

In a recent post SQL Server 2008 - HierarchyId whats the point? I talk about the scenarios that the new hieararchyId can be applied to. Here we expand on this further and also provide a solution for moving hierarchies around.

The limitation is that you can't do something like this to change the parent of a node.

declare @newParent = (select node from Org where OrgId = 1234)

update Org

   set node.parent = @newparent

 where OrdId = 4567

Similarly if you want to move all the children nodes of one node to another node you can't in a simple way.

The reason is that the hierarchyId is a path enumeration model and if node 457 has children the children have the path to node 4567 embedded in them, i.e. they reference the 4567 node by path.

Look at this situation with nodes "/", "/1/" and "/1/1/", "/1/1/1/, "/2/" and "/2/1/", "/2/1/1", "/2/2/" and "/2/2/1" represent the root and two sets of a grand parent, parent and child nodes. See diagram.

We want to move the sub tree of  "/1/1/" to the new parent "/2/", by subtree I mean all the nodes below and including "/1/1/". Those with bold outlines below

If we change the parent of "/1/1/" to "/2/" using the reparent function then we will have a conflict in our hierarchy. As there will now be two nodes with the same path.

Whats more if we don't change the path of the nodes below "/1/1/"  then these nodes will reference an invalid path (an ancestor will be missing).

So we have to reparent all the nodes in the subtree. Unfortunately that makes the issue of nodes with the same path even worse. (See nodes with a red outline)

 

So what we have to do is find the last child of our new parent, in this example "/2/2" in order that we can move the sub tree to a new path that doesn't conflict with an existing one, "/2/3/"

As we are maintaining the subtree below the node we are moving we have to reparent the child nodes using this new parent record.

Nicely the reparent function on the hierarchyId type if you specify a old parent node that is itself it will just return the new parent i.e. just changing its own path. (Note the reparent is not a mutator so you have to assign the result of the reparent function). Reparent unlike its name suggests doesn't do anything but return a node with a path based on the inputs.

So thats the logic so how do you go about doing it in code. The following includes a few more nodes to show moving multiple trees in one go.To setup the tables and data use the script at the end.

Firstly lets find the maximum for the child we want to move 

declare @NewParent hierarchyId = '/3/'

declare @OldParent hierarchyId = '/1/'

 

--Find the max child node that belongs to the new parent node. The nodes being moved will be

--appended after those that already belong to the new parent

declare @Maxchild hierarchyId = (select max(node)

                                   from HTest

                                  where @NewParent.IsDescendant(node) = 1

                                    and node.GetLevel() = @NewParent.GetLevel() + 1)

You could use GetAncestor which is equivalent to the code above, I think the above is more readble.

Unfortunately there is no method to return the child index of a node, the reason we need this is that we are going to build the path directly and so need the child index so we can add values to it.

declare @ChildCount int = (select substring(left(@MaxChild.ToString(),LEN(@MaxChild.ToString()) - charindex('/',reverse(@MaxChild.ToString()),2) ),2,10))

So now the crunch, what we need to do is find all the nodes we are going to move. We will be moving two types of node. Top level nodes and child nodes, it is the top level nodes we that need to have there position calculated. The position of the children don't change. If in the example above the child "/1/1/1/" is still the 1st child of its parent the only difference is its parent is now "/2/3/".

We therefore need to calculate the position for these parents. This can be used by using a window aggregate. The dense_rank() allows us to return the same position for the parent as well as the children. To do this for the children we use the GetAncestor and GetLevel methods to find the parent node that is moving (it actually works for the parent as well so we can use the same code in the order by). This dense_rank returns an incrementing value starting at 1. We can then add that to the max child position calculated above. In conjunction with the new parent path we have the new path for the top nodes we are moving.

We can then use Reparent method to return the new paths of the nodes.

The following is a SELECT statement that returns the reparented nodes so you can see what is going on.

--Query to calculate the new path by using the reparent function along with a calculation to work out the correct position in

--the children of the new parent.

select  node.Reparent(oldNodeParent, @NewParent.ToString() + cast(@ChildCount + NewChildPos as varchar(10)) + '/').ToString(), *

from (

select node

     , dense_rank() over (order by node.GetAncestor(node.GetLevel() - @OldParent.GetLevel())) NewChildPos

     , node.GetAncestor(node.GetLevel() - @OldParent.GetLevel()).ToString() oldNodeParent

     , value, node.ToString() nodeText, node.GetAncestor(1).ToString() Parent -- these are for debugging aren't need to calculate the new path.

from HTest

where value like '1.1.%'

or value like '1.2.%') d

This shows how the above can be used in an update statement.

update HTest

  set node = d.node.Reparent(oldNodeParent, @NewParent.ToString() + cast(@ChildCount + NewChildPos as varchar(10)) + '/').ToString()

from (

select node

     , dense_rank() over (order by node.GetAncestor(node.GetLevel() - @OldParent.GetLevel())) NewChildPos

     , node.GetAncestor(node.GetLevel() - @OldParent.GetLevel()).ToString() oldNodeParent

from HTest

--This is the criteria for selecting the nodes to move.

where value like '1.1.%'

or value like '1.2.%'

) d

where d.node = HTest.node

To setup the tables and data use the following script

set nocount on

go

use test

go

drop table Htest

go

create table HTest (node hierarchyId, value varchar(10))

go

create unique index IX_HTest_node on HTest(Node)

go

declare @h1     hierarchyId = HierarchyID::GetRoot()

declare @h1_1   hierarchyId = @h1.GetDescendant(null,null)

declare @h1_2   hierarchyId = @h1.GetDescendant(@h1_1,null)

declare @h1_3   hierarchyId = @h1.GetDescendant(@h1_2,null)

declare @h1_4   hierarchyId = @h1.GetDescendant(@h1_3,null)

declare @h1_5   hierarchyId = @h1.GetDescendant(@h1_4,null)

declare @h1_1_1 hierarchyId = @h1_1.GetDescendant(null,null)

declare @h1_1_2 hierarchyId = @h1_1.GetDescendant(@h1_1_1,null)

declare @h1_1_2_1 hierarchyId = @h1_1_2.GetDescendant(null,null)

declare @h1_1_3 hierarchyId = @h1_1.GetDescendant(@h1_1_2,null)

declare @h1_2_1 hierarchyId = @h1_2.GetDescendant(null,null)

declare @h1_2_2 hierarchyId = @h1_2.GetDescendant(@h1_2_1,null)

declare @h1_2_3 hierarchyId = @h1_2.GetDescendant(@h1_2_2,null)

declare @h1_3_1 hierarchyId = @h1_3.GetDescendant(null,null)

declare @h1_3_2 hierarchyId = @h1_3.GetDescendant(@h1_3_1,null)

declare @h1_3_3 hierarchyId = @h1_3.GetDescendant(@h1_3_2,null)

 

insert into HTest values(@h1    ,'1')

insert into HTest values(@h1_1  ,'1.1')

insert into HTest values(@h1_2  ,'1.2')

insert into HTest values(@h1_3  ,'1.3')

insert into HTest values(@h1_4  ,'1.4')

insert into HTest values(@h1_5  ,'1.5')

insert into HTest values(@h1_1_1,'1.1.1')

insert into HTest values(@h1_1_2,'1.1.2')

insert into HTest values(@h1_1_2_1,'1.1.2.1')

insert into HTest values(@h1_1_3,'1.1.3')

insert into HTest values(@h1_2_1,'1.2.1')

insert into HTest values(@h1_2_2,'1.2.2')

insert into HTest values(@h1_2_3,'1.2.3')

insert into HTest values(@h1_3_1,'1.3.1')

insert into HTest values(@h1_3_2,'1.3.2')

insert into HTest values(@h1_3_3,'1.3.3')

 

go

select *, node.ToString() from HTest

 

 

 

 



-

I've recently been deliverying training sessions on SQL Server 2008 and doing launch events. The training covers all aspects of SQL Server development.

For most features I can provide a strong argument for the feature with the exception of one. HierarchyId.

HierarchyId is the new data type that allows you to store a hierarchy in a single column. This is using a path enumeration model, just like the path to a file on the file system.

The benefit of HierarchyId is that the storage structure is very effiecient it uses a thing called Ord Path.

With this efficient data type querying for the children of a parent performs very well.

However the one thing that just baffles me is the amendment of hierarchies and moving of nodes in a hierarchy. It baffles me because this I foresee to be one of the major requirements for a hierarchical data, yet the data type doesn't really support it.

Ok so you will see a method Reparent(). However this doesn't do anything to cater for existing data in your hierarchy. Essentially all the methods for the data type know about  only the instance of the type (i.e. column value or variable) that the function i being used with. It doesn't do anything to check the other rows in the table.

You will find that if you want to add a node to a parent. You have to find the last child of that parent and then add the new node to the parent but after this last child. This isn't difficult but cubersome code that you will have to repeat again and again. A bit like the standard code you need for a Service Broker activation procedure.

Whats more none of the functions are set based and so if you want to add two children you have to add one at a time because to add the second you need to add it after the first.

So all in all I'm a bit disappointed by the HierarchyId type I do hope they make it more functional in future releases.

If youre interested in set based Hierarchy Id solutions I have a few but they aren;t pretty. I will blog about them over the next few days.

 

  



-
Posted by simonsabin | 1 comment(s)
Filed under: ,

We are venturing into afternoon usergroup meetings for thos of you that can't make it in the evenings. We are fortunate to kick off the first afternoon meeting with Rob Farley on the 7th April. Fortunate as he lives in Australia and is only in the UK for a short period of time.

Rob Farley, a mentor/trainer with Solid Quality Learning, has been consulting in IT since completing a Computer Science degree with first class honours in 1997. He lives in Adelaide, where he heads up the Adelaide SQL Server User Group and sits on the South Australian Branch Executive Committee of the Australian Computer Society. He holds several Microsoft certifications including MCSD, MCDBA, MCTS, MCPD and MCITP. He is a Microsoft Certified Trainer and is a recipient of the Microsoft MVP Award for SQL Server. He blogs at http://msmvps.com/blogs/robfarley

For more details of what Robs going to be covering go to http://sqlserverfaq.com/?eid=113 



-
Posted by simonsabin | 1 comment(s)
Filed under:

James Rowland Jones is kindly organising the next Usergroup meeting in the London on the 17th April.

For the event details go to http://sqlserverfaq.com/?eid=110



-
Posted by simonsabin | with no comments

I'm currently on the "I like LINQ to SQL" side of the fence. I think it is very much a marmite question, you love it our you hate it. But importantly you often hate it even if you haven't tried it.

I've been digging around with LINQ recently and whilst it is very nice and easy to do things. This, as with many situations, can cause problems if you don't know what is being done under the hood.

Mike Taulty is font of knowledge when it comes to LINQ and Entity Framework and he recently posted about the use of stored procedures in LINQ. (LINQ to SQL, Stored Procs, Output Params, Readers Reading)

One thing you find out is that LINQ may not execute a query on the line of code you think it will, i.e.

var query = ctx.GetCustomersByCountry("UK");

Will not execute the procedure associated with the GetCustomersByCountry method. It will only get executed when you do something with query.

Whats more when the procedure returns a resultset and you then iterate over that resultset the results are read from the server one at a time. This means that if you do complex processing in your loop you can tie up resources and a connection to your database.

For more details look at Mike's blog post (LINQ to SQL, Stored Procs, Output Params, Readers Reading)

Whilst this isn't as bad as serverside cursors in the old ADO days it is still a gotcha you have to be careful of.



-
Posted by simonsabin | 4 comment(s)
Filed under: ,

I recently presented on using CLR in SQL at DevWeek and commented on how TSQL Udfs just don't perform. A common scenario people in the audience were using TSQL Udfs for was splitting delimited strings.

One guy asked me afterwards what the best way to do it was, he had addresses in one field he wanted to split into x columns. On the spot I cam up with the following with uses a number of well know tehniques.

This uses the nuber table solution to split the delimited string into rows and then uses pivot to along with row_number to position the string tokens into columns. It also uses a cross join to have the process work for each row in the table.

Note: this is for a num table starting at 0.

declare @m table (id int, d varchar(100))

insert into @m values (1,'simon,sabin,sql')

insert into @m values (2,'steve,Jones,data')

 

 

select p.*

from (

select id

     , substring(d, start+2, endPos-Start-2) token

     , row_number() over(partition by id

                         order by start) n

      from (

select id

     , d

     , n start

     , charindex(',',d,n+2) endPos

               from num

        cross join (select id

                         , ',' + d +',' d

                      from @m) m

              where n < len(d)-1

                and substring(d,n+1,1) = ',')d

      ) pvt

Pivot ( max(token)for n in ([1],[2],[3],[4]))p

The great thing here is it works for multiple rows in a table and so you can replace @m with your table of choice.

You have fixed the number of columns so if you want this to be dynamic then its more complex..

I was going to wait on posting this but thought it would follow on well from Madhivanan's post http://sqlblogcasts.com/blogs/madhivanan/archive/2008/03/15/splitting-csv-to-columns.aspx



-

Its all about Sparse columns and they are a great feature of Katmai

Imagine you have a products table, you will have attributes for some products that don't apply to others i.e. cup size applies to bras but not to socks.

There are a number of ways of modelling this each with downsides. With sparse columns you can have a column per attribute per product. The benefit is that you are following proper modelling by not mixing meaning of a column and you have strongly type data. The other aspect is that you save space because sparse columns are stored in a similar fashion to variable length fields. Thus if the value is NULL you will save space especially on fixed type columns.

I can imagine the select and insert limits haven't been increased because you also have the ability of having a column set which combines all the sparse columns into an XML document containing them. You are abe to select this (if you do a select * and you have this you will only get this not the individual columns) but you are also able to update and insert through it.

Aaron recently posted on the increase in the column limits, 30,000 sparse columns are now possible. (http://sqlblog.com/blogs/aaron_bertrand/archive/2008/03/11/increased-limits-for-columns-indexes-statistics.aspx) He also asks why. Well this is it.



-

One of the frequently asked questions in the newsgroups is how do you convert a string representation of a binary value to the equivalent binary value and the reverse.

The latter is often required to be able to print out the binary representation.

There have been a few undocumented functions and many home brew solutions.

In SQL Server 2008 CONVERT can now do it for you. Thanks to Itzik for pointing this out to me last night.

So you can now do the following

select convert(varchar(8), 0x023454, 1)

select convert(binary(3), '0x023454', 1)

Which is great.



-
Posted by simonsabin | with no comments

Given my previous post on how the SQL Server tools suck its only fair for me to point out what has been added/improved in SQL Server 2008

Thanks to Buck for a nice concise list,

Title

Component

Advantage

What's the big deal?

Connection String Additions

Connection Panel

User Productivity

Users can enter additional connection strings, which enables full use of Analysis Services. Previously, there was no way to connect with special behavior to AS.

Data Collector

Framework

Increased management information

Allows automatic, declarative collection of trace data, performance counters or transact-sql queries into a single database collection system.

Installation Enhancements

Installation

Increased speed in install

Re-worked UI to enable faster installations. Also, SSMS Express is now a combined codeset with SSMS (although with less features), allowing a consistent management experience. SSMS can now also be shipped "out of band", meaning that releases of CE can be combined with SSMS so that you don't have to wait on a service pack to connect to CE.

Policy Based Management (formerly DMF)

Management Framework

Pro-active management

Policy-based framework for managing, controlling and monitoring a SQL Server instance.

Resource Governor

Management Framework

Greater control of system resources

Enables you to specify limits on the amount of CPU and memory that incoming application requests can use.

PowerShell integrated to work with SQL Server

PowerShell

Total and complete scripting control

PowerShell, which will ship with SQL Server, is now aware of SQL Server 2008 objects. You have the full scripting constructs of PowerShell against SQL Server Objects. Navigate the SQL Server instance using "CD" commands.

Create a Data Collector Set (for Data Collector) from a Profiler Trace

Profiler

Use the same tool to monitor as to control

You can use a Profiler trace definition to create a data collector set, which allows you to find expensive queries and control them.

Profiler Improvements

Profiler

Profiler launchable from SSMS.

Profiler can be launched directly from a Query Editor Results window to the SPID of the query.

Multi-Server Query

Query Editor

Run a query against multiple servers

Run a single query across multiple server. Results have the server name pre-pended to them. Results show in a color-customizable result set.

Intellisense

Query Editor

Auto-fill for keywords and objects, prevents mistakes, increases speed

Shows proper syntax and object names automatically.

Customizable tabs in Query Editor

Query Editor

Better navigation

You can set the tab text at the top of the results pane.

Launch Profiler from Query Editor

Query Editor

Quick access to activity information

You can launch Profiler directly from a Query Editor Results window to the SPID of the query.

Error List in Query Editor

Query Editor

Recall last errors in T-SQL

Shows a list of the last errors generated by Query Editor

New Right-Click options on Results Grid

Query Results Window

Easier Export of data

Users can now select a range of rows in the Results Grid and copy the headings as well. Also, you can select individual cells and copy them with headers if you wish.

Framework Improvements for SSMS

SQL Server Management Studio

Increased Performance

Multiple internal code cleanups and fixes to improve the load time of the Object Explorer tree to over 20 times it's current performance. Also takes much less memory per object to load.

Object Search

SQL Server Management Studio

Quickly locate and work on SQL Server objects

Using PBM (DMF) under the covers, the user enters a full or partial string to match, which is searched within the current database to bring objects into Object Explorer Details.

Speed Enhancements

SSMS

Increased performance

Multiple fixes and framework changes to enable faster loading, and faster population. Connection consolidations where possible to take a lighter load on the server.

Activity Monitor built for the DBA

SSMS

Quickly diagnose performance data with little overhead

The new Activity Monitor was written from the ground up with the perspective of the DBA needing to chase down a performance problem in real time. Modeled after the new Windows Resource Monitor, DBAs can quickly see the active sessions, wait states, file I/O, and long running queries in a command console like UI.

UI Framework Enhancements

SSMS

Better feedback and navigation

A new "info bar" appears in many screens that guide the user with pre-validation (saving time and frustration) and visual navigation cues.

New Object Explorer Details Pane

SSMS

Quick access to customizable information

The header columns for each object can be selected. Selections are saved. Single-click on object instantly returns property information in a sizable panel at the bottom of SSMS like Vista.

Security - Admin can reset password

SSMS

Easier to manage security

The Security Admin can now reset a password for a user from SSMS.

Security - More information on effective permissions

SSMS

Easier to determine permissions

Enhanced the "effective permissions" display to show exactly what a principal has access to.

Security - reduced clicks for permissions

SSMS

Faster implementation of security

Reworked the screens to make it easier and faster to set permissions

"Open Table" enhancements

SSMS

Takes a lighter load on the server

A limited amount of rows is returned to the "Open Table" dialog to avoid taking locks and load on server.

Database Diagram/Table Designer safety additions

SSMS

Prevents accidental operations

The Table Designer and Database Designer tools have been reworked so that safety checks are taken for operations that would drop an object or cause data loss.

Activity Monitor can launch Profiler

SSMS

Instant access to activity information

Activity Monitor can launch SQL Server Profiler for the SPID it is centered on.

Showplan - Extract Query from graphical plan

SSMS

Quickly show a query from a graphical display

You can right-click a graphical showplan and extract the SQL query that generated it.

Showplan - Extract XML from graphic

SSMS

Quickly access query breakdown

You can right-click a showplan graphic and see the XML version of the plan.

Showplan - XML is now formatted

SSMS

Improved visualization

The XML created by the showplan is now formatted instead of showing as a large string.

The sys.dm_exec_query_plan XML output can generate graphic

SSMS

You can go from the xml to the graphical plan in a single click

The output from the sys.dm_exec_query_plan stored procedure includes an XML column of the query plan. Clicking it now generates a graphical display of the query.

Performance Studio

SSMS

Complete performance tuning information with navigable reports

Tracks historical performance metrics from SQL Server and the operating system, stores them in a repository, using drill-through reports.

Configuration Servers

SSMS

Register servers at a central location

Enables teams to register servers in a single location, once.

Add GUI for Table Partitioning Setup

SSMS

Easier to set up table partitioning

You can now set up Table Partitioning (which previously required a lot of typing) with SSMS. Includes a Manage Partition wizard to implement the “sliding window” scenario for managing partitioned tables, where for every unit of time (could be month/week/quarter), you bring in a new partition for the latest data and slide out the oldest partition for archival purposes.

Service Broker UI enhancements

SSMS

Easier setup of Service Broker

Hooked in the creation/alter templates with the context menu in the object-explorer. We have also added read-only property pages for each of these object types.

Transact SQL Debugger integrated in SSMS

It’s about time!



-

When I moved from Oracle 7 to SQL Server 6.5 I was amazed at the tools you got with SQL Server. They made the product so much easier to manage and pickup. Things like enterprise manager, profiler and query plans where amazing.

That was almost 10 years ago, and whats changed. Well very little.

So the tools now support the new engine features, and they look different but in essence there is much that has improved since the 6.5 days.

There's no schema designer, there are no nice visualisers of database performance, no can't drag a table from one database to another, or change the data type of a column and have it propogate through the schema.

Ok so these aren't simple things to solve, but thats not my problem, I want tools that make my life easier. Management studio is just a cataloging system. It doesn't really help me do my job.

I think Data Dude has gone some way, the error checking (compile like) feature is great as is the way things are structured.

Whats more many will say that there are lots of little bugs/features that are sitting in management studio that just aren't getting fixed.

So what can be done. Well ove the past months I've seen a huge drive my the manageability team to improve the lot of Management Studio. However what I find is that its all tied to the major release of Katmai and so you have to wait for 3 years before you get something. In fact its going to be more like 5 years, i.e. you have the idea now, it won't get into Katmai so will be in katmai +1 which will be in 4-5 years time.

4-5 years time, my kids will be in school by then, I will have probably changed jobs, probably now working on something different and won't care.

What I want is to be able to benefit from the drive of the tools guys and see the benefits realised sooner. This is only going to happen if the tools are not tied to the release of the engine. One argument is to include new features in service packs. Well anyone that manages many servers doesn't want new features they have to test they just want the minimal changes for bugs.

So if you support my idea of splitting out the release of tools from the release of the engine please vote.

https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=331608

 



-
Posted by simonsabin | 8 comment(s)