Workspace Memory / Query Memory Tuning – RESOURCE_SEMAPHORE / IO_COMPLETION / SLEEP_TASK Waits

 

SQL Server is configured to use a lot of memory, but my query is slow and not using all the memory available and it is spilling the sort or the hash match operation to tempdb, how can you tune the configuration and the query?

 

Memory allocating queries request memory based on estimation (ideal memory) and query memory (workspace memory) availability, when they don’t get the right amount of memory they spill to tempdb and lead to performance issues. Previous articles discussed ways to make the estimation better; this article discusses ways to address query memory availability.

 

You can find the ideal amount of memory a query needs (when the query is executing) using sys.dm_exec_query_memory_grants. Common memory allocating queries are that perform Sort and do Hash Match operations like Hash Join or Hash Aggregation or Hash Union.

 

select * from sys.dm_exec_query_memory_grants
go

 

Column ideal_memory_kb indicates the ideal amount of memory the query needs. This is based on estimation, this might be incorrect for various reasons including out of date statistics, in some cases (more common that you might think) under estimation of memory by the optimizer even when the statistics are up to date (refer to article http://jahaines.blogspot.com/2010/03/performance-tuning-101-what-you-will.html) and due to plan caching (refer to article Plan Caching and Query Memory). Some might recommend tweaking the index/column statistics in an undocumented way to inflate the estimation for additional memory, I suggest following the MSSQL Tip http://www.mssqltips.com/sqlservertip/1955 and webcast 1 and 2 at www.sqlworkshops.com/webcasts to understand the issue and the recommendations.

 

To read additional articles I wrote click here.

 

The ideal amount of memory a query needs is based on estimation. Requested memory is based on ideal memory and maximum available workspace memory. There is a possibility the estimation is correct (let’s say you have up to date statistics and/or fixed the under estimation issues by following the above articles) but the available workspace memory is not enough and hence ideal amount is low. The next question is: what is available workspace memory.

 

Available workspace memory, also known as query memory, is the amount of memory available for common memory allocating queries that perform Sort and do Hash Match operations. This is automatically calculated based on your system configuration. You can monitor the currently available workspace memory and maximum workspace memory by executing the command dbcc memorystatus and looking for ‘Available’ and ‘Current Max’ under ‘Query Memory Objects (default)’ and ‘Small Query Memory Objects (default)’. Or using Performance Monitor counters ‘Granted Workspace Memory (KB)’ and ‘Maximum Workspace Memory (KB)’ under object Memory Manager. ‘Maximum Workspace Memory (KB)’ is the sum of ‘Current Max’ of both ‘Query Memory Objects (default)’ and ‘Small Query Memory Objects (default)’, note ‘Current Max’ is in pages (* 8 = KB).

 

Here is a query to find the ‘Maximum Workspace Memory (KB)’ using sys.dm_os_performance_counters, ‘Maximum Workspace Memory (KB)’ can be up to 75% of ‘Target Server Memory (KB)’.

 

The best way to learn is to practice. To create the below tables and reproduce the behavior, join the mailing list by using this link: www.sqlworkshops.com/ml and I will send you the table creation script.

 

--Example provided by www.sqlworkshops.com
select (select cntr_value
        from sys.dm_os_performance_counters
        where object_name like '%Memory Manager%' and counter_name like 'Maximum Workspace Memory (KB)%') as 'Maximum Workspace Memory (KB)',
    (select cntr_value
        from sys.dm_os_performance_counters
        where object_name like '%Memory Manager%' and counter_name like 'Target Server Memory (KB)%') as 'Target Server Memory (KB)',
   (select cntr_value
        from sys.dm_os_performance_counters
        where object_name like '%Memory Manager%' and counter_name like 'Maximum Workspace Memory (KB)%') * 100.0 /
    (select cntr_value
        from sys.dm_os_performance_counters
        where object_name like '%Memory Manager%' and counter_name like 'Target Server Memory (KB)%')  as Ratio
go

 

In my server, when Target Server Memory is 4096MB, Maximum Workspace Memory is 3077MB, which is about 75% of Target Server Memory.

 

By default a query will not request more than 25% of this Maximum Workspace Memory in SQL Server 2008 and above and this Memory Grant 25% can be changed using Resource Governor Workload Group settings. With SQL Server 2005 and below, this is 20% and cannot be changed without the support of Microsoft (with a combination of undocumented trace flag and changes to the configuration parameters).

 

Why should you care about 25%? You might have a customer executing a heavy reporting query in the night and they want to use more than 25% of Maximum Workspace Memory for that single query. Let’s assume the customer configured 4096MB for their SQL Server instance (the Target Server Memory might be less than the configured memory if there is memory pressure on the server), let's say their workspace memory is 3077MB (‘Query Memory Objects (default)’->‘Current Max’ = 381175 pages = 3049400KB + ‘Small Query Memory Objects (default)’ ->‘Current Max’ = 12800 pages = 102400KB; 3049400KB + 102400KB = 3077MB. And this customer’s query executing the report is limited to a maximum of 3049400KB / 4 = 744MB.

 

Let’s say the customer’s query executing the report needs 1,470MB of query memory (ideal memory) in order not to spill the sort to tempdb, but the query will request only 744MB. In case the customer can set the ‘request_max_memory_grant_percent’ (Memory Grant %) of the Workload Group setting to 50%, then the query can request up to 1,488MB and in this case the query will request the ideal memory it needs which is 1,470MB.

 

SQL Server has memory grant queues based on cost, if queries need more memory and there is not enough memory available in the queue, then the query will wait, you can get additional details using the command dbcc memorystatus. If a query requests huge amount of memory, but doesn’t utilize it (due to over estimation), this memory will be reserved and cannot be used by other queries, in some cases this will lead to unnecessary memory grant waits. So one has to be very careful not to overestimate (also described in the webcast www.sqlworkshops.com/webcasts) too much memory as it will affect concurrency. When the query waits for memory, the wait type will be ‘RESOURCE_SEMAPHORE’.

 

Let's set 'max server memory (MB)' to 4GB.

 

exec sp_configure 'max server memory (MB)', 4096
go
reconfigure
go

 

The best way to learn is to practice. To create the below tables and reproduce the behavior, join the mailing list by using this link: www.sqlworkshops.com/ml and I will send you the table creation script.

 

Scenario 1:
This query will underestimate memory due to optimizer issues and will also request less memory (744MB) due to the 25% Resource Governor Workload Group Memory Grant setting. The query will be slow spilling the sort to tempdb.

 

When the sort spills to tempdb the wait type will be ‘IO_COMPLETION’ but when the Hash Match operation spills to tempdb the wait type will be ‘SLEEP_TASK’.

 

We are using option (maxdop 1) to disable parallelism, to learn more about monitoring and tuning parallel query execution, refer to the webcast www.sqlworkshops.com/webcasts and article Parallel Sort and Merge Join – Watch out for unpredictability in performance.

 

set statistics time on
go
--Example provided by www.sqlworkshops.com
declare @c1 int, @c2 int, @c3 char(2000)
declare @i int
set @i = 500000
select @c1 = c1, @c2 = c2, @c3 = c3
    from tab7
    where c1 < @i
    order by c2
    option (maxdop 1)
go

 

Scenario 2:
This query will have better memory estimation due to the option clause with optimize for hint, but will still request less memory (744MB) due to the 25% Resource Governor Workload Group Memory Grant setting. The query will also be slow spilling the sort to tempdb.

 

set statistics time on
go
--Example provided by www.sqlworkshops.com
declare @c1 int, @c2 int, @c3 char(2000)
declare @i int
set @i = 500000
select @c1 = c1, @c2 = c2, @c3 = c3
    from tab7
    where c1 < @i
    order by c2
    option (optimize for (@i = 600000), maxdop 1)
-- Option optimize for is used to inflate memory request
-- without this the query will spill the sort to tempdb due
-- to query optimizer under estimation of memory.
-- One possibility is to inflate the number of rows,
-- the other possibility is to inflate the row size
-- For more information refer to webcasts 1 & 2
-- at www.sqlworkshops.com/webcasts.

go

 

Scenario 3:
This query will have better memory estimation due to the option clause with optimize for hint and will also request enough memory (1,470MB) due to the 50% Resource Governor Workload Group Memory Grant setting. The query will be fast with no spilling of sort to tempdb.

 

Please read the entire article and answer the challenge posted at the end of this article before changing the Resource Governor Workload Group Memory Grant setting in your production server as there are major disadvantages of changing the Resource Governor Workload Group Memory Grant setting, don't do it without understanding the full picture.

 

alter workload group [default] with(request_max_memory_grant_percent=50)
go
alter
resource governor reconfigure
goset statistics time on
go
--Example provided by www.sqlworkshops.com
declare @c1 int, @c2 int, @c3 char(2000)
declare @i int
set @i = 500000
select @c1 = c1, @c2 = c2, @c3 = c3
    from tab7
    where c1 < @i
    order by c2
    option (optimize for (@i = 600000), maxdop 1)
-- Option optimize for is used to inflate memory request
-- without this the query will spill the sort to tempdb due
-- to query optimizer under estimation of memory.
-- One possibility is to inflate the number of rows,
-- the other possibility is to inflate the row size
-- For more information refer to webcasts 1 & 2
-- at www.sqlworkshops.com/webcasts.

go

 

Scenario 4:
Let's execute 2 of this query concurrently in 2 sessions, with 25% Resource Governor Workload Group Memory Grant setting.

 

alter workload group [default] with(request_max_memory_grant_percent=25)
go
alter
resource governor reconfigure
go

 

Session 1:
Let's execute this query in a loop and measure the performance of the query executed in session 2. This query will have better memory estimation due to the option clause with optimize for hint, but will still request less memory (744MB) due to the 25% Resource Governor Workload Group Memory Grant setting. The query will be spilling the sort to tempdb.

 

while 1=1
begin
--Example provided by www.sqlworkshops.com
declare @c1 int, @c2 int, @c3 char(2000)
declare @i int
set @i = 500000
select @c1 = c1, @c2 = c2, @c3 = c3
    from tab7
    where c1 < @i
    order by c2
    option (optimize for (@i = 600000), maxdop 1)
-- Option optimize for is used to inflate memory request
-- without this the query will spill the sort to tempdb due
-- to query optimizer under estimation of memory.
-- One possibility is to inflate the number of rows,
-- the other possibility is to inflate the row size
-- For more information refer to webcasts 1 & 2
-- at www.sqlworkshops.com/webcasts.

end
go

 

Session 2:
Let's execute this query few times. This query will have better memory estimation due to the option clause with optimize for hint, but will still request less memory (744MB) due to the 25% Resource Governor Workload Group Memory Grant setting. The query will be slow spilling the sort to tempdb. This query might take up to twice the amount of time to complete compared to Scenario 2, because 2 queries from 2 sessions are concurrently spilling to tempdb. There is no wait for memory grants (no RESOURCE_SEMAPHORE waits).

 

set statistics time on
go
--Example provided by www.sqlworkshops.com
declare @c1 int, @c2 int, @c3 char(2000)
declare @i int
set @i = 500000
select @c1 = c1, @c2 = c2, @c3 = c3
    from tab7
    where c1 < @i
    order by c2
    option (optimize for (@i = 600000), maxdop 1)
-- Option optimize for is used to inflate memory request
-- without this the query will spill the sort to tempdb due
-- to query optimizer under estimation of memory.
-- One possibility is to inflate the number of rows,
-- the other possibility is to inflate the row size
-- For more information refer to webcasts 1 & 2
-- at www.sqlworkshops.com/webcasts.

go

 

Scenario 5:
Let's execute 2 of this query concurrently in 2 sessions, with 50% Resource Governor Workload Group Memory Grant setting.

 

alter workload group [default] with(request_max_memory_grant_percent=50)
go
alter
resource governor reconfigure
go

 

Session 1:
Let's execute this query in a loop and measure the performance of the query executed in session 2. This query will have better memory estimation due to the option clause with optimize for hint and will also request enough memory (1,470MB) due to the 50% Resource Governor Workload Group Memory Grant setting. The query will have no spilling of sort to tempdb.

 

while 1=1
begin
--Example provided by www.sqlworkshops.com
declare @c1 int, @c2 int, @c3 char(2000)
declare @i int
set @i = 500000
select @c1 = c1, @c2 = c2, @c3 = c3
    from tab7
    where c1 < @i
    order by c2
    option (optimize for (@i = 600000), maxdop 1)
-- Option optimize for is used to inflate memory request
-- without this the query will spill the sort to tempdb due
-- to query optimizer under estimation of memory.
-- One possibility is to inflate the number of rows,
-- the other possibility is to inflate the row size
-- For more information refer to webcasts 1 & 2
-- at www.sqlworkshops.com/webcasts.

end
go

 

Session 2:
Let's execute this query few times. This query will have better memory estimation due to the option clause with optimize for hint and will also request enough memory (1,470MB) due to the 50% Resource Governor Workload Group Memory Grant setting. The query will be fast with no spilling of sort to tempdb. But this query might take up to twice the amount of time to complete compared to Scenario 3, because 2 queries from 2 sessions are concurrently requesting nearly 50% of workspace memory. There is wait for memory grants (RESOURCE_SEMAPHORE waits), SQL Server grants memory to one query at a time as the memory in the grant queue is not enough to grant simultaneously the requested memory to both queries.

 

set statistics time on
go
--Example provided by www.sqlworkshops.com
declare @c1 int, @c2 int, @c3 char(2000)
declare @i int
set @i = 500000
select @c1 = c1, @c2 = c2, @c3 = c3
    from tab7
    where c1 < @i
    order by c2
    option (optimize for (@i = 600000), maxdop 1)
-- Option optimize for is used to inflate memory request
-- without this the query will spill the sort to tempdb due
-- to query optimizer under estimation of memory.
-- One possibility is to inflate the number of rows,
-- the other possibility is to inflate the row size
-- For more information refer to webcasts 1 & 2
-- at www.sqlworkshops.com/webcasts.

go

 

Overall performance of scenario 5 is better than scenario 4 even though there was memory grant waits. In this case waiting for memory is far worse than spilling the sort to tempdb.

 

Scenario 6:
Let's execute two queries with less memory requirement concurrently in two sessions (so memory can be granted for both queries simultaneously from the same memory grant queue), with 50% Resource Governor Workload Group Memory Grant setting.

 

Session 1:
Let's execute this query in a loop and measure the performance of the query executed in session 2. This query will have better memory estimation due to the option clause with optimize for hint and will also request enough memory (1,102MB) due to the 50% Resource Governor Workload Group Memory Grant setting. The query will have no spilling of sort to tempdb.

while 1=1


begin
--Example provided by www.sqlworkshops.com
declare @c1 int, @c2 int, @c3 char(2000)
declare @i int
set @i = 400000
select @c1 = c1, @c2 = c2, @c3 = c3
    from tab7
    where c1 < @i
    order by c2
    option (optimize for (@i = 450000), maxdop 1)
-- Option optimize for is used to inflate memory request
-- without this the query will spill the sort to tempdb due
-- to query optimizer under estimation of memory.
-- One possibility is to inflate the number of rows,
-- the other possibility is to inflate the row size
-- For more information refer to webcasts 1 & 2
-- at www.sqlworkshops.com/webcasts.

end
go

 

Session 2:
Let's execute this query few times. This query will have better memory estimation due to the option clause with optimize for hint and will also request enough memory (1,102MB) due to the 50% Resource Governor Workload Group Memory Grant setting. The query will be fast with no spilling of sort to tempdb. There will be no wait for memory grants (RESOURCE_SEMAPHORE waits), SQL Server grants memory to both queries simultaneously as the memory in the grant queue is enough to grant the requested memory to both queries.

 

set statistics time on
go
--Example provided by www.sqlworkshops.com
declare @c1 int, @c2 int, @c3 char(2000)
declare @i int
set @i = 400000
select @c1 = c1, @c2 = c2, @c3 = c3
    from tab7
    where c1 < @i
    order by c2
    option (optimize for (@i = 450000), maxdop 1)
-- Option optimize for is used to inflate memory request
-- without this the query will spill the sort to tempdb due
-- to query optimizer under estimation of memory.
-- One possibility is to inflate the number of rows,
-- the other possibility is to inflate the row size
-- For more information refer to webcasts 1 & 2
-- at www.sqlworkshops.com/webcasts.

go

 

Challenge:
There are major disadvantages of changing the Resource Governor Workload Group Memory Grant setting, so don't make any changes without understanding the full picture. I will cover this is the next article, meanwhile if you can guess some reasons, write to me: Contacts.

 

I explain some of these concepts with detailed examples in my webcasts (www.sqlworkshops.com/webcasts), I recommend you to watch them. The best way to learn is to practice. To create the above tables and reproduce the behavior, join the mailing list at www.sqlworkshops.com/ml and I will send you the relevant SQL Scripts.

 

Register for the upcoming 3 Day Level 400 Microsoft SQL Server 2008 Performance Monitoring & Tuning Hands-on Workshop in Vienna, Austria during November 15-17, 2011, click here to register / Microsoft Austria TechNet or in Oslo, Norway during January 17-19, 2012, click here to register. These are hands-on workshops with a maximum of 12 participants and not lectures. For consulting engagements click here.

 

Disclaimer and copyright information:
This article refers to organizations and products that may be the trademarks or registered trademarks of their various owners.

Copyright of this article belongs to R Meyyappan / www.sqlworkshops.com. You may freely use the ideas and concepts discussed in this article with acknowledgement (www.sqlworkshops.com), but you may not claim any of it as your own work.

This article is for informational purposes only; you use any of the suggestions given here entirely at your own risk.

 

R Meyyappan rmeyyappan@sqlworkshops.com

LinkedIn: http://at.linkedin.com/in/rmeyyappan

 

Plan Caching and Query Memory Part II (Hash Match) – When not to use stored procedure - Most common performance mistake SQL Server developers make.

SQL Server estimates Memory requirement at compile time, when stored procedure or other plan caching mechanisms like sp_executesql or prepared statement are used, the memory requirement is estimated based on first set of execution parameters. This is a common reason for spill over tempdb and hence poor performance. Common memory allocating queries are that perform Sort and do Hash Match operations like Hash Join or Hash Aggregation or Hash Union. This article covers Hash Match operations with examples. It is recommended to read Plan Caching and Query Memory Part I before this article which covers an introduction and Query memory for Sort. In most cases it is cheaper to pay for the compilation cost of dynamic queries than huge cost for spill over tempdb, unless memory requirement for a query does not change significantly based on predicates.

 

This article covers underestimation / overestimation of memory for Hash Match operation. Plan Caching and Query Memory Part I covers underestimation / overestimation for Sort. It is important to note that underestimation of memory for Sort and Hash Match operations lead to spill over tempdb and hence negatively impact performance. Overestimation of memory affects the memory needs of other concurrently executing queries. In addition, it is important to note, with Hash Match operations, overestimation of memory can actually lead to poor performance.

 

To read additional articles I wrote click here.

 

The best way to learn is to practice. To create the below tables and reproduce the behavior, join the mailing list by using this link: www.sqlworkshops.com/ml and I will send you the table creation script. Most of these concepts are also covered in our webcasts: www.sqlworkshops.com/webcasts

 

Let’s create a Customer’s State table that has 99% of customers in NY and the rest 1% in WA.Customers table used in Part I of this article is also used here.To observe Hash Warning, enable 'Hash Warning' in SQL Profiler under Events 'Errors and Warnings'.

--Example provided by www.sqlworkshops.com

drop table CustomersState

go

create table CustomersState (CustomerID int primary key, Address char(200), State char(2))

go

insert into CustomersState (CustomerID, Address) select CustomerID, 'Address' from Customers

update CustomersState set State = 'NY' where CustomerID % 100 != 1

update CustomersState set State = 'WA' where CustomerID % 100 = 1

go

update statistics CustomersState with fullscan

go

 

 Let’s create a stored procedure that joins customers with CustomersState table with a predicate on State.

--Example provided by www.sqlworkshops.com

create proc CustomersByState @State char(2) as

begin

declare @CustomerID int

select @CustomerID = e.CustomerID from Customers e

inner join CustomersState es on (e.CustomerID = es.CustomerID)

where es.State = @State

option (maxdop 1)

end

go

 

Let’s execute the stored procedure first with parameter value ‘WA’ – which will select 1% of data.

set statistics time on

go

--Example provided by www.sqlworkshops.com

exec CustomersByState 'WA'

go

The stored procedure took 294 ms to complete.

 

The stored procedure was granted 6704 KB based on 8000 rows being estimated.

 

The estimated number of rows, 8000 is similar to actual number of rows 8000 and hence the memory estimation should be ok.

 

There was no Hash Warning in SQL Profiler. To observe Hash Warning, enable 'Hash Warning' in SQL Profiler under Events 'Errors and Warnings'.

 

Now let’s execute the stored procedure with parameter value ‘NY’ – which will select 99% of data.

-Example provided by www.sqlworkshops.com

exec CustomersByState 'NY'

go

 

The stored procedure took 2922 ms to complete.

 

The stored procedure was granted 6704 KB based on 8000 rows being estimated.

 

 

The estimated number of rows, 8000 is way different from the actual number of rows 792000 because the estimation is based on the first set of parameter value supplied to the stored procedure which is ‘WA’ in our case. This underestimation will lead to spill over tempdb, resulting in poor performance.

 

There was Hash Warning (Recursion) in SQL Profiler. To observe Hash Warning, enable 'Hash Warning' in SQL Profiler under Events 'Errors and Warnings'.

 

Let’s recompile the stored procedure and then let’s first execute the stored procedure with parameter value ‘NY’.

 

In a production instance it is not advisable to use sp_recompile instead one should use DBCC FREEPROCCACHE (plan_handle). This is due to locking issues involved with sp_recompile, refer to our webcasts, www.sqlworkshops.com/webcasts for further details.

 

exec sp_recompile CustomersByState

go

--Example provided by www.sqlworkshops.com

exec CustomersByState 'NY'

go

 

Now the stored procedure took only 1046 ms instead of 2922 ms.

 

The stored procedure was granted 146752 KB of memory.

The estimated number of rows, 792000 is similar to actual number of rows of 792000. Better performance of this stored procedure execution is due to better estimation of memory and avoiding spill over tempdb.

 

There was no Hash Warning in SQL Profiler.

 

Now let’s execute the stored procedure with parameter value ‘WA’.

--Example provided by www.sqlworkshops.com

exec CustomersByState 'WA'

go

 

The stored procedure took 351 ms to complete, higher than the previous execution time of 294 ms.

 

 

This stored procedure was granted more memory (146752 KB) than necessary (6704 KB) based on parameter value ‘NY’ for estimation (792000 rows) instead of parameter value ‘WA’ for estimation (8000 rows). This is because the estimation is based on the first set of parameter value supplied to the stored procedure which is ‘NY’ in this case. This overestimation leads to poor performance of this Hash Match operation, it might also affect the performance of other concurrently executing queries requiring memory and hence overestimation is not recommended.

 

 

The estimated number of rows, 792000 is much more than the actual number of rows of 8000.

 

Intermediate Summary: This issue can be avoided by not caching the plan for memory allocating queries. Other possibility is to use recompile hint or optimize for hint to allocate memory for predefined data range.Let’s recreate the stored procedure with recompile hint.

--Example provided by www.sqlworkshops.com

drop proc CustomersByState

go

create proc CustomersByState @State char(2) as

begin

declare @CustomerID int

select @CustomerID = e.CustomerID from Customers e

inner join CustomersState es on (e.CustomerID = es.CustomerID)

where es.State = @State

option (maxdop 1, recompile)

end

go

 

Let’s execute the stored procedure initially with parameter value ‘WA’ and then with parameter value ‘NY’.

--Example provided by www.sqlworkshops.com

exec CustomersByState 'WA'

go

exec CustomersByState 'NY'

go

 

The stored procedure took 297 ms and 1102 ms in line with previous optimal execution times.

 

The stored procedure with parameter value ‘WA’ has good estimation like before.

 

Estimated number of rows of 8000 is similar to actual number of rows of 8000.

 

The stored procedure with parameter value ‘NY’ also has good estimation and memory grant like before because the stored procedure was recompiled with current set of parameter values.

 

Estimated number of rows of 792000 is similar to actual number of rows of 792000.

 

 

The compilation time and compilation CPU of 1 ms is not expensive in this case compared to the performance benefit.

 

There was no Hash Warning in SQL Profiler.

 

Let’s recreate the stored procedure with optimize for hint of ‘NY’.

--Example provided by www.sqlworkshops.com

drop proc CustomersByState

go

create proc CustomersByState @State char(2) as

begin

declare @CustomerID int

select @CustomerID = e.CustomerID from Customers e

inner join CustomersState es on (e.CustomerID = es.CustomerID)

where es.State = @State

option (maxdop 1, optimize for (@State = 'NY'))

end

go

 

Let’s execute the stored procedure initially with parameter value ‘WA’ and then with parameter value ‘NY’.

--Example provided by www.sqlworkshops.com

exec CustomersByState 'WA'

go

exec CustomersByState 'NY'

go

 

The stored procedure took 353 ms with parameter value ‘WA’, this is much slower than the optimal execution time of 294 ms we observed previously. This is because of overestimation of memory. The stored procedure with parameter value ‘NY’ has optimal execution time like before.

 

The stored procedure with parameter value ‘WA’ has overestimation of rows because of optimize for hint value of ‘NY’.

 

Unlike before, more memory was estimated to this stored procedure based on optimize for hint value ‘NY’.

 

 

The stored procedure with parameter value ‘NY’ has good estimation because of optimize for hint value of ‘NY’. Estimated number of rows of 792000 is similar to actual number of rows of 792000.

 

Optimal amount memory was estimated to this stored procedure based on optimize for hint value ‘NY’.

 

There was no Hash Warning in SQL Profiler.

 

This article covers underestimation / overestimation of memory for Hash Match operation. Plan Caching and Query Memory Part I covers underestimation / overestimation for Sort. It is important to note that underestimation of memory for Sort and Hash Match operations lead to spill over tempdb and hence negatively impact performance. Overestimation of memory affects the memory needs of other concurrently executing queries. In addition, it is important to note, with Hash Match operations, overestimation of memory can actually lead to poor performance.

 

Summary: Cached plan might lead to underestimation or overestimation of memory because the memory is estimated based on first set of execution parameters. It is recommended not to cache the plan if the amount of memory required to execute the stored procedure has a wide range of possibilities. One can mitigate this by using recompile hint, but that will lead to compilation overhead. However, in most cases it might be ok to pay for compilation rather than spilling sort over tempdb which could be very expensive compared to compilation cost. The other possibility is to use optimize for hint, but in case one sorts more data than hinted by optimize for hint, this will still lead to spill. On the other side there is also the possibility of overestimation leading to unnecessary memory issues for other concurrently executing queries. In case of Hash Match operations, this overestimation of memory might lead to poor performance. When the values used in optimize for hint are archived from the database, the estimation will be wrong leading to worst performance, so one has to exercise caution before using optimize for hint, recompile hint is better in this case.

 

I explain these concepts with detailed examples in my webcasts (www.sqlworkshops.com/webcasts), I recommend you to watch them. The best way to learn is to practice. To create the above tables and reproduce the behavior, join the mailing list at www.sqlworkshops.com/ml and I will send you the relevant SQL Scripts.

 

Register for the upcoming 3 Day Level 400 Microsoft SQL Server 2008 and SQL Server 2005 Performance Monitoring & Tuning Hands-on Workshop in London, United Kingdom during March 15-17, 2011, click here to register / Microsoft UK TechNet.These are hands-on workshops with a maximum of 12 participants and not lectures. For consulting engagements click here.

 

Disclaimer and copyright information:
This article refers to organizations and products that may be the trademarks or registered trademarks of their various owners.

Copyright of this article belongs to R Meyyappan / www.sqlworkshops.com. You may freely use the ideas and concepts discussed in this article with acknowledgement (www.sqlworkshops.com), but you may not claim any of it as your own work.

This article is for informational purposes only; you use any of the suggestions given here entirely at your own risk.

 

R Meyyappan rmeyyappan@sqlworkshops.com

LinkedIn: http://at.linkedin.com/in/rmeyyappan

Plan Caching and Query Memory Part I – When not to use stored procedure or other plan caching mechanisms like sp_executesql or prepared statement

 

The most common performance mistake SQL Server developers make:

SQL Server estimates memory requirement for queries at compilation time. This mechanism is fine for dynamic queries that need memory, but not for queries that cache the plan. With dynamic queries the plan is not reused for different set of parameters values / predicates and hence different amount of memory can be estimated based on different set of parameter values / predicates. Common memory allocating queries are that perform Sort and do Hash Match operations like Hash Join or Hash Aggregation or Hash Union. This article covers Sort with examples. It is recommended to read Plan Caching and Query Memory Part II after this article which covers Hash Match operations.

 

When the plan is cached by using stored procedure or other plan caching mechanisms like sp_executesql or prepared statement, SQL Server estimates memory requirement based on first set of execution parameters. Later when the same stored procedure is called with different set of parameter values, the same amount of memory is used to execute the stored procedure. This might lead to underestimation / overestimation of memory on plan reuse, overestimation of memory might not be a noticeable issue for Sort operations, but underestimation of memory will lead to spill over tempdb resulting in poor performance.

 

This article covers underestimation / overestimation of memory for Sort. Plan Caching and Query Memory Part II covers underestimation / overestimation for Hash Match operation. It is important to note that underestimation of memory for Sort and Hash Match operations lead to spill over tempdb and hence negatively impact performance. Overestimation of memory affects the memory needs of other concurrently executing queries. In addition, it is important to note, with Hash Match operations, overestimation of memory can actually lead to poor performance.

 

To read additional articles I wrote click here.

 

In most cases it is cheaper to pay for the compilation cost of dynamic queries than huge cost for spill over tempdb, unless memory requirement for a stored procedure does not change significantly based on predicates.

 

The best way to learn is to practice. To create the below tables and reproduce the behavior, join the mailing list by using this link: www.sqlworkshops.com/ml and I will send you the table creation script. Most of these concepts are also covered in our webcasts: www.sqlworkshops.com/webcasts

 

Enough theory, let’s see an example where we sort initially 1 month of data and then use the stored procedure to sort 6 months of data.

 

Let’s create a stored procedure that sorts customers by name within certain date range.

 

--Example provided by www.sqlworkshops.com

create proc CustomersByCreationDate @CreationDateFrom datetime, @CreationDateTo datetime as

begin

      declare @CustomerID int, @CustomerName varchar(48), @CreationDate datetime

      select @CustomerName = c.CustomerName, @CreationDate = c.CreationDate from Customers c

            where c.CreationDate between @CreationDateFrom and @CreationDateTo

            order by c.CustomerName

      option (maxdop 1)

      end

go

Let’s execute the stored procedure initially with 1 month date range.

 

set statistics time on

go

--Example provided by www.sqlworkshops.com

exec CustomersByCreationDate '2001-01-01', '2001-01-31'

go

The stored procedure took 48 ms to complete.

 

  

The stored procedure was granted 6656 KB based on 43199.9 rows being estimated.

 

  

 

The estimated number of rows, 43199.9 is similar to actual number of rows 43200 and hence the memory estimation should be ok.

 

 

 

There was no Sort Warnings in SQL Profiler.

 

 

 

Now let’s execute the stored procedure with 6 month date range.

--Example provided by www.sqlworkshops.com

exec CustomersByCreationDate '2001-01-01', '2001-06-30'

go

The stored procedure took 679 ms to complete.

 

 

 

The stored procedure was granted 6656 KB based on 43199.9 rows being estimated.

 

 

 

The estimated number of rows, 43199.9 is way different from the actual number of rows 259200 because the estimation is based on the first set of parameter value supplied to the stored procedure which is 1 month in our case. This underestimation will lead to sort spill over tempdb, resulting in poor performance.

 

 

 

There was Sort Warnings in SQL Profiler.

 

 

To monitor the amount of data written and read from tempdb, one can execute select num_of_bytes_written, num_of_bytes_read from sys.dm_io_virtual_file_stats(2, NULL) before and after the stored procedure execution, for additional information refer to the webcast: www.sqlworkshops.com/webcasts.

 

 

Let’s recompile the stored procedure and then let’s first execute the stored procedure with 6 month date range.

 

In a production instance it is not advisable to use sp_recompile instead one should use DBCC FREEPROCCACHE (plan_handle). This is due to locking issues involved with sp_recompile, refer to our webcasts for further details.

 

exec sp_recompile CustomersByCreationDate

go

--Example provided by www.sqlworkshops.com

exec CustomersByCreationDate '2001-01-01', '2001-06-30'

go

Now the stored procedure took only 294 ms instead of 679 ms.

 

 

The stored procedure was granted 26832 KB of memory.

 

 

 

The estimated number of rows, 259200 is similar to actual number of rows of 259200. Better performance of this stored procedure is due to better estimation of memory and avoiding sort spill over tempdb.

 

 

 

There was no Sort Warnings in SQL Profiler.

 

  

 

Now let’s execute the stored procedure with 1 month date range.

 

--Example provided by www.sqlworkshops.com

exec CustomersByCreationDate '2001-01-01', '2001-01-31'

go

The stored procedure took 49 ms to complete, similar to our very first stored procedure execution.

 

 

 

This stored procedure was granted more memory (26832 KB) than necessary memory (6656 KB) based on 6 months of data estimation (259200 rows) instead of 1 month of data estimation (43199.9 rows). This is because the estimation is based on the first set of parameter value supplied to the stored procedure which is 6 months in this case. This overestimation did not affect performance, but it might affect performance of other concurrent queries requiring memory and hence overestimation is not recommended. This overestimation might affect performance Hash Match operations, refer to article Plan Caching and Query Memory Part II for further details. 

 

Let’s recompile the stored procedure and then let’s first execute the stored procedure with 2 day date range.

exec sp_recompile CustomersByCreationDate

go

--Example provided by www.sqlworkshops.com

exec CustomersByCreationDate '2001-01-01', '2001-01-02'

go

The stored procedure took 1 ms.

 

 

 

The stored procedure was granted 1024 KB based on 1440 rows being estimated.

 

 

 

There was no Sort Warnings in SQL Profiler.

 

 

 

Now let’s execute the stored procedure with 6 month date range.

--Example provided by www.sqlworkshops.com

exec CustomersByCreationDate '2001-01-01', '2001-06-30'

go

 

The stored procedure took 955 ms to complete, way higher than 679 ms or 294ms we noticed before.

 

 

 

The stored procedure was granted 1024 KB based on 1440 rows being estimated. But we noticed in the past this stored procedure with 6 month date range needed 26832 KB of memory to execute optimally without spill over tempdb. This is clear underestimation of memory and the reason for the very poor performance.

 

 

 

There was Sort Warnings in SQL Profiler. Unlike before this was a Multiple pass sort instead of Single pass sort. This occurs when granted memory is too low.

 

 

 

Intermediate Summary: This issue can be avoided by not caching the plan for memory allocating queries. Other possibility is to use recompile hint or optimize for hint to allocate memory for predefined date range.

 

Let’s recreate the stored procedure with recompile hint.

--Example provided by www.sqlworkshops.com

drop proc CustomersByCreationDate

go

create proc CustomersByCreationDate @CreationDateFrom datetime, @CreationDateTo datetime as

begin

      declare @CustomerID int, @CustomerName varchar(48), @CreationDate datetime

      select @CustomerName = c.CustomerName, @CreationDate = c.CreationDate from Customers c

            where c.CreationDate between @CreationDateFrom and @CreationDateTo

            order by c.CustomerName

      option (maxdop 1, recompile)

      end

go

Let’s execute the stored procedure initially with 1 month date range and then with 6 month date range.

--Example provided by www.sqlworkshops.com

exec CustomersByCreationDate '2001-01-01', '2001-01-30'

exec CustomersByCreationDate '2001-01-01', '2001-06-30'

go

The stored procedure took 48ms and 291 ms in line with previous optimal execution times.

 

 

 

The stored procedure with 1 month date range has good estimation like before.

 

 

 

The stored procedure with 6 month date range also has good estimation and memory grant like before because the query was recompiled with current set of parameter values.

 

 

 

The compilation time and compilation CPU of 1 ms is not expensive in this case compared to the performance benefit.

 

 

Let’s recreate the stored procedure with optimize for hint of 6 month date range.

 

--Example provided by www.sqlworkshops.com

drop proc CustomersByCreationDate

go

create proc CustomersByCreationDate @CreationDateFrom datetime, @CreationDateTo datetime as

begin

      declare @CustomerID int, @CustomerName varchar(48), @CreationDate datetime

      select @CustomerName = c.CustomerName, @CreationDate = c.CreationDate from Customers c

            where c.CreationDate between @CreationDateFrom and @CreationDateTo

            order by c.CustomerName

      option (maxdop 1, optimize for (@CreationDateFrom = '2001-01-01', @CreationDateTo ='2001-06-30'))

      end

go

Let’s execute the stored procedure initially with 1 month date range and then with 6 month date range.

 

--Example provided by www.sqlworkshops.com

exec CustomersByCreationDate '2001-01-01', '2001-01-30'

exec CustomersByCreationDate '2001-01-01', '2001-06-30'

go

The stored procedure took 48ms and 291 ms in line with previous optimal execution times.

 

 

The stored procedure with 1 month date range has overestimation of rows and memory. This is because we provided hint to optimize for 6 months of data.

 

 

 

The stored procedure with 6 month date range has good estimation and memory grant because we provided hint to optimize for 6 months of data.

 

  

 

Let’s execute the stored procedure with 12 month date range using the currently cashed plan for 6 month date range.

--Example provided by www.sqlworkshops.com

exec CustomersByCreationDate '2001-01-01', '2001-12-31'

go

The stored procedure took 1138 ms to complete.

 

 

 

2592000 rows were estimated based on optimize for hint value for 6 month date range. Actual number of rows is 524160 due to 12 month date range.

 

 

 

The stored procedure was granted enough memory to sort 6 month date range and not 12 month date range, so there will be spill over tempdb.

 

 

 

 There was Sort Warnings in SQL Profiler.

 

 

 

As we see above, optimize for hint cannot guarantee enough memory and optimal performance compared to recompile hint.

 

This article covers underestimation / overestimation of memory for Sort. Plan Caching and Query Memory Part II covers underestimation / overestimation for Hash Match operation. It is important to note that underestimation of memory for Sort and Hash Match operations lead to spill over tempdb and hence negatively impact performance. Overestimation of memory affects the memory needs of other concurrently executing queries. In addition, it is important to note, with Hash Match operations, overestimation of memory can actually lead to poor performance.

 

Summary: Cached plan might lead to underestimation or overestimation of memory because the memory is estimated based on first set of execution parameters. It is recommended not to cache the plan if the amount of memory required to execute the stored procedure has a wide range of possibilities. One can mitigate this by using recompile hint, but that will lead to compilation overhead. However, in most cases it might be ok to pay for compilation rather than spilling sort over tempdb which could be very expensive compared to compilation cost. The other possibility is to use optimize for hint, but in case one sorts more data than hinted by optimize for hint, this will still lead to spill. On the other side there is also the possibility of overestimation leading to unnecessary memory issues for other concurrently executing queries. In case of Hash Match operations, this overestimation of memory might lead to poor performance. When the values used in optimize for hint are archived from the database, the estimation will be wrong leading to worst performance, so one has to exercise caution before using optimize for hint, recompile hint is better in this case.

I explain these concepts with detailed examples in my webcasts (www.sqlworkshops.com/webcasts), I recommend you to watch them. The best way to learn is to practice. To create the above tables and reproduce the behavior, join the mailing list at www.sqlworkshops.com/ml and I will send you the relevant SQL Scripts.

 

 

Register for the upcoming 3 Day Level 400 Microsoft SQL Server 2008 and SQL Server 2005 Performance Monitoring & Tuning Hands-on Workshop in London, United Kingdom during March 15-17, 2011, click here to register / Microsoft UK TechNet.These are hands-on workshops with a maximum of 12 participants and not lectures. For consulting engagements click here.

 

 

Disclaimer and copyright information:
This article refers to organizations and products that may be the trademarks or registered trademarks of their various owners.

Copyright of this article belongs to R Meyyappan / www.sqlworkshops.com. You may freely use the ideas and concepts discussed in this article with acknowledgement (www.sqlworkshops.com), but you may not claim any of it as your own work.

This article is for informational purposes only; you use any of the suggestions given here entirely at your own risk.

 

R Meyyappan rmeyyappan@sqlworkshops.com

LinkedIn: http://at.linkedin.com/in/rmeyyappan

3 Day Level 400 SQL Tuning Workshop 15 March in London, early bird and referral offer

I want to inform you that we have organized the "3 Day Level 400 Microsoft SQL Server 2008 and SQL Server 2005 Performance Monitoring & Tuning Hands-on Workshop" in London, United Kingdom during March 15-17, 2011.

This is a truly level 400 hands-on workshop and you can find the Agenda, Prerequisite, Goal of the Workshop and Registration information at www.sqlworkshops.com/ruk
. Charges are GBP 1800 (VAT excl.). Early bird discount of GBP 125 until 18 February. We are also introducing a new referral plan. If you refer someone who participates in the workshop you will receive an Amazon gift voucher for GBP 125.

Feedback from one of the participants who attended our November London workshop:
Andrew, Senior SQL Server DBA from UBS, UK,
www.ubs.com, November 26, 2010:
Rating: In a scale of 1 to 5 please rate each item below (1=Poor & 5=Excellent)

Overall I was satisfied with the workshop

5

Instructor maintained the focus of the course

5

Mix of theory and practice was appropriate

5

Instructor answered the questions asked

5

The training facility met the requirement

5

How confident are you with SQL Server 2008 performance tuning

5

Additional comments from Andrew:

The course was expertly delivered and backed up with practical examples. At the end of the course I felt my knowledge of SQL Server had been greatly enhanced and was eager to share with my colleagues. I felt there was one prerequisite missing from the course description, an open mind since the course changed some of my core product beliefs.

For Additional workshop feedbacks refer to: www.sqlworkshops.com/feedbacks.

I will be delivering the Level 300-400 1 Day Microsoft SQL Server 2008 Performance Monitoring and Tuning Seminar at Istanbul and Ankara, Turkey during March. This event is organized by Microsoft Turkey, let me know if you are in Turkey and would like to attend.

During September 2010 I delivered this Level 300-400 1 Day Microsoft SQL Server 2008 Performance Monitoring and Tuning Seminar in Zurich, Switzerland organized by Microsoft Switzerland and the feedback was 4.85 out of 5, there were about 100 participants. During November 2010 when I delivered seminar in Lisbon, Portugal organized by Microsoft Portugal, the feedback was 8.30 out of 9, there were 130 participants.

Our Mission: Empower customers to fully realize the Performance potential of Microsoft SQL Server without increasing the total cost of ownership (TCO) and achieve high customer satisfaction in every consulting engagement and workshop delivery.

Our Business Plan: Provide useful content in webcasts, articles and seminars to get visibility for consulting engagements and workshop delivery opportunity. Help us by forwarding this email to your SQL Server friends and colleagues.

Looking forward

R Meyyappan & Team @
www.SQLWorkshops.com
LinkedIn:
http://at.linkedin.com/in/rmeyyappan

Posted by sqlworkshops | with no comments

More CPU cores may not always lead to better performance – MAXDOP and query memory distribution in spotlight

More hardware normally delivers better performance, but there are exceptions where it can hinder performance. Understanding these exceptions and working around it is a major part of SQL Server performance tuning.

 

When a memory allocating query executes in parallel, SQL Server distributes memory to each task that is executing part of the query in parallel. In our example the sort operator that executes in parallel divides the memory across all tasks assuming even distribution of rows. Common memory allocating queries are that perform Sort and do Hash Match operations like Hash Join or Hash Aggregation or Hash Union.

 

In reality, how often are column values evenly distributed, think about an example; are employees working for your company distributed evenly across all the Zip codes or mainly concentrated in the headquarters? What happens when you sort result set based on Zip codes? Do all products in the catalog sell equally or are few products hot selling items?

 

One of my customers tested the below example on a 24 core server with various MAXDOP settings and here are the results:
MAXDOP 1: CPU time = 1185 ms, elapsed time = 1188 ms
MAXDOP 4: CPU time = 1981 ms, elapsed time = 1568 ms
MAXDOP 8: CPU time = 1918 ms, elapsed time = 1619 ms
MAXDOP 12: CPU time = 2367 ms, elapsed time = 2258 ms
MAXDOP 16: CPU time = 2540 ms, elapsed time = 2579 ms
MAXDOP 20: CPU time = 2470 ms, elapsed time = 2534 ms
MAXDOP 0: CPU time = 2809 ms, elapsed time = 2721 ms - all 24 cores.
In the above test, when the data was evenly distributed, the elapsed time of parallel query was always lower than serial query.

 

Why does the query get slower and slower with more CPU cores / higher MAXDOP? Maybe you can answer this question after reading the article; let me know: rmeyyappan@sqlworkshops.com.

 

Well you get the point, let’s see an example.

 

The best way to learn is to practice. To create the below tables and reproduce the behavior, join the mailing list by using this link: www.sqlworkshops.com/ml and I will send you the table creation script.

 

Let’s update the Employees table with 49 out of 50 employees located in Zip code 2001.

update Employees set Zip = EmployeeID / 400 + 1 where EmployeeID % 50 = 1

update Employees set Zip = 2001 where EmployeeID % 50 != 1

go

update statistics Employees with fullscan

go

 

Let’s create the temporary table #FireDrill with all possible Zip codes.

drop table #FireDrill

go

create table #FireDrill (Zip int primary key)

insert into #FireDrill select distinct Zip from Employees

update statistics #FireDrill with fullscan

go

 

Let’s execute the query serially with MAXDOP 1.

--Example provided by www.sqlworkshops.com

--Execute query with uneven Zip code distribution

--First serially with MAXDOP 1

set statistics time on

go

declare @EmployeeID int, @EmployeeName varchar(48),@zip int

select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e

      inner join #FireDrill fd on (e.Zip = fd.Zip)

      order by e.Zip

option (maxdop 1)

go

The query took 1011 ms to complete.

 

The execution plan shows the 77816 KB of memory was granted while the estimated rows were 799624.

 

No Sort Warnings in SQL Server Profiler.

 

Now let’s execute the query in parallel with MAXDOP 0.

--Example provided by www.sqlworkshops.com

--Execute query with uneven Zip code distribution

--In parallel with MAXDOP 0

set statistics time on

go

declare @EmployeeID int, @EmployeeName varchar(48),@zip int

select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e

      inner join #FireDrill fd on (e.Zip = fd.Zip)

      order by e.Zip

option (maxdop 0)

go

The query took 1912 ms to complete.

 

The execution plan shows the 79360 KB of memory was granted while the estimated rows were 799624.

 

The estimated number of rows between serial and parallel plan are the same. The parallel plan has slightly more memory granted due to additional overhead.

Sort properties shows the rows are unevenly distributed over the 4 threads.

 

Sort Warnings in SQL Server Profiler.

 

 

Intermediate Summary: The reason for the higher duration with parallel plan was sort spill. This is due to uneven distribution of employees over Zip codes, especially concentration of 49 out of 50 employees in Zip code 2001.

Now let’s update the Employees table and distribute employees evenly across all Zip codes.

 

update Employees set Zip = EmployeeID / 400 + 1

go

update statistics Employees with fullscan

go

 

Let’s execute the query serially with MAXDOP 1.

--Example provided by www.sqlworkshops.com

--Execute query with uneven Zip code distribution

--Serially with MAXDOP 1

set statistics time on

go

declare @EmployeeID int, @EmployeeName varchar(48),@zip int

select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e

      inner join #FireDrill fd on (e.Zip = fd.Zip)

      order by e.Zip

option (maxdop 1)

go

 

The query took 751 ms to complete.

 

The execution plan shows the 77816 KB of memory was granted while the estimated rows were 784707.

 

No Sort Warnings in SQL Server Profiler.

  

Now let’s execute the query in parallel with MAXDOP 0.

--Example provided by www.sqlworkshops.com

--Execute query with uneven Zip code distribution

--In parallel with MAXDOP 0

set statistics time on

go

declare @EmployeeID int, @EmployeeName varchar(48),@zip int

select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e

      inner join #FireDrill fd on (e.Zip = fd.Zip)

      order by e.Zip

option (maxdop 0)

go

The query took 661 ms to complete.

 

The execution plan shows the 79360 KB of memory was granted while the estimated rows were 784707.

 

Sort properties shows the rows are evenly distributed over the 4 threads.

No Sort Warnings in SQL Server Profiler.

 

 

Intermediate Summary: When employees were distributed unevenly, concentrated on 1 Zip code, parallel sort spilled while serial sort performed well without spilling to tempdb. When the employees were distributed evenly across all Zip codes, parallel sort and serial sort did not spill to tempdb. This shows uneven data distribution may affect the performance of some parallel queries negatively. For detailed discussion of memory allocation, refer to webcasts available at www.sqlworkshops.com/webcasts.

 

 

Some of you might conclude from the above execution times that parallel query is not faster even when there is no spill. Below you can see when we are joining limited amount of Zip codes, parallel query will be fasted since it can use Bitmap Filtering.

 

Let’s update the Employees table with 49 out of 50 employees located in Zip code 2001.

update Employees set Zip = EmployeeID / 400 + 1 where EmployeeID % 50 = 1

update Employees set Zip = 2001 where EmployeeID % 50 != 1

go

update statistics Employees with fullscan

go

 

Let’s create the temporary table #FireDrill with limited Zip codes.

drop table #FireDrill

go

create table #FireDrill (Zip int primary key)

insert into #FireDrill select distinct Zip

      from Employees where Zip between 1800 and 2001

update statistics #FireDrill with fullscan

go

 

Let’s execute the query serially with MAXDOP 1.

--Example provided by www.sqlworkshops.com

--Execute query with uneven Zip code distribution

--Serially with MAXDOP 1

set statistics time on

go

declare @EmployeeID int, @EmployeeName varchar(48),@zip int

select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e

      inner join #FireDrill fd on (e.Zip = fd.Zip)

      order by e.Zip

option (maxdop 1)

go

The query took 989 ms to complete.

 

The execution plan shows the 77816 KB of memory was granted while the estimated rows were 785594.

No Sort Warnings in SQL Server Profiler.

 

Now let’s execute the query in parallel with MAXDOP 0.

--Example provided by www.sqlworkshops.com

--Execute query with uneven Zip code distribution

--In parallel with MAXDOP 0

set statistics time on

go

declare @EmployeeID int, @EmployeeName varchar(48),@zip int

select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e

      inner join #FireDrill fd on (e.Zip = fd.Zip)

      order by e.Zip

option (maxdop 0)

go

The query took 1799 ms to complete.

 

The execution plan shows the 79360 KB of memory was granted while the estimated rows were 785594.

 

Sort Warnings in SQL Server Profiler.

 

 

The estimated number of rows between serial and parallel plan are the same. The parallel plan has slightly more memory granted due to additional overhead.

 

Intermediate Summary: The reason for the higher duration with parallel plan even with limited amount of Zip codes was sort spill. This is due to uneven distribution of employees over Zip codes, especially concentration of 49 out of 50 employees in Zip code 2001.

 

Now let’s update the Employees table and distribute employees evenly across all Zip codes.

update Employees set Zip = EmployeeID / 400 + 1

go

update statistics Employees with fullscan

go

Let’s execute the query serially with MAXDOP 1.

--Example provided by www.sqlworkshops.com

--Execute query with uneven Zip code distribution

--Serially with MAXDOP 1

set statistics time on

go

declare @EmployeeID int, @EmployeeName varchar(48),@zip int

select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e

      inner join #FireDrill fd on (e.Zip = fd.Zip)

      order by e.Zip

option (maxdop 1)

go

The query took 250  ms to complete.

 

The execution plan shows the 9016 KB of memory was granted while the estimated rows were 79973.8.

 

No Sort Warnings in SQL Server Profiler.

 

Now let’s execute the query in parallel with MAXDOP 0.

 

--Example provided by www.sqlworkshops.com

--Execute query with uneven Zip code distribution

--In parallel with MAXDOP 0

set statistics time on

go

declare @EmployeeID int, @EmployeeName varchar(48),@zip int

select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e

      inner join #FireDrill fd on (e.Zip = fd.Zip)

      order by e.Zip

option (maxdop 0)

go

The query took 85 ms to complete.

 

The execution plan shows the 13152 KB of memory was granted while the estimated rows were 784707.

 

No Sort Warnings in SQL Server Profiler.

 

 

Here you see, parallel query is much faster than serial query since SQL Server is using Bitmap Filtering to eliminate rows before the hash join.

 

Parallel queries are very good for performance, but in some cases it can hinder performance. If one identifies the reason for these hindrances, then it is possible to get the best out of parallelism. I covered many aspects of monitoring and tuning parallel queries in webcasts (www.sqlworkshops.com/webcasts) and articles (www.sqlworkshops.com/articles). I suggest you to watch the webcasts and read the articles to better understand how to identify and tune parallel query performance issues.

 

Summary: One has to avoid sort spill over tempdb and the chances of spills are higher when a query executes in parallel with uneven data distribution. Parallel query brings its own advantage, reduced elapsed time and reduced work with Bitmap Filtering. So it is important to understand how to avoid spills over tempdb and when to execute a query in parallel.

 

I explain these concepts with detailed examples in my webcasts (www.sqlworkshops.com/webcasts), I recommend you to watch them. The best way to learn is to practice. To create the above tables and reproduce the behavior, join the mailing list at www.sqlworkshops.com/ml and I will send you the relevant SQL Scripts.

 

Register for the upcoming 3 Day Level 400 Microsoft SQL Server 2008 and SQL Server 2005 Performance Monitoring & Tuning Hands-on Workshop in London, United Kingdom during March 15-17, 2011, click here to register / Microsoft UK TechNet.These are hands-on workshops with a maximum of 12 participants and not lectures. For consulting engagements click here.

 

Disclaimer and copyright information:
This article refers to organizations and products that may be the trademarks or registered trademarks of their various owners.

Copyright of this article belongs to R Meyyappan / www.sqlworkshops.com. You may freely use the ideas and concepts discussed in this article with acknowledgement (www.sqlworkshops.com), but you may not claim any of it as your own work.

This article is for informational purposes only; you use any of the suggestions given here entirely at your own risk.

 

Register for the upcoming 3 Day Level 400 Microsoft SQL Server 2008 and SQL Server 2005 Performance Monitoring & Tuning Hands-on Workshop in London, United Kingdom during March 15-17, 2011, click here to register / Microsoft UK TechNet.These are hands-on workshops with a maximum of 12 participants and not lectures. For consulting engagements click here.

 

R Meyyappan rmeyyappan@sqlworkshops.com

LinkedIn: http://at.linkedin.com/in/rmeyyappan

 

Parallel Sort and Merge Join – Watch out for unpredictability in performance

When SQL Server sorts or does merge join in parallel the query performance highly depends on other concurrent CPU intensive activities taking place on the server.

 

In the below example you will see that when we execute a query with MAXDOP 0 it completes in less than a second. When we have a CPU intensive query executing over a CPU core, the above query with MAXDOP 0 takes 42 seconds. In reality it need not just be 42 seconds, it could be many minutes or hours depending on data volume.

 

Enough explanation, let’s see a real world example.

 

The best way to learn is to practice. To create the below tables and reproduce the behavior, join the mailing list by using this link: www.sqlworkshops.com/ml and I will send you the table creation script.

 

Let’s execute the query with MAXDOP 0:

 

The reason I am using convert is to trick the optimizer to grant additional memory. The memory estimated by the optimizer without convert is not enough to perform an in-memory sort. The sort spills to tempdb without the convert leading to non optimal performance. You can learn more about this from my webcasts or watching my SQL Bits V session at http://go.microsoft.com/?linkid=9712350 or reading the MSSQL Tip based on our webcast, http://www.mssqltips.com/tip.asp?tip=1955 (written by Matteo Lorini).

 

set statistics time on

go

 

--Example provided by www.sqlworkshops.com

declare @order_date datetime, @order_status int, @order_id int,

      @order_description varchar(4000)

select @order_date = order_date, @order_status = order_status,

            @order_id = order_id, @order_description = convert(varchar(4000), order_description)

      from Orders

      where order_date between '2005-01-01' and '2005-06-30'

      order by order_id

      option (maxdop 0)

go

--Query takes less than a second

 

Now let’s execute a CPU intensive query over another session:

--Example provided by www.sqlworkshops.com

--Execute this CPU intensive query on a different session

declare @i int

while 1=1

select @i = count(*) from sys.objects option (maxdop 1)

go

 

 

Now let’s execute the above query with MAXDOP 0:

--Example provided by www.sqlworkshops.com

declare @order_date datetime, @order_status int, @order_id int,

      @order_description varchar(4000)

select @order_date = order_date, @order_status = order_status,

            @order_id = order_id, @order_description = convert(varchar(4000), order_description)

      from Orders

      where order_date between '2005-01-01' and '2005-06-30'

      order by order_id

      option (maxdop 0)

go

 

--Query takes about 42 second

--How many seconds did it take on your server ?

The reason for the performance issue is the way SQL Server implements merging or order preserving exchanges and SQL Server’s CPU/thread scheduling architecture. You can see our ‘Gather Streams’ has ‘Order By’, the sorted output of the parallel thread has to be merged by ‘Gather Streams’ maintaining the sorted order.

 

Every parallel thread supplies a page of sorted rows (packet) and the ‘Gather Streams’ merges these pages. Once ‘Gather Streams’ runs out of rows from a thread it waits for the next page of sorted rows from that thread. If that thread (which has to supply the next page of sorted rows) is executing on a CPU core that is busy executing a CPU intensive query, then the thread has to wait up to 4ms to supply the next page of sorted rows.

 

Duration depends on how many CPU cores you have, with more cores, each thread sorts less rows.

 

4ms is the scheduling quantum in SQL Server. Refer to the article www.sqlworkshops.com/cpupressure I wrote a while ago and the webcasts for additional details.

 

Some of you might think well in that case I will not execute queries in parallel, will set MAXDOP to 1. There are cases where you have to use all the CPU cores (return on investment (ROI), Maximizing your existing hardware CPU, memory and disk); you cannot tell your customers ‘well everyone has to wait while I execute this report using 1 CPU core keeping the rest of the CPU cores idle for the next hour’. You have to find creative ways to resolve these issues. One way to solve the issue and bring back query performance predictability is to partition the workload; I discuss this in my webcast (www.sqlworkshops.com/webcasts).

 

This issue is just not limited to Sort; below you can see SQL Server chooses to execute the query in parallel using merge join.

 

The query executes faster with merge join when the CPU intensive query is not executing:

 

The query does parallel merge join:

 

When we force hash join, the query performance is similar:

 

The query does parallel hash join:

 

Let's execute the CPU intensive query on a CPU core:

 

 

Now the Merge join query is very slow, takes in my case 19 seconds:

 

In the execution plan you can see merge join:

 

Now when we force a hash join, the query is fast.

 

In the execution plan you can see hash join:

 

This is because hash join does not suffer from CPU/thread scheduling architecture like merge join. Parallel hash join suffers from memory pressure due to uneven row distribution, similar to parallel sort, which I explain in my webcasts.

 

In a situation like above, when the cost of parallel merge and parallel hash are close, one can choose hash (with an hint) as it gives performance predictability. And again the other possibility is to partition the workload like I discuss in the webcasts.

 

I explain these concepts with detailed examples in my webcasts (www.sqlworkshops.com/webcasts), I recommend you to watch them. The best way to learn is to practice. To create the above tables and reproduce the behavior, join the mailing list at www.sqlworkshops.com/ml and I will send you the relevant SQL Scripts.

 

Disclaimer and copyright information:
This article refers to organizations and products that may be the trademarks or registered trademarks of their various owners.

Copyright of this article belongs to R Meyyappan / www.sqlworkshops.com. You may freely use the ideas and concepts discussed in this article with acknowledgement (www.sqlworkshops.com), but you may not claim any of it as your own work.

This article is for informational purposes only; you use any of the suggestions given here entirely at your own risk.

 

Ramesh Meyyappan (http://www.sqlworkshops.com/instructor) is a Microsoft SQL Server specialist with expertise in Performance Monitoring, Tuning & Troubleshooting. Ramesh conducts workshops (http://www.sqlworkshops.com/schedule) on SQL Server 2008 and SQL Server 2005 Performance Monitoring & Tuning. Ramesh provides onsite consulting services in Europe and also offsite worldwide. You can contact Ramesh at http://www.sqlworkshops.com/contacts.

 

In London, during November 17-19, 2010 3 Day Level 400 Microsoft SQL Server 2008 and SQL Server 2005 Performance Monitoring & Tuning Workshop

We are organizing the 3 Day Truly Level 400 Microsoft SQL Server 2008 and SQL Server 2005 Performance Monitoring & Tuning Workshop in London, United Kingdom during November 17-19, 2010.

 

Let us know if you or one of your colleagues is interested in participating in the workshop.

 

You can register for the workshop with this link: http://www.sqlworkshops.com/ruk. Past workshop feedbacks are available at www.sqlworkshops.com/feedbacks.

 

Ramesh Meyyappan
www.sqlworkshops.com

Posted by sqlworkshops | with no comments

Flying with Plan Freezing – Mostly you experience thrust or stay afloat, but rarely this add-on can be a drag

Plan freezing is an interesting feature, targeting plan stability, performance predictability. We all want stability, especially during the good times. There are exceptions, there are always exceptions, that’s what software testing is all about, there is no software without flaws, you have to learn to identify and navigate around the flaws. Most of the time these flaws will be eventually corrected (sometimes they do get marked ‘By Design/Feature’!), but a mission critical application can’t wait for the service pack or hotfix, it needs to perform today, now!

 

As I mentioned in the previous article ‘Prefetch – Querying at the speed of SAN’, plan freezing will not help you with estimation. SQL Server will still estimate based on current set of parameter values (compile time parameter values) and not based on the plan you use for freezing.

 

The best way to learn is to practice. To create the below tables and reproduce the behavior, join the mailing list by using this link: www.sqlworkshops.com/ml and I will send you the table creation script.

 

Optimized Bitmap Filtering is an interesting feature, if you execute the below query with MAXDOP 1, Bitmap Filtering will not happen and the query will consume more CPU resource.

 

You have to execute all the below example queries with parallelism enabled (with MAXDOP 0 or > 1). I did not use hint OPTION (MAXDOP 0) as my SQL Server Configuration Parameter ‘max degree of parallelism’ is set to 0.

 

Here is a query that uses the Optimized Bitmap Filter (that’s new in SQL Server 2008):

 

set statistics time on

go

 

--Example provided by www.sqlworkshops.com

declare @i int

select @i = f1.c1 from Fact1 f1

      inner join Dim1 d1 on (f1.c2 = d1.c2)

      inner join Dim2 d2 on (f1.c3 = d2.c2)

      where d1.c3 between 100000 and 110000 and d2.c3 between 200000 and  210000

go

 

--The above query takes 201ms of CPU

 

 

--There are 2 Optimized Bitmap Filters

--eliminating non qualifying rows from

--table Dim1(Fact1.c2) and Dim2(Fact1.c3)

 

--Hash join bottom input has only 63 rows

 

You can test Plan Freezing this with sp_create_plan_guide or sp_create_plan_guide_from_handle or simply with USE PLAN hint. We will use ‘USE PLAN’ hint.

 

You need to get the XML plan from the above query and then use than XML Plan in the below query. Make sure you disable 'Include Actual Execution Plan' in SQL Server Management Studio to get the XML plan.

 

--Example provided by www.sqlworkshops.com

--You need to turn off 'Include Actual Execution Plan'

--or Graphical plan and then execute

--set set statistics xml on

--to get the xml plan

--in SQL Server Management Studio.

 

set statistics xml on

go

 

--Example provided by www.sqlworkshops.com

--let’s execute the above query again

declare @i int

select @i = f1.c1 from Fact1 f1

      inner join Dim1 d1 on (f1.c2 = d1.c2)

      inner join Dim2 d2 on (f1.c3 = d2.c2)

      where d1.c3 between 100000 and 110000 and d2.c3 between 200000 and  210000

go

 

Right click on the XML Plan, do not left click, right click and choose Copy and paste the XML Plan in the OPTION (USE PLAN ‘<...XMLPlan...>’) clause between ‘<...XMLPlan...>’.

 

--Example provided by www.sqlworkshops.com

--Cut and paste the above xml plan here

declare @i int

select @i = f1.c1 from Fact1 f1

      inner join Dim1 d1 on (f1.c2 = d1.c2)

      inner join Dim2 d2 on (f1.c3 = d2.c2)

      where d1.c3 between 100000 and 110000 and d2.c3 between 200000 and  210000

      option (use plan '<...XMLPlan...>')

go

 

--The above query takes 331ms of CPU

 

 

--Notice There is only one Bitmap Filter

--not the 2008 Optimized Bitmap Filter

--eliminating non qualifying rows only

--from table Dim2(c3) and not from

--table Dim1(c2)

--Hash join bottom input has only 212835 rows

--to join instead of 63 rows like before

 

Duration may not be much different in our example even though we are joining less rows. This is because we have not that many rows like a real world OLAP database and we are using integer columns for the join and not large columns.

 

 

Remember the last time, you found something good, you liked it, you froze it and then it didn’t taste as good as it was before. Well this can happen with real world SQL Server queries as well, keep your eyes open and keep learning.

 

I recommend you to watch my webcasts (www.sqlworkshops.com/webcasts). The best way to learn is to practice. To create the above tables and reproduce the behavior, join the mailing list at www.sqlworkshops.com/ml and I will send you the relevant SQL Scripts.

 

Disclaimer and copyright information:
This article refers to organizations and products that may be the trademarks or registered trademarks of their various owners.

Copyright of this article belongs to R Meyyappan / www.sqlworkshops.com. You may freely use the ideas and concepts discussed in this article with acknowledgement (www.sqlworkshops.com), but you may not claim any of it as your own work.

This article is for informational purposes only; you use any of the suggestions given here entirely at your own risk.

 

Ramesh Meyyappan (http://www.sqlworkshops.com/instructor) is a Microsoft SQL Server specialist with expertise in Performance Monitoring, Tuning & Troubleshooting. Ramesh conducts workshops (http://www.sqlworkshops.com/schedule) on SQL Server 2008 and SQL Server 2005 Performance Monitoring & Tuning. Ramesh provides onsite consulting services in Europe and also offsite worldwide. You can contact Ramesh at http://www.sqlworkshops.com/contacts.

 

 

Posted by sqlworkshops | with no comments

Prefetch – Querying at the speed of SAN

Prefetch is a mechanism with which SQL Server can fire up many I/O requests in parallel for Nested Loop join.

 

The SAN administrator says your data volume has a throughput capacity of 400MB/sec. But your long running query is waiting for I/Os (PAGEIOLATCH_SH) and Windows Performance Monitor shows your data volume is doing 4MB/sec. Where is the problem?

 

When SQL Server does a Nested Loop join, it might enable Prefetch. Prefetch is a functionality where SQL Server fires up many I/O requests in parallel (many outstanding I/Os). Prefetch will lead to better performance for Nested Loop join queries when there are lots of rows in the outer input table. 25 is the magic number. When SQL Server estimates less than 25 rows for the outer input table Prefetch will be disabled. And when more than 25 rows are estimated Prefetch will be enabled.

 

This is a perfect design (yes, I am an ex-Microsoft employee!); SQL Server wants to avoid Prefetch for light weight queries, especially in an OLTP environment as too many queries doing Prefetch will hurt performance. I remember from my days at SQL Server Development team, OLTP benchmarks including latest TPC-E disables Prefetch with trace flag 8744. Write me an email and I will send you the link.

 

Why this is a problem: I observe this at many customers; plan is cached with Prefetch disabled because first execution resulted in less than 25 rows estimation for the outer input table. Then the plan is used for parameters that are resulting in more than 25 rows, actually 1000s of rows for the outer input table. In this case since SQL Server cached the plan, Prefetch is disabled and SQL Server is firing up 1 I/O at a time. This leads to poor utilization of SAN. Symptoms: Query is waiting for I/Os (PAGEIOLATCH_SH) for a long time (reading page after page), but disk queue length is never more than 1.

 

The best way to learn is to practice. To create the below tables and reproduce the behavior, join the mailing list by using this link: www.sqlworkshops.com/ml and I will send you the table creation script.

 

--Example provided by www.sqlworkshops.com

--Create procedure that pulls orders based on City

drop proc RegionalOrdersProc

go

create proc RegionalOrdersProc @City char(20)

as

begin

declare @OrderID int, @OrderDetails char(200)

select @OrderID = o.OrderID, @OrderDetails = o.OrderDetails

      from RegionalOrders ao inner join Orders o on (o.OrderID = ao.OrderID)

      where City = @City

end

go

 

set statistics time on

go

--Example provided by www.sqlworkshops.com

--Execute the procedure with parameter SmallCity1

exec RegionalOrdersProc 'SmallCity1'

go

 

--Right click on Clustered Index Scan

--to look at the properties

--Estimated number of rows in the

--outer input table is less than 25

 

--Right click on Nested Loops

--to look at the properties

--Notice Prefetch is disabled (missing)

 

--Clear data cache

checkpoint

dbcc dropcleanbuffers

go

 

Think about a business running a billing job looping over thousands of their customers using stored procedure. If the first customer has placed less than 25 orders for the billing period then Prefetch will be disabled and every customer is limited to using 1 spindle at a time.

 

 

--Example provided by www.sqlworkshops.com

--Execute the procedure with parameter BigCity

--We are using cached plan

exec RegionalOrdersProc 'BigCity'

go

 

--Estimated number of rows in the

--outer input table is less than 25

 

--Notice Prefetch is disabled (missing)

 

--Query execution time 7 seconds

--What was the execution time?

 

--Clear procedure cache

--to trigger a new optimization

dbcc freeproccache

go

 

--Example provided by www.sqlworkshops.com

--Execute the procedure with parameter SmallCity2

exec RegionalOrdersProc 'SmallCity2'

go

 

--Clear data cache

checkpoint

dbcc dropcleanbuffers

go

 

--Example provided by www.sqlworkshops.com

--Execute the procedure with parameter SmallCity

--We are using cached plan

exec RegionalOrdersProc 'BigCity'

go

 

--Estimated number of rows in the

--outer input table is greater than 25

 

--Notice Prefetch is enabled

 

--Query execution time 3 seconds

--On a SAN it was less than a second

--What was the execution time?

 

--Example provided by www.sqlworkshops.com

--You can fix the issue by using any of the following

--hints

--Create procedure that pulls orders based on City

drop proc RegionalOrdersProc

go

create proc RegionalOrdersProc @City char(20)

as

begin

declare @OrderID int, @OrderDetails char(200)

select @OrderID = o.OrderID, @OrderDetails = o.OrderDetails

      from RegionalOrders ao inner join Orders o on (o.OrderID = ao.OrderID)

      where City = @City

--Hinting optimizer to use SmallCity2 for estimation

      option (optimize for (@City = 'SmallCity2'))

--Hinting optimizer to estimate for the currnet parameters

      --option (recompile)

--Hinting optimize not to use histogram rather

--density for estimation (average of all 3 cities)

      --option (optimize for (@City UNKNOWN))

      --option (optimize for UNKNOWN)

end

go

 

--Example provided by www.sqlworkshops.com

--Execute the procedure with parameter SmallCity1

exec RegionalOrdersProc 'SmallCity1'

go

 

--Clear data cache

checkpoint

dbcc dropcleanbuffers

go

 

--Example provided by www.sqlworkshops.com

--Execute the procedure with parameter BigCity

--We are using cached plan

exec RegionalOrdersProc 'BigCity'

go

 

--Notice Prefetch is enabled

--Estimated number of rows in the

--outer input table is greater than 25

--Query execution time 3 seconds

--On a SAN it was less than a second

--What was your execution time?

 

Some of you might think plan guides or plan freezing might solve this issue, well it won’t help here. The estimation from your plan guide plan will be ignored by the optimizer. Optimizer makes new estimation based on current set of parameter values (compile time parameter values).

 

I explain these concepts with detailed examples in my webcasts (www.sqlworkshops.com/webcasts), I recommend you to watch them. The best way to learn is to practice. To create the above tables and reproduce the behavior, join the mailing list at www.sqlworkshops.com/ml and I will send you the relevant SQL Scripts.

 

If you search for trace flag 8744 you will end up with a KB Article 920093 titled: Tuning options for SQL Server 2005 and SQL Server 2008 when running in high performance workloads. There it is explained: Trace flag 8744 disables pre-fetching for the Nested Loops operator (with caution). Even though the topic looks interesting ‘Tuning options for high performance workloads’, don’t try these in production. From my point of view this article is about documenting undocumented trace flags used in benchmarks, let’s leave it at that!

 

Disclaimer and copyright information:
This article refers to organizations and products that may be the trademarks or registered trademarks of their various owners.

Copyright of this article belongs to R Meyyappan / www.sqlworkshops.com. You may freely use the ideas and concepts discussed in this article with acknowledgement (www.sqlworkshops.com), but you may not claim any of it as your own work.

This article is for informational purposes only; you use any of the suggestions given here entirely at your own risk.

 

Ramesh Meyyappan (http://www.sqlworkshops.com/instructor) is a Microsoft SQL Server specialist with expertise in Performance Monitoring, Tuning & Troubleshooting. Ramesh conducts workshops (http://www.sqlworkshops.com/schedule) on SQL Server 2008 and SQL Server 2005 Performance Monitoring & Tuning. Ramesh provides onsite consulting services in Europe and also offsite worldwide. You can contact Ramesh at http://www.sqlworkshops.com/contacts.

 

 

June 22-24, 2010 in London City Level 400 SQL Server Performance Monitoring & Tuning Workshop

We are organizing the “3 Day Level 400 SQL Server Performance Monitoring & Tuning Workshop” for the 1st time in London City during June 22-24, 2010.
Agenda is located @ www.sqlworkshops.com/workshops & you can register @ www.sqlworkshops.com/ruk. Charges: £ 1800 (5% discount for those who register before 21st May, £ 1710).

In this 3 Day Level 400 hands-on workshop, unlike short SQLBits sessions, we go deeper on the tuning topics. Not sure if this will be a good use of your time & money? Watch our webcasts @ www.sqlworkshops.com/webcasts.
We are trying to balance these commercial offerings with our free community contributions. Financially: These workshops are essential for us to stay in business!

Feedback from Finland workshop posted by Jukka, Wärtsilä Oyj on February 23, 2010 to the LinkedIn SQL Server User Group Finland (more feedbacks @ www.sqlworkshops.com/feedbacks):
Just want to start this thread and give some feedback on the Workshop that I attended last week at Microsoft.
Three days in a row, deep dive into the query optimization and performance monitoring :-) I must say, that the SQL guru Ramesh has all the tricks up in his sleeves.
The workshop was very helpful and what's most important: no slide show marathon: samples after samples explained very clearly and with our own class room SQL servers we can try the same stuff while Ramesh typed his own samples.
If the workshop will be rearranged, I can most willingly recommend it to anyone who wants to know what's "under the hood" of SQL Server 2008.
Once again, thank you Microsoft and Ramesh to make this happen. May the force be with us all :-)

Hope to see you @ the Workshop. Feel free to pass on this information to your SQL Server colleagues.

-ramesh-
www.sqlbits.com/speakers/r_meyyappan/default.aspx

More Posts Next page »