Conditional page breaks in SSRS are always something developer needs to work on in order to get it done. This is especially true when you need to render the same report in PDF.

Recently I need to implement the same and did that with a quick trick. Here is the sample SELECT I am using to demonstrate the same.


select 'Record 1' as record, 0 as PageBreak, 1 as grp
UNION ALL
select 'Record 2' as record, 0 as PageBreak, 1 as grp
UNION ALL
select 'Record 3' as record, 0 as PageBreak, 1 as grp
UNION ALL
select 'Record 4' as record, 1 as PageBreak, 2 as grp
UNION ALL
select 'Record 5' as record, 0 as PageBreak, 3 as grp
UNION ALL
select 'Record 6' as record, 0 as PageBreak, 3 as grp
UNION ALL
select 'Record 7' as record, 1 as PageBreak, 4 as grp
UNION ALL
select 'Record 8' as record, 0 as PageBreak, 5 as grp

 "Record" column is just to show unique records. The most important column here is "PageBreak", which is setting the conditional page break with "1". That means whenever there is "1" in "pagebreak" column you need a page break otherwise keep on displaying the records on same page. 

The other very important column is "grp". Well, you really don't need this column and still will be able to implement conditional page break functionality. But then SSRS will club all the records which requires page break into 1 grouping set, probably in the end with this sample data. So, in order to maintain the sequence of records we need to add a grouping column which will create a new group whenever it finds a different value in "pagebreak" column.

Now, lets move on to SSRS report. These steps will work in SSRS 2008. I am not adding standard steps of creating a data source and data set. I know you are all smart guys...

Simple enough, please follow these steps.

  1. Add a table with 2 columns, record and pagebreak.
  2. Now run the report and you will see all the records are coming in same page.
  3. Next, add a parent group to this row, parent to "Details" group in "Row Groups". Please check "Image 1".
  4. In the group expression select the "grp" column and the sorting, select "grp" in Sort by. Please check "Image 2".
  5. MOST Important, select "Between each instance of a group" of the newly create group under "Page Break" tab. Please check "Image 3".

Yep, that's it, run the report now. And you will see 3 records on page 1, 1 record on page 2, 2 records on page 3 and 1 record on page 4 and page 5.


image

Image 1


image

Image 2


image

Image 3


There are also other ways of doing the same, but I find this one very easy to implement.

Don't forget to hide new group by just minimizing the width and you are good to go. Try to render the same in PDF, and it works like a charm....

 

- Mohit Nayyar

In some cases we need to generate the data based on a starting date, to an end data (today-getdate()). This process is called data extrapolation based on limited information (starting date). Here I am trying to achieve the same without storing data in temp./staging tables.

First I am creating a temp. table wih 2 rows for Demo and now I need to generate rest of the rows starting from given date to today's date.

Next step is to make use of CTE to generate sequence of rows and finally making use of CROSS JOIN to join the sequence with real data and get the desired output.

CREATE TABLE #TEMP (
  MYNAME    VARCHAR(10),
  STARTDATE DATETIME)

GO

INSERT INTO #TEMP
VALUES     ('Mohit',
            '2008-09-15')

INSERT INTO #TEMP
VALUES     ('Nayyar',
            '2008-09-22')

GO

WITH FIRST2(SEQ)
     AS (SELECT 1
         UNION ALL
         SELECT 1),
     FIRST4(SEQ)
     AS (SELECT 1
         FROM   FIRST2 X
                CROSS JOIN FIRST2 Y),
     FIRST16(SEQ)
     AS (SELECT 1
         FROM   FIRST4 X
                CROSS JOIN FIRST4 Y),
     FIRST256(SEQ)
     AS (SELECT 1
         FROM   FIRST16 X
                CROSS JOIN FIRST16 Y),
     FIRST65536(SEQ)
     AS (SELECT 1
         FROM   FIRST256 X
                CROSS JOIN FIRST256 Y),
     SEQROWS
     AS (SELECT ROW_NUMBER
()
                  OVER(ORDER BY SEQ) AS SEQNUMBER
         FROM   FIRST65536 WITH (NOEXPAND))
        
SELECT   MYNAME,
         STARTDATE,
         DATEADD
(DAY,N.SEQNUMBER - 1,STARTDATE)  AS EXTRAPOLATION
FROM     #TEMP
         CROSS JOIN SEQROWS N
WHERE    N.SEQNUMBER <= DATEDIFF
(DAY,STARTDATE,GETDATE())
ORDER BY MYNAME,
         STARTDATE,
         3
GO

DROP TABLE #TEMP
GO 
 

 

- Mohit Nayyar

I recently found really useful information on temp. table and table variable usage. I am sure this will be useful to SQL Community as well.

http://www.sqlservercentral.com/articles/63472/

 

- Mohit 

 

 

While loading data in Fact tables we usually see a scenario where the fact data is available but there is no corresponding business key in the related dimension.

In this case we choose multiple options to resolve the issue.

  1. Ignore that fact
  2. Insert the associated business key in dimension table and return the newly generated surrogate key from dimension table. And now store the data in Fact table with the surrogate key.

The second approach relates to a term called “Inferred members”. All the other attributes of that dimension will also get updated in next run of dimension load (usually nightly load).

In SSIS there are multiple options available to implement the second case.

First approach is to do lookup on the dimension table and for all the rows that are now matching, insert the business key in Dimension table and then do the lookup again to get the surrogate key.

SSIS Inferred Members1 Mohit Nayyar.GIF


Second approach is to make use of Lookup and Script component. Lookup component will ignore rows with no matching business key in dimension table. Then script component will process only those rows where it didn’t find the associated surrogate key and finally insert the same in dimension table and return the associated surrogate key through stored procedure output parameter.

SSIS Inferred Members2 Mohit Nayyar.GIF


This script component approach is more efficient because its using the existing lookup component only once and then doing all the processing in script component.

But the additional benefit comes if we make use of .Net Generic.SortedDictionary class to store the cache information regarding the newly generated key. Read more about this here…
http://msdn.microsoft.com/en-us/library/f7fta44c(VS.80).aspx

SSIS Inferred Members3 Mohit Nayyar.GIF

 

- Mohit

 

Sometime its not feasible to process the full cube, so it makes sense to only process the selective measure group in the cube.

 

This is especially true when you have really big cube which takes time to process and sometime we just need to load data for only few business metrics.

 

Here is the small example to process only selective measure groups using “SSIS Analysis Services Processing Task”.

 

I am assuming that the measure group names are stored in a variable and after that we need to create “Processing Commands” to process the cube.

 

Processing Commands: We need to set this property in “Analysis Services Processing Task” to process the measure groups. So I will be using “Script Task” to generate the command.

 

Just make sure that the “Delay Validation” property of “Analysis Services Processing Task” is set to “True”.

 

Follow these steps now……..

 

  1. Get comma separated list of all measure groups needs to be processed in local SSIS variable let say “varMeasureGroups”. You can get this list from some sort of configuration, so no hard-coding ;)
  2. Creates the command using “Script Task” and store the final command in local SSIS variable let say “varCubeCommand”
  3. In “Analysis Services Processing Task” task set the property “Processing Commands” using expressions to the recently created variable named “varCubeCommand”.
  4. That’s it………no need to process the full cube now………

 

 

     Public Sub Main()

        Try

            Dim strSplitMeasureGroup As String(), i As Integer, strCubeCommand As String

 

            strSplitMeasureGroup = Split(Dts.Variables("User::varMeasureGroups").Value.ToString, ", ")

            strCubeCommand = "<Batch xmlns=""http://schemas.microsoft.com/analysisservices/2003/engine"">"

 

            For i = 0 To strSplitMeasureGroup.Length - 1

                strCubeCommand = strCubeCommand & "<Process xmlns:xsd=""http://www.w3.org/2001/XMLSchema"" "

                strCubeCommand = strCubeCommand & "xmlns:xsi=""http://www.w3.org/2001/XMLSchema-instance"">" & Chr(13)

                strCubeCommand = strCubeCommand & "<Object>" & Chr(13)

                strCubeCommand = strCubeCommand & "<DatabaseID>myCube</DatabaseID>" & Chr(13)

                strCubeCommand = strCubeCommand & "<CubeID>myCube</CubeID>" & Chr(13)

                strCubeCommand = strCubeCommand & "<MeasureGroupID>" & strSplitMeasureGroup(i) & _

                "</MeasureGroupID>" & Chr(13)

                strCubeCommand = strCubeCommand & "</Object>" & Chr(13)

                strCubeCommand = strCubeCommand & "<Type>ProcessFull</Type>" & Chr(13)

                strCubeCommand = strCubeCommand & "<WriteBackTableCreation>UseExisting</WriteBackTableCreation>" & Chr(13)

                strCubeCommand = strCubeCommand & "</Process>" & Chr(13)

            Next

            strCubeCommand = strCubeCommand & "</Batch>"

 

            Dts.Variables("User::varCubeCommand").Value = strCubeCommand

            Dts.Events.FireInformation(1, "", Dts.Variables("User::varCubeCommand").Value.ToString, "", 0, False)

 

            Dts.TaskResult = Dts.Results.Success

 

        Catch ex As Exception

 

            Dts.Events.FireInformation(1, "", ex.Message, "", 0, False)

            Dts.TaskResult = Dts.Results.Failure

 

        End Try

    End Sub

 

 

 

 - Mohit Nayyar

 

Posted by mohitnayyar | with no comments
Filed under: ,

Recently I found this issue of deadlock variables in SSIS and finally made two solutions

1. Instead of declaring variables in Script properties (ReadOnlyVariables / ReadWriteVariables), better to make use of Dts.VariableDispenser.LockForRead / Dts.VariableDispenser.LockForWrite in the script to lock variables

2. I also solved my problem by running child packages out-of-proc (ExecuteOutOfProcess=TRUE in Execute Package Task), this is more to do with script caching.

But recently I saw something strange and found out that one of my script is making use of this type of code and I am getting this error again.....

"The script threw an exception: A deadlock was detected while trying to lock variables "variable names (comma separated)" for read access and variables "variable names (comma separated)" for read/write access. A lock cannot be acquired after 16 attempts. The locks timed out."

    Dim var As Variables

 

    Dts.VariableDispenser.LockForWrite("User::VariableName")

  

    Dts.VariableDispenser.GetVariables(var)

  

      'Problem Line......Just remove the reference to Dts.Variables

      Dts.Variables("User::VariableName").Value = "Some Value"

 

      'Correct Line.......make use of locally declared "var" variable

      var("User::VariableName").Value = "Some Value"

   

   var.Unlock()

  

To solve the situation I made a quick fix and change my Dts.Variables to local variable collection I defined in the beginning "var".

As we can see in this script that we are locking one variable for write. But after using the GetVariables we MUST use "var" collection to write data and should NOT use Dts.Variables and that fixed my problem.

So I think this is some sort of double locking issue, like if I lock the variable using "Dts.VariableDispenser" and if I try to access the same variable again with Dts.Variables instead of "var" then SSIS tries to lock that again......

 

 - Mohit Nayyar

Posted by mohitnayyar | with no comments
Filed under: ,

I just saw this new query tool from SQL Answers.

 

The beauty of this tool is to directly generate the output of any query into PDF / HTML / Excel / Word / CSV format. Please take a look here…....

 

http://www.sqlanswers.com/Software/SAQ/Default.aspx

 

Mohit Nayyar SQL Query Tool

 

This seems to be a lightweight tool to query multiple databases platforms like Microsoft SQL Server, Oracle, DB2, Microsoft Access and MySQL, but I just tested this for Microsoft SQL Server 2005 and I am really happy with this.

 

This makes things really easy when you need to send the output in PDF or HTML without any extra effort.

 

On the similar lines Microsoft also offers a “SQL Server Web Data Administrator” tool, which is very useful because of its basic web interface. You can find this here…….

 

http://www.microsoft.com/Downloads/details.aspx?familyid=C039A798-C57A-419E-ACBC-2A332CB7F959&displaylang=en

 

 

- Mohit Nayyar 

Posted by mohitnayyar | with no comments
Filed under:

Today I saw a new product coming soon to do DTS to SSIS migration called DtsXchange

Looks promising and this is also capable to do something extra which built-in DTS migration doesn’t do like Dynamic Properties. Have a look here http://www.pragmaticworks.com/dtsxchange.htm

Users can also migrate existing DTS packages into SSIS using the migration wizard provided by SQL Server 2005 but this wizard doesn’t cover complete DTS package. So in some cases manual effort is required, and SSIS doesn’t support some of the DTS features, so the user has to manually implement the functionality in new SSIS package.

You can find some known issues in migration here http://technet.microsoft.com/en-us/library/ms143462.aspx

But on top of this SSIS allows you to run the existing DTS package even without any change, using a wrapper called “Execute DTS 2000 Package Task” in the new SSIS package.

So if you don’t have the bandwidth to migrate the existing package then keep on running the old DTS packages in SSIS and start making new packages in SSIS.

Obviously you will miss the “All NEW SSIS” but it’s an easy work around. But I think you will miss a lot, because SSIS is NOT a new name of DTS but it's a truly enterprise level ETL tool. So its always better to migrate to SSIS as soon as possible.


- Mohit Nayyar

Posted by mohitnayyar | with no comments

At some point of time we all have to fine tune our old SQL Server queries or some of our colleague’s queries ;)

 

So I thought of writing something about common query performance problems and a checklist which can be used a make sure we didn’t miss anything.

 

1.

Large Result Set

Common mistake to include * instead of specific column names takes first place in investigation for performance issue.

 

We should always use specific column required by the application instead of asking all available columns.

 

Using * get worst when there are image/text/binary or very large data

columns available in the table.

 

Making use of TOP / TOP Percent / SET ROWCOUNT is another smart move to restrict the data based on technical solution.

2.

Missing Where Clause

We should always include the specific criteria when asking for rows.

Making use of WHERE clause is very beneficial when we have appropriate indexes on the columns in WHERE and SELECT clause.

3.

Lack of useful statistics

This is again very important point, statistics are the key for query optimizer to make an execution plan and get the data out.

 

SQL Server automatically creates useful statistics, but just makes sure that “Auto Create Statistics” is set to “TRUE” in database properties.

4.

Out-of-date statistics

This is again very important point, statistics are the key for query optimizer to make an execution plan and get the data out.

 

SQL Server automatically creates useful statistics, but just makes sure that “Auto Update Statistics” is set to “TRUE” in database properties.

5.

Lack of useful indexes

Again Indexes are the key to do any data I/O related operation. So just make sure you have all required indexes in place.

 

Just make sure you have created required clustered / non-clustered indexes based on the business requirement and technical solution implementation.

 

And try to create indexes on smallest possible numeric columns with respect to indexing requirement. I recommend reading BOL to understand the indexes thoroughly.

6.

Missing Joins

Yeah, that’s again is a good point to check. I have seen developers who simply put statements like “select * from table1, table2, table3”.

 

Never do that, just make sure you have followed the proper join and I recommend ANSI joins instead of “=, *=, =*” type joins. They are easy to read/manage and sort of required for future SQL Server versions.

7.

Blocking

Transactions’ fighting for resources is nothing new in any RDBMS application but we need to make sure that make use of all possible options to avoid excessive blocking.

8.

Query / Table Hints

They can be good or bad, based on the scenario. But it’s recommended that we should not use them extensively unless we have excellent understanding of technical implantation and future business requirement and data load trends.

SQL Server takes care of this very well, so better don’t use.

9.

Use best possible data type

We are talking about database systems, so taking care of what we are trying to store is very important in designing the whole system.

 

We should always choose the best possible smallest data type based on the scenario. Using INT for storing something like number 2000 is not a good option.

10.

Index Fragmentation

Hmm, not only having indexes solves the whole problem, we need to make sure that indexes are not fragmentated as well.

 

DBCC SHOWCONTIG and some of the latest DMVs are very useful in checking the fragmentation. And then re-building or defragmenting the index will solve the problem easily.

11.

Cursors

We should use cursors only when there is no option left and that too using READ ONLY, FAST FORWARD is recommended. And please make sure that we should only ask for smallest result-set possible using WHERE clause and with specific columns name instead of using star (*) in cursor definition.

 

Otherwise cursors put extra overhead on SQL Server and make extensive use of TempDB.

12.

Extensive temp table (tempdb) usage

Big bulky queries with lots of ordering, grouping clauses which result in huge result set can make use of TempDB extensively.

 

So try to use simple, smart queries which generate small result set and if this is not possible then make sure TempDb is on fastest possible disk system with setting a optimum size for the TempDB based on the technical solution.

 

Using TABLE data type is another good option for storing temporary result set. CTE (Common Table Expression), new to Microsoft SQL Server 2005 can also be used to achieve the same to some extent.

13.

Triggers

Try to use Constraints instead of triggers, like for referential integrity task using triggers is not a good option and that can be easily achievable by using constraints.

 

Triggers are more resource expensive and they work as integral part of transaction, which will makes transaction lengthy as well.

14.

Dynamic SQL Statements

Using Views, stored procedures are always beneficial over using dynamic SQL. And using dynamic SQL is an open invitation to SQL injection attacks.

 

And when there is no option left then we should make use of sp_ExecuteSQL instead of using EXEC command.

15.

Lack of Views / Stored Procedures

Accessing direct table is not a good practice in any engagement, and a developer should always access data through views and stored procedures.

 

SP also provides additional benefits like modular programming, security later, saves network traffic, and faster execution. SQL Server 2005 also introduces a statement level re-compilation, which is again very useful for bulky procedures.

16.

Use Distinct clause

Deleting duplicate data itself is a resource intensive operation, so making use of DISTINCT clause should be minimal in all queries.

17.

Missing SET NOCOUNT ON statement

Setting this option will stop sending the rows affected message to the calling client, which itself is a network intensive operation and moreover client applications doesn’t make use of those messages.

18.

Using UNION clause

Again using UNION will delete the duplicate rows in the result set, which is a resource intensive operation, so using UNION ALL is preferred way to doing the same thing.

19.

Multiple Network trips

Making use of stored procedures, views and small result set can save lots of network traffic.

 

Instead of sending a lengthy query to SQL server we should send procedure name with parameters to SQL Server and then get the smallest possible result set by using all possible options I explained here.

20.

Lack of useful data striping

Data partition is the very useful option to reduce the size of the table. We can do the same using horizontal or vertical partition.

 

This way we can save a lot of IO overhead and other related resources. There are various options available in SQL Server 2005 to do the same stuff and developers can also make their custom solutions based on business requirement.

 

Remember, small tables are always faster than bulky ones.

21.

Front-end Application issues like looping statements

Sometimes things are going well at SQL server end but because of some problem at front-end application we assume that SQL server is not doing his job.

 

So it’s always better to test the query at SQL Server end and then checking network resources before blaming poor SQL Server.

22.

Third party applications like anti-virus or some other services like IIS

Sometimes resource overhead because of other application running on same SQL Server is the main cause of poor performance.

 

So we should always check the memory and CPU utilization of other applications on the server.

23.

Slow network communication

Network bandwidth plays a very vital role in whole solution and this should be tested thoroughly in the solution.

24.

Inadequate memory in the server computer or not enough memory available for SQL Server

Setting appropriate settings for SQL Server is an art; SQL Server will NOT utilize all memory available until asked to do so.

 

Using 3GB, PAE, AWE options are the key to memory utilization for Microsoft SQL Server.

25.

High Processor usage by other applications

Sometimes resource overhead because of other application running on same SQL Server is the main cause of poor performance.

 

So we should always check the memory and CPU utilization of other applications on the server.

26.

Incorrectly configured system parameters

Setting SQL Server system parameters is a highly skilled job, and in some case it’s an art. Things works well on one server but this doesn’t out be true on another server because of changed business/technical requirement.

 

So it’s always better NOT to change any default option and even if we need to change some option then please read thoroughly about the option and test on development server before putting the same on production server.

27.

Disk bottlenecks

Disk IO is the key to any RDBMS application, because all we have is data and lots of data. So make sure you have used best possible RAID along with fastest possible Disk IO.

 

- Mohit Nayyar

 

 

 

Posted by mohitnayyar | with no comments

Recently I was proposing some of the high availability options for Microsoft SQL Server 2005, so thought of summarizing the same here....

SQL Server 2005 provides 4 basic options for high availability.

 

1.    SQL Clustering

2.    Database Mirroring

3.    Transactional Replication

4.    Log Shipping

 

Other than these 4 options some people also consider backup-restore and Database snapshots as other high availability options, which is really not true.....

 

"Database snapshots can be used as a recovery option from user/application specific errors or for reporting purposes when combined with other technologies. A snapshot is simply a read-only view of a database at a particular point in time, that’s why called - Database snapshots"

So let’s see all of them in more detail...

I will evaluate each high availability option on these parameters

 

1.

Failover type

Automatic or Manual

2.

Downtime period

Time to recover or get back the system in place

3.

Availability Scope

Database level or Server level

4.

Special hardware

Any special hardware required

5.

Standby Type

Hot / Warm

6.

Complexity

Configuration/maintenance complexity

7.

Accessibility

Is it possible to check the other server or can we make use of that anyway

8.

Client Access

Client redirection, Automatic or Manual

9.

Additional servers

Any additional Microsoft SQL Server required

10.

Distance limit

How far the servers can sit

11.

Automatic Role Change

Is it possible for the server to resume work automatically or need manual intervention

12.

Multiple storage location

Multiple data storage locations

13.

Hardware Setup Cost

How expensive the whole system will be

14.

SQL Server Editions support

What all options are available with each edition of Microsoft SQL Server 2005

 

 

Now lets take a closer look at the evaluating the different options…….

 

 

 

 

SQL Clustering

Database Mirroring

Transactional Replication

Log Shipping

1.     

Failover type

Automatic / Manual

Automatic / Manual

Manual

Manual

2.     

Downtime period

30 seconds

30 seconds

Manual effort required

Manual effort required

3.     

Availability Scope

Server

Database

Database / DB Objects

Database

4.     

Special hardware

Yes

No

No

No

5.     

Standby Type

Hot

Hot

Warm

Warm

6.     

Complexity

Complex

Little Complex

Little Complex

Easy

7.     

Data Accessibility (other than primary server)

No

Possible

Possible

Possible

8.     

Client Access

Automatic redirection

Automatic redirection

Manual effort required

Manual effort required

9.     

Additional servers (including base server)

At least >= 2 Servers

At least >= 3 Servers (automatic failover)

At least >= 2 Servers

At least >= 2 Servers

10.   

Distance limit

100 Miles

No Limit but depends upon network bandwidth

No Limit but depends upon network bandwidth

No Limit but depends upon network bandwidth

11.   

Automatic Role Change

Yes

Yes

No

No

12.   

Multiple storage location

No

Yes

Yes

Yes

13.   

Hardware Setup Cost

High, special certified hardware required

Medium

Low

Low

14.   

SQL Server Editions support

SQL Server Enterprise and Standard (2 node only)

SQL Server Enterprise and Standard (Restricted)

SQL Server Enterprise, Standard and Workgroup

SQL Server Enterprise, Standard and Workgroup

 

 

-       Mohit Nayyar