I love Power BI; actually, I love Power Query. It's a great way to combine data from around your business, from across the web, from anywhere. What I really like is that very little is done automatically, i.e. it doesn't do the automatic data type detection you get with Excel, which screws up your imports when the data type in a column differs between the first few rows and the rest of the file.

Does that make it difficult to use? No, it doesn't. The nice thing is that it's very easy to add additional columns, change data types, use expressions, combine datasets, and do much, much more.

If you've ever used SSIS you will often find that the UI gets in the way of the experience. SSIS is really focused on a high-performance pipeline, and that bubbles far too much to the surface with the buffers and column mappings.

Let's walk through a simple example: splitting a column that contains a filename into the file name and the extension.

In SSIS

  1. Add a derived column component to your data flow
  2. Delete the existing connection between two data flow components
  3. Connect the upstream component to the derived column component
  4. Connect the downstream component to the derived column component
  5. Correct any mapping you had defined on the downstream component
  6. Open the derived column component
  7. Add a row for the new column
  8. Try to enter your expression to find the filename in the single text box (or, if you've downloaded BIDS Helper, use its expression editor)
  9. Make sure the length for the new field is going to be big enough
  10. Do the same to get the file extension
  11. Get some sample data
  12. Put in a data viewer
  13. Run the data flow to check the split is working correctly
  14. Click OK

How about in Power Query?

The wizard way (1)

  1. Click on the column you want to split
  2. Click the split column in the toolbar
  3. Specify the delimiter and how you want it split
  4. Click OK
  5. Check that the split is doing what you want

The wizard way (2)

  1. Click Add Custom Column
  2. Put the following in as the function (Column1 is the source column, Source is the source table, and Column1.1 and Column1.2 are the new column names):

Table.SplitColumn(Source,"Column1",Splitter.SplitTextByEachDelimiter({"."}, null, true),{"Column1.1", "Column1.2"})

  3. Click OK

The non-wizard way

  1. Open the Advanced editor for the query
  2. Add a new line where you want to add the column in the steps
  3. Add

SourceWithSplitColumn = Table.SplitColumn(Source,"Column1",Splitter.SplitTextByEachDelimiter({"."}, null, true),{"Column1.1", "Column1.2"})

  4. Change the line that was referencing the old variable to point to the new variable
  5. Click OK and view the changes

 

The real power of Power Query is that everything is just code; you don't have to deal with a bad UI, because the fallback is always a line of code. With SSIS, if something was awful in the UI you had to resort to the advanced editor, and that only worked if the component hooked everything up properly.

I was amazed at the library of functions available: JSON parsing, XML parsing, downloading from the web, reading from SQL and much, much more. Discussing this with Matt Masson at Ignite this month, it all made sense: ALL components are built using the M language. If they need to do something that they can't do with the language, they consider extending the language by adding more functions, which means it's then available to everyone.

Not seen Power Query? Where do you start?

Well, you can just jump in, create some queries, use the wizards and see what's generated. Quickly you'll want to be jumping into the advanced editor (well, I was), and with no IntelliSense you will stumble.

So once you've found your way around, maybe watch some videos and look at some samples.

https://www.sqlbits.com/Content/?type=3&tags=Power+Query

You really need to understand the basics of the language. I'd read the first few sections of the language specification (the first item below); it explains the syntax and is really useful. Then there are all the functions you will want; they are in the next two.

Happy reading

 

 

Posted by simonsabin | 1 comment(s)

I was very proud to be asked to speak at Ignite at the start of May. http://ignite.microsoft.com/speaker/speakermoreinfo/?speakerid=f0dfadd0-2498-e411-b87f-00155d5066d7.

I encounter much scepticism about continuous delivery for databases; moreover, many talks about continuous delivery invariably stumble when databases are mentioned. This often leads developers to consider schema-less solutions.

The talk goes through the reasons why anyone building software in today's world should be using CI and CD. What's more, I focus on two of the common misconceptions about CD.

First, that of security and compliance. How can you have an automated process for deploying your database and still meet your security and compliance needs? I show how the opposite is probably more the case: how can you possibly have a compliant and secure database deployment process that isn't continuous from source through to deployment?

Secondly, that of it not being possible. Many turn around and say that automation of database deployments isn't possible. Many reasons are touted: the tools don't support feature X, it's too slow, we need control over X, we need the upgrade to not take the database offline. Well, in the talk:

1. I'm using encryption functionality in Azure SQL Database that was only announced at Ignite, and thus the tooling doesn't support it.

2. I do a large refactoring that replaces a table with a view and a trigger, along with populating new columns.

3. All of this is done whilst an application is hitting the database, and the application continues with no problems.

As well as the Azure preview features, the talk is built around Visual Studio Online and Release Management, which provide the source control, review, build, test and deployment mechanisms, and a workflow for the whole process with approval at each stage.

In the session I do three full deployments to dev and production, through the full lifecycle from check-in to deployment. Given that, plus the use of Azure and the use of preview features, I was very glad when all the demos worked.

If you are in any doubt that CD is possible for databases then watch the video.

http://channel9.msdn.com/Events/Ignite/2015/BRK3552

You can find the PowerPoint by registering on Yammer and going to https://www.yammer.com/microsoftignite/#/Threads/show?threadId=537321578

Enjoy

Anyone working on SQL Server will have banged their head against a wall at one point or another trying to figure out why a query hitting millions of rows isn't being run in parallel.

Well, there have been a number of blog posts about this, Craig Freedman's excellent series on the query engine being the first: http://blogs.msdn.com/b/craigfr/archive/2007/04/17/parallel-query-execution-presentation.aspx

The amazing Paul "query optimiser for a brain" White has a further list: http://sqlblog.com/blogs/paul_white/archive/2011/12/23/forcing-a-parallel-query-execution-plan.aspx

There are some details in MSDN/BOL although these pages appear to have been removed for SQL 2012 and 2014 in the redesign of BOL :(

https://technet.microsoft.com/en-us/library/ms178065(v=sql.105).aspx

https://technet.microsoft.com/en-us/library/ms188611(v=sql.105).aspx

Nicely, in SQL Server 2012 a reason was added to the execution plan, but the list of reasons wasn't documented. Well, here it is:

      {
      L"MaxDOPSetToOne"                                                       // DOP configuration option is set to 1
      L"NoParallelPlansDuringUpgrade"                                         // no parallel plans during upgrade
      L"ParallelismDisabledByTraceFlag"                                       // the 8687 traceflag is on to disable parallelism
      L"EstimatedDOPIsOne"                                                    // the estimated DOP is 1
      L"NoParallelDynamicCursor"                                              // no parallel plans for dynamic cursor
      L"NoParallelFastForwardCursor"                                          // no parallel plans for fast forward cursors
      L"NoParallelCursorFetchByBookmark"                                      // no parallel plans for cursors that fetch by bookmark
      L"NoParallelCreateIndexInNonEnterpriseEdition"                          // no parallel index creation in non-enterprise edition
      L"NoParallelPlansInDesktopOrExpressEdition"                             // no parallel plans for desktop or express editions
      L"NonParallelizableIntrinsicFunction"                                   // a non-parallelizable intrinsic function
      L"UpdatingWritebackVariable"                                            // no parallel plans while updating a writeback variable
      L"CLRUserDefinedFunctionRequiresDataAccess"                             // CLR UDF requires data access
      L"TSQLUserDefinedFunctionsNotParallelizable"                            // T-SQL Udfs are not parallelizable
      L"TableVariableTransactionsDoNotSupportParallelNestedTransaction"       // Table variable transactions do not support parallel nested transactions
      L"DMLQueryReturnsOutputToClient"                                        // DML query returns output to client, which forces to yield the transaction and doesn't work with parallel plans
      L"MixedSerialAndParallelOnlineIndexBuildNotSupported"                   // Query Execution does not support mixed serial and parallel plans for a single online index build tree
      L"CouldNotGenerateValidParallelPlan"                                    // Verifying parallel plan failed and we had to fall back to serial
      L"NoParallelForMemoryOptimizedTables"                                   // no parallel plans for Hekaton/In-memory tables
      };

If you are smart you will have spotted NonParallelizableIntrinsicFunction. This hides a multitude of functions, like the encryption ones, that it would be great to enable or at least document. I'm working on it, but I understand it's a fairly long list; Paul's blog has some details.
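
If you want to check your own workload, the reason is exposed as the NonParallelPlanReason attribute on the QueryPlan element of the plan XML, so you can dig it out of the plan cache. Here's a minimal sketch, run from PowerShell with Invoke-Sqlcmd (the instance name is an assumption, adjust to taste):

# Sketch: list cached plans that went serial, with the reason pulled from the plan XML
Invoke-Sqlcmd -ServerInstance "." -Query @"
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (20)
       qp.query_plan.value('(//QueryPlan/@NonParallelPlanReason)[1]', 'nvarchar(200)') AS NonParallelPlanReason,
       qs.total_worker_time,
       st.text AS query_text
FROM   sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
WHERE  qp.query_plan.exist('//QueryPlan[@NonParallelPlanReason]') = 1
ORDER BY qs.total_worker_time DESC;
"@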

Posted by simonsabin | 3 comment(s)

This is a quick note to myself: to download the publish settings file, go to

https://manage.windowsazure.com/PublishSettings/index?Client=&SchemaVersion=&DisplayTenantSelector=true

or

call the powershell cmdlet

Get-AzurePublishSettingsFile
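
For completeness, a minimal sketch of the full round trip with the old Azure PowerShell cmdlets (the save path is an assumption):

# Opens a browser to sign in and saves the .publishsettings file
Get-AzurePublishSettingsFile

# Import the downloaded file so the subscription can be used from PowerShell
Import-AzurePublishSettingsFile "C:\temp\MySubscription.publishsettings"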

 

On many sites the link is to the following, but that seems to be old and no longer works.

Link that doesn’t work https://windows.azure.com/download/publishprofile.aspx

Posted by simonsabin | with no comments

Azure for storage of backups is great: it's cheap and you can easily store them compressed and encrypted. If bandwidth isn't an issue, away you go.

Microsoft introduced Backup to URL in SQL 2014 and back-ported it to SQL 2012 SP1 CU2. This is great, as you can just run your backups and they get uploaded directly to Azure.

If, however, you are using SQL 2012 prior to SP1 CU2, or a version of SQL Server prior to SQL 2012, you aren't lost.

Microsoft have developed a tool (the Microsoft® SQL Server® Backup to Microsoft Azure® Tool) that works for all versions of SQL Server.

It installs a filter driver (like many of the backup compression vendors do) that intercepts requests for backup files and streams them to Azure blob storage.

Whilst it does what it does, it's not exactly intuitive. You install it, configure it, and then think: what next? That's because the UI is just for configuring the folders that the tool will "filter" on and redirect requests to Azure. Once you've done that you then run a normal backup or restore, well, sort of. The tool doesn't give you an explanation, so I've provided one here which should explain all you need.

The crux of how this works is the creation of two files: a metadata file and a contents file. You need the metadata file to be able to do the restore. This file is titchy and points to the main data file:

{
  "Version": 1,
  "Azure": {
    "DataBlobName": "master-40fdcc93-fe60-4a43-b2e1-8330e64dbbf9.bak.enc",
    "ManifestBlobName": "master-40fdcc93-fe60-4a43-b2e1-8330e64dbbf9.bak",
    "AzureStorageAccount": "--------",
    "AzureContainer": "------"
  },
  "Local": null,
  "FileSize": 3194880,
  "CompressedSize": 418288,
  "Trailer": "QZ0jQWlF4ymCLWn5OZJDoPRsWVQ0RPA1h2FywEvjwyX3n. . . Vh4OI2nizLgPC//OhR1IXmXhUFC0yIpuP1bDs=",
  "Compressed": true,
  "DataEncryptionDetails": {
    "Salt": "2ETZxxZZIhwbyyJfYvtJZiRyjQ2E9FXT5wCnanVcCcI=",
    "Iv": "fUGUKTOPRewwmgZYn1tTqQ==",
    "Mac": "qg3ziLsf9nbOc2CbRhN3ZbsWqBMeTl4wkFzD1Gwl/e8="
  },
  "TrailerEncryptionDetails": {
    "Salt": "R5Py4HN0ui+VkXCRiu2+NXVKyUtrBAzsuT/nZ6aRkfU=",
    "Iv": "p6Vf8h3SpRZPfCVRDJI60Q==",
    "Mac": "g8e2tHYU3Ld0Y+Wn69LE7+Ewc6xb2/GfQvGCk/2uvS8="
  }
}

Backup to Azure

  1. Configure a folder that SQL Server has access to, in this tool, to be monitored by SQL Server Backup to Azure
  2. Now go into SQL Server and run a backup as normal to that folder and it will be backed up to Azure (see the sketch after the file list below)

SQL will create two files:

    1. The first file has the name you specified. This contains the metadata of where the file will be backed up to in Azure (but not any credentials)
    2. The second file contains the data and will have the same name with a suffix to match the type of backup:
      • non-compressed and non-encrypted = .data
      • compressed and non-encrypted = .gz
      • compressed and encrypted = .enc
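
The backup itself is then just a normal BACKUP statement pointed at the monitored folder. A minimal sketch, run here from PowerShell via Invoke-Sqlcmd (the database name, folder and instance are assumptions):

# Sketch: a plain backup to the folder the tool is monitoring; the filter driver
# intercepts the file and uploads it to Azure blob storage.
Invoke-Sqlcmd -ServerInstance "." -Query @"
BACKUP DATABASE [AdventureWorks]
TO DISK = N'C:\SQLBackupToAzure\AdventureWorks.bak'
WITH INIT, STATS = 10;
"@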

Restore a backup to another machine

1. Configure a folder in this tool with the correct blob storage account and credentials (and password if using encrypted backups)

2. Download the metadata file, i.e. the file with the original name (no. 1 from the list above), from blob storage to the folder you configured in step 1.

3. Run your restore, specifying the path to the metadata file you downloaded in step 2.

The SQL Server Backup to Microsoft Azure tool will intercept the request and stream the data from Azure to SQL Server.
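
For example, something along these lines, again via Invoke-Sqlcmd (database name, folder and instance are assumptions; the metadata file has already been downloaded into the monitored folder):

# Sketch: restore by pointing at the metadata file in the monitored folder;
# the tool streams the data blob down from Azure as SQL Server reads the "file".
Invoke-Sqlcmd -ServerInstance "." -Query @"
RESTORE DATABASE [AdventureWorks]
FROM DISK = N'C:\SQLBackupToAzure\AdventureWorks.bak'
WITH REPLACE, STATS = 10;
"@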

Posted by simonsabin | 1 comment(s)

I'm very excited to see the public preview of the Azure Key Vault service.

A blog has been started to cover the technical aspects and can be found here http://blogs.technet.com/b/kv/

One of the few things I've been speaking about at conferences recently is compliance and encryption. https://www.sqlbits.com/Sessions/Event12/Achieving_PCI_with_SQL_Server-Dont_be_the_next_Sony

The most complex part of the demos I do in that session is the key management story. Key management in SQL Server is possible, but it's complicated because the keys are managed in the same store as the data, and you have the challenge that a sysadmin has access to everything in SQL Server, including the keys. That means dual control is very difficult.

I'm working on a demo of how to use Azure Key Vault instead of SQL Server's key management, so keep posted.

If you are interested in trying it yourself, you can look at the TechNet documentation "Extensible Key Management Using Azure Key Vault (SQL Server)".

Posted by simonsabin | 1 comment(s)

I was directed today to a nice little site giving the differences between different database systems. It's a nice, succinct table of features:

http://www.sql-workbench.net/dbms_comparison.html

There is, however, an issue with having yes and no answers for many of the features. A lot of them are grey areas; for example, in SQL Server you don't have BEFORE triggers, but you do have INSTEAD OF triggers.

The problem with this is also that it gets out of date quickly. Although the fact that SQL Server has had user-defined types based on the CLR for almost a decade suggests the writer isn't a SQL Server person.

It also believes that you can’t return the results of an OUTPUT clause as a result set.
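You can, as long as you don't redirect the rows with OUTPUT ... INTO; they come straight back to the client as a result set. A quick sketch to show it, run via Invoke-Sqlcmd (the instance and table variable are made up):

# Sketch: OUTPUT without INTO returns the inserted rows to the client as a result set
Invoke-Sqlcmd -ServerInstance "." -Query @"
DECLARE @names TABLE (id int IDENTITY(1,1) PRIMARY KEY, name sysname);
INSERT INTO @names (name)
OUTPUT inserted.id, inserted.name
SELECT name FROM sys.databases;
"@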

So as with much content on the web, take it as guidance but if you are really looking for a feature then go check the relevant forums/official documentation.

Posted by simonsabin | 2 comment(s)

There are always those problems that you hit and you feel like you are hitting your head against a wall. This was one of those situations.

We were running builds in TFS and wanted to update the build version and check in the result.

Issuing two simple commands to check out and check in the file was failing with:

“Unable to determine the workspace. You may be able to correct this by running 'tf workspaces /collection:TeamProjectCollectionUrl'.”

Now, I know that TFS doesn't work well if the workspaces are screwed, so I started looking into the problem.

Firstly, we had just upgraded to TFS 2013 and a new TFS server. This causes all sorts of problems with workspaces and solution files, as the source control settings are cached and are also held in the solution and project files. Agghhhh, where's Git when you want it?

The caching is easily solved by removing the cached files; this can be done by removing

“%LOCALAPPDATA%\Microsoft\Team Foundation\<version>\Cache”

Where <version> is the version of TFS you are using; TFS 2013 has a version of 5.0.

Note: you won’t be able to remove the folder if VS is running (on a dev machine) or the Team Foundation Server Build Service Host is running (on a build machine)
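
In PowerShell that's a one-liner (assuming TFS 2013, hence the 5.0 folder):

# Sketch: clear the TFS 2013 client cache (stop VS / the build service host first)
Remove-Item "$env:LOCALAPPDATA\Microsoft\Team Foundation\5.0\Cache" -Recurse -Force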

I tried this and still no luck.

After some bing juice, I came across some comments in http://blogs.msdn.com/b/granth/archive/2008/07/22/tf-exe-unable-to-determine-the-workspace.aspx that alluded to the PATH pointing to the wrong version of TF.exe

I changed my script to point to the explicit TF.exe and hey presto it worked.
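
The change was something along these lines; the VS 2013 (12.0) path and the file name here are assumptions, so adjust them for your layout:

# Sketch: call tf.exe by its full VS 2013 path instead of relying on PATH
$tf = "${env:ProgramFiles(x86)}\Microsoft Visual Studio 12.0\Common7\IDE\tf.exe"
& $tf checkout "BuildVersion.xml"
# ...update the version number in the file...
& $tf checkin "BuildVersion.xml" /comment:"Update build version" /noprompt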

On examining the PATH environment variable I saw that indeed it was pointing to the previous version of TFS tools.

“C:\Program Files (x86)\Microsoft Visual Studio 11.0\Common7\IDE”

Not sure why the install of TFS 2013 didn't change the path, probably some backwards compatibility stuff. Suffice to say, watch out if you have multiple versions of VS and TFS installed on your build and dev machines.

Are you trying to build a SQL Server database project and getting
 
The imported project "C:\Program Files (x86)\MSBuild\Microsoft\VisualStudio\v10.0\SSDT\Microsoft.Data.Tools.Schema.SqlTasks.targets" was not found. Confirm that the path in the <Import> declaration is correct, and that the file exists on disk.
We had this recently when trying to build one SSDT solution but not when building another.
 
Checking the build agent, the error was correct: that file didn't exist, but later versions did.
 
A nice feature of TFS is that on clicking the error in build it takes you to the exact point in the sqlproj file that the error relates to.

<Import Project="$(MSBuildExtensionsPath)\Microsoft\VisualStudio\v$(VisualStudioVersion)\SSDT\Microsoft.Data.Tools.Schema.SqlTasks.targets" />

You will see that the path is made up of a number of components. Importantly $(VisualStudioVersion) defines the folder it looks for the targets file in.
 
Where is this property set?
 
We did some digging and found that the solution files differed.
 
The one that worked looked like this:

Microsoft Visual Studio Solution File, Format Version 12.00
# Visual Studio 2012

and the one that failed looked like this:

Microsoft Visual Studio Solution File, Format Version 11.00
# Visual Studio 2010

As I'm sure you would have done, we put two and two together and figured that $(VisualStudioVersion) was being set by VS based on the solution file version.
We changed the solution file that wasn’t working to point to Visual Studio 2012 and hey presto it all worked.
 
On putting this blog post together I did a bit of binging and you will find this http://blogs.msdn.com/b/webdev/archive/2012/08/22/visual-studio-project-compatability-and-visualstudioversion.aspx which explains exactly the behaviour.
 
This means that if you have multiple versions of Visual Studio installed then you could end up with odd behaviour with automated builds.
 
There are a few things that are bad IMHO
1. there is no way in the IDE (that I can find) to see that the solution is targeted to Visual Studio 2010 and not Visual Studio 2012
2. no way to change it without editing the solution
3. the fact this property is in the solution and not the project.
 
Options you have as described in the blog post
 
1. Put <VisualStudioVersion>11.0</VisualStudioVersion> in a property group in your msbuild file
2. Put <VisualStudioVersion>11.0</VisualStudioVersion> in a property group in your sqlproj file
3. Pass it as a property when calling MSBuild, i.e. /p:VisualStudioVersion=11.0 (see the sketch below this list)
4. Create a new solution with a header matches the version of VS you want to use
5. Change the solution so the header matches the version of VS you want to use
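
For example, option 3 from a build script looks something like this (the MSBuild path and solution name are assumptions):

# Sketch: force the SSDT targets lookup into the VS 2012 (11.0) folder for this build
& "$env:windir\Microsoft.NET\Framework\v4.0.30319\MSBuild.exe" ".\MyDatabase.sln" /p:VisualStudioVersion=11.0 /p:Configuration=Release
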
Posted by simonsabin | 3 comment(s)

Are you looking to go to SQLBits on the cheap? If so, we have arranged some great-value accommodation.

£39 per night includes a full English breakfast at the Priorslee Rooms, which is 5 minutes from the venue.

For more details on accommodation options available visit the SQLBits accommodation page

Posted by simonsabin | with no comments