Data is a huge part of what we do, and I need someone with a passion for data to lead our SQL team.

If you’ve got experience with SQL, want to lead a team working in an agile environment with aggressive CI processes, have a passion for data and want to use technology to solve problems, then you are just the person I am looking for.

The role is based in London, working for one of the top tech companies in Europe.

Contact me through my blog or LinkedIn (http://uk.linkedin.com/in/simonsabin) if you are interested.

Posted by simonsabin | with no comments

Then contact me: we need your services for a project.

Use the contact form on this blog or contact me through LinkedIn http://www.linkedin.com/profile/view?id=2895653

Posted by simonsabin | with no comments

When you go shopping for a new gadget do you just go to your nearest electronics store (Currys, Best Buy, etc.) and read the marketing ticket next to the product?

No you probably don’t. You more than likely ask your friends if they’ve got one and what they think. Or you go and look for reviews that compare the different options.

The same applies to recruitment.

Employers don’t just rely on your CV (your marketing ticket); who knows what rubbish you’ve put in it. Employers want to validate that information.

I find LinkedIn key to that validation process.

It takes two minutes to complete your profile and keep it up to date. Whilst it is similar to your CV, it has a few benefits.

1. It’s a very small world. I can see if you are connected to anyone I know, or have worked with someone I know.

2. It validates what you have in your CV. Because it’s public, you aren’t likely to exaggerate as much on LinkedIn as you might in your CV.

3. You can get endorsements and recommendations from previous colleagues.

While you wouldn’t include snippets of recommendations in your CV, you can have them in your LinkedIn profile. What’s more, given the blandness of most references, these provide some great information on which to base a decision.

 

So make sure that you complete your profile.

Posted by simonsabin | 4 comment(s)

When I started looking at SQL conferences I heard about one held in Denmark, in the middle of nowhere, with everyone staying in huts, and I was always intrigued as to how it worked, confining all the delegates in one place with no opportunity for escape. Later on, when I became an MVP, I started hearing stories about a conference in Denmark: the same one. It was the extreme of the work hard, play hard conference ethic, the play hard being the legend. The huts are holiday homes and the conference venue is on the site of Legoland. On one of the nights they have a BBQ, and on the last night a dinner followed by the pool party. This isn’t any old pool party: here they take over a water park, with a bar. You can imagine the scene: a conference of Oracle and SQL guys (and a few women), drinking and then going down water slides in groups of four on huge rubber rings.

Almost two years ago I was invited to speak at said event and unsurprisingly decided it was a must. The event was Miracle Open World 2011.

Now, whilst the event and the beach party were something to behold (you try holding a beer whilst shooting down a vertical slide), my overriding memory is of a talk I attended.

“You’ve got to go and see this guy. Last year he dissected the internals of a PDF document.”

This was a comment by one of the attendees about a session on dissecting mdf files.

Being a fairly hardcore SQL guy I thought this was going to be a poor show; mdf files are very complex. But I was intrigued, as I was looking at how to explain to developers how SQL Server worked, and that’s what the session promised.

Well, I wasn’t disappointed. The session was by Mark S. Rasmussen, on his OrcaMDF project. Essentially he has written code that allows you to point it at an MDF file and it will reconstruct the rows and columns from the data; in effect he’s written the storage engine parser that parses pages into the relevant information.

Even at that stage Mark had made huge progress: he could parse the different types of pages, follow the linked lists of pages and the allocation maps, and decode page values into column values. Not simple stuff by any stretch.
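If you want to peek at the same page internals that OrcaMDF decodes, the engine itself can dump them with the undocumented but well-known DBCC PAGE command; a minimal sketch, where the database name, file id and page id are just illustrative values:

-- Send DBCC output to the client instead of the error log
DBCC TRACEON (3604);

-- Dump file 1, page 16 of tempdb; print option 3 shows the page header
-- plus a per-row interpretation of the column values - the information
-- OrcaMDF reconstructs by parsing the MDF file outside the engine
DBCC PAGE ('tempdb', 1, 16, 3);

The difference, of course, is that OrcaMDF does this without the engine, against a raw file.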

Mark is a seriously clever and nice guy. If you ever see him at a conference, say hi and have a chat. He’s also got a fascinating past, with more examples of how stupidly clever he is.

Since that talk Mark has been speaking around the world (no idea how he finds the time, or the money, to do it) and I’m really pleased today as I found out that Mark has been awarded an MVP for SQL Server. Fantastic and well deserved.

 

You can follow Mark on twitter http://twitter.com/improvedk or read his blog http://improve.dk

He has made the OrcaMDF project available on Github https://github.com/improvedk/OrcaMDF

This is where it started http://improve.dk/archive/2011/04/17/miracle-open-world-2011-follow-up.aspx

Posted by simonsabin | 1 comment(s)

How many times have you copied data into a table and got one of the following less than helpful errors.

 

String or binary data would be truncated.

 

Arithmetic overflow error converting numeric to data type numeric.

 

Arithmetic overflow error for data type tinyint, value = 256.

 

Violation of UNIQUE KEY constraint '??????'. Cannot insert duplicate key in object '??????'. The duplicate key value is (3).

 

Cannot insert the value NULL into column 'parentId', table 'tempdb.dbo.child'; column does not allow nulls. INSERT fails.

 

The INSERT statement conflicted with the FOREIGN KEY constraint "?????". The conflict occurred in database "????", table "?????", column '?????'.

Some of these messages have since been improved to include the offending value. However, that doesn’t help much when you’ve got a multi-row insert: the error is only raised for the first offending value. For instance, in the unique violation above you can resolve the problem by excluding the rows with a value of 3, but you might have others. In the example below we do have others, and so you have to put some generic code in to verify the data before you insert it.
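You can see this one-error-at-a-time behaviour in isolation with a throwaway sketch (the temp table is just illustrative):

create table #t (id int unique)
insert into #t values (3), (3), (4), (4)
-- fails with: Violation of UNIQUE KEY constraint ...
-- The duplicate key value is (3).
-- Nothing mentions the 4s, and no rows are inserted; fix the 3s,
-- re-run, and the statement fails again, this time on the 4s.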

What this means is that if you have a source of data to insert into a table you have to go through the following process.

1. Perform insert

2. Get failure message

3. Identify offending rows, correct or exclude them

4. Proceed to step 1 until you get no errors.

5. Once insert works, implement a process to identify bad rows prior to the insert to be corrected or post the insert to be verified and reprocessed.

 

I’ve worked through a very simple example below involving two tables, “parent” and “child”. Each step resolves an error with the insert due to a data type domain constraint or an explicit constraint. You can see that it’s a painful process.

The end result is a hugely complex query, with many sub-queries and use of try_convert. Note that prior to SQL 2012 this code would have been amazingly complex to write, as there was no try_convert to verify the data type values. The uniqueness check is the most complex, as it has to check the combination of values from both the source and target tables.
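The key SQL 2012 building block here is try_convert, which returns NULL rather than raising an error when a value doesn’t fit the target type, so out-of-range values can be filtered out in a where clause. For example:

select try_convert(tinyint, 256)  -- NULL: 256 is out of range for tinyint
select try_convert(tinyint, 255)  -- 255: fits, so the value is returned
select convert(tinyint, 256)      -- raises the arithmetic overflow error above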

The plan goes from a simple insert

[Execution plan: the simple insert]

to a much more complex plan involving two additional scans of the source and one of the parent. Whilst those could be indexed to improve the plan, it highlights additional work and indexes that aren’t required for the simple insert.

[Execution plan: the complex validating insert]

What’s immensely frustrating is that all these verification predicates are being done by the engine anyway, that’s how it generates the errors.

KISS


This is far from keeping it simple. On a recent project we’ve had to implement months of work to put this level of validation in. What’s worse is that you have to put all this code in up front; if you don’t, your application will just fail and fall apart when you get a problem. Worse still, resolving the errors is hugely complex and time consuming.

I’d just accepted this as something you just have to live with. That was until now.

I was recently notified of a discussion on the TSQL forum and associated connect items.

Forum thread - http://social.technet.microsoft.com/Forums/en-US/transactsql/thread/3e17f8dc-9685-412b-8e76-94ad41536d5d 

Connect item https://connect.microsoft.com/SQLServer/feedback/details/774754/new-virtual-table-errors-it-would-analogous-to-the-deleted-and-inserted-tables#details

The idea here is that the errors get dumped into a table, just as the output clause does for successful rows. You can then examine the contents of the errors table and do something with it. This is very similar to the process seen in most ETL frameworks, where rows that fail a lookup or a data type validation are put in an errors table for later processing.
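For comparison, the output clause that the proposal is modelled on already captures successful rows today; a sketch of the existing syntax (the table variable name is just illustrative) that an errors virtual table would mirror for failed rows:

declare @loaded table (childid int, parentId int);

insert into child (parentId, strcol, unqcol, intCol, numCol, tinyintCol)
output inserted.childid, inserted.parentId into @loaded -- rows that made it in
select parentId, strcol, unqcol, intCol, numCol, tinyintCol
from SourceTable;

-- the proposal: an analogous errors table capturing the rows that failed
-- a constraint, instead of the first failure aborting the whole statement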

What a fantastic idea.

 

Implementation

I would like the statement to not fail by default (maybe an option clause to configure this behaviour).

The biggest question is what error information is included. One column could have multiple validation errors: not null, check, foreign key etc. Ideally I would like to know all validation errors for each column, but I can’t see that as workable unless the result is a pivoted table, i.e. one row per column and validation rule; that would require a unique set of columns on the source to be defined somehow and would make it complex.

You would also find that a value that fails one check is likely to fail lots more, so knowing that a column value has at least one error is probably good enough. That would need an error code column in the errors table per column.

The simplest would be a single error per row; that’s what you get with the SSIS implementation, and it would probably be the simplest to implement: as soon as a row fails it is marked as bad and, rather than being passed to the other parts of the plan, is directed to the errors output.

I would settle for one error per row. At least you know the row has failed, have caught the row, and can reprocess the data easily.

Side note: what I found amusing while building the script below was that I was excluding errors but couldn’t find which source rows were causing the problem, due to the vagueness of the failures. Exactly the problem that would be solved if we had an errors table.

 

VOTE


I think this idea is fantastic and urge you to vote on it if you agree.

https://connect.microsoft.com/SQLServer/feedback/details/774754/new-virtual-table-errors-it-would-analogous-to-the-deleted-and-inserted-tables#details


Sample script showing the ugly process of validating source data.

use tempdb
go
drop table child
drop table parent
drop table SourceTable
go
create table parent (parentId int not null primary key)
create table child (childid int identity(1,1), parentId int not null, strCol varchar(10), unqCol int unique, intCol int, numCol numeric(2,1), tinyintCol tinyint)
alter table child add constraint fk_child foreign key (parentId) references parent(parentId)
go

insert into parent (parentId)
values (1)
go

declare @bigint as bigint = power(cast(2 as bigint), 32)
select *
  into SourceTable
  from (
values
 (1,replicate('A',11),1,100,10.22,256)
,(1,replicate('A',10),1,100,10.22,256)
,(1,replicate('A',10),2,100,10.22,256)
,(1,replicate('A',10),3,100,1.2,256)
,(1,replicate('A',10),3,100,1.2,256)
,(1,replicate('A',10),3,@bigint,1.2,255)
,(1,replicate('A',10),3,@bigint,1.2,255)
,(1,replicate('A',10),3,@bigint,1.2,255)
,(1,replicate('A',10),4,100,1.2,255)
,(1,replicate('A',10),4,100,1.2,255)
,(null,replicate('A',10),9,100,1.2,255)
,(2,replicate('A',10),10,@bigint,1.2,255)
,(2,replicate('A',10),5,100,1.2,255)
,(2,replicate('A',10),6,100,1.2,255)
,(1,replicate('A',10),11,@bigint,1.2,255)
,(1,replicate('A',10),7,100,1.2,255)
,(1,replicate('A',10),8,100,1.2,255)
) v(parentId, strcol, unqcol, intCol, numCol, tinyintCol)
go

insert into child(parentId, strcol, unqcol, intCol, numCol, tinyintCol)
select parentId, strcol, unqcol, intCol, numCol, tinyintCol
from SourceTable
go

print ' '
print 'Exclude varchar values that will be truncated'
insert into child(parentId, strcol, unqcol, intCol, numCol, tinyintCol)
select parentId, strcol, unqcol, intCol, numCol, tinyintCol
from SourceTable
where len(strcol) <= 10

print ' '
print 'Exclude numeric values that are too big'
insert into child(parentId, strcol, unqcol, intCol, numCol, tinyintCol)
select parentId, strcol, unqcol, intCol, numCol, tinyintCol
from SourceTable
where len(strcol) <= 10
and (numCol is null or try_convert(numeric(2,1), numCol) is not null)

print ' '
print 'Exclude values too big for tinyint'
insert into child(parentId, strcol, unqcol, intCol, numCol, tinyintCol)
select parentId, strcol, unqcol, intCol, numCol, tinyintCol
from SourceTable
where len(strcol) <= 10
and (numCol is null or try_convert(numeric(2,1), numCol) is not null)
and (tinyintCol is null or try_convert(tinyint, tinyintCol) is not null)

print ' '
print 'Exclude values too big for int'
insert into child(parentId, strcol, unqcol, intCol, numCol, tinyintCol)
select parentId, strcol, unqcol, intCol, numCol, tinyintCol
from SourceTable
where len(strcol) <= 10
and (numCol is null or try_convert(numeric(2,1), numCol) is not null)
and (tinyintCol is null or try_convert(tinyint, tinyintCol) is not null)
and (intCol is null or try_convert(int, intCol) is not null)

print ' '
print 'Exclude specific value that breaks unique constraint'
insert into child(parentId, strcol, unqcol, intCol, numCol, tinyintCol)
select parentId, strcol, unqcol, intCol, numCol, tinyintCol
from SourceTable
where len(strcol) <= 10
and (numCol is null or try_convert(numeric(2,1), numCol) is not null)
and (tinyintCol is null or try_convert(tinyint, tinyintCol) is not null)
and (intCol is null or try_convert(int, intCol) is not null)
and unqcol <> 3

print ' '
print 'Exclude values that break unique constraint'
insert into child(parentId, strcol, unqcol, intCol, numCol, tinyintCol)
select parentId, strcol, unqcol, intCol, numCol, tinyintCol
from SourceTable
where len(strcol) <= 10
and (numCol is null or try_convert(numeric(2,1), numCol) is not null)
and (tinyintCol is null or try_convert(tinyint, tinyintCol) is not null)
and (intCol is null or try_convert(int, intCol) is not null)
and not exists (select 1 from child c where c.unqCol = SourceTable.unqcol)
and (select COUNT(1) from SourceTable s where s.unqCol = SourceTable.unqcol) <= 1
go

print ' '
print 'Exclude values that break foreign key constraint'
insert into child(parentId, strcol, unqcol, intCol, numCol, tinyintCol)
select parentId, strcol, unqcol, intCol, numCol, tinyintCol
from SourceTable
where len(strcol) <= 10
and (numCol is null or try_convert(numeric(2,1), numCol) is not null)
and (tinyintCol is null or try_convert(tinyint, tinyintCol) is not null)
and (intCol is null or try_convert(int, intCol) is not null)
and not exists (select 1 from child c where c.unqCol = SourceTable.unqcol)
and (select COUNT(1) from SourceTable s where s.unqCol = SourceTable.unqcol) <= 1
and (parentId is null or
  exists (select 1 from parent p where p.parentId = SourceTable.parentId))
go

print ' '
print 'Exclude values that break the not null constraint'
insert into child(parentId, strcol, unqcol, intCol, numCol, tinyintCol)
select parentId, strcol, unqcol, intCol, numCol, tinyintCol
from SourceTable
where len(strcol) <= 10
and (numCol is null or try_convert(numeric(2,1), numCol) is not null)
and (tinyintCol is null or try_convert(tinyint, tinyintCol) is not null)
and (intCol is null or try_convert(int, intCol) is not null)
and not exists (select 1 from child c where c.unqCol = SourceTable.unqcol)
and (select COUNT(1) from SourceTable s where s.unqCol = SourceTable.unqcol) <= 1
and (parentId is not null and
  exists (select 1 from parent p where p.parentId = SourceTable.parentId))

Posted by simonsabin | 3 comment(s)

Bob Duffy is doing an afternoon seminar on Azure.

It will provide an introduction to the Azure platform, and in particular SQL Azure; show tools and methodologies to migrate on-premises databases into the cloud, using a sample application and database; and finally detail some of the Azure-specific features, such as federations, that enable massive-scale OLTP solutions.

http://www.prodata.ie/Events/2012/SQL-Azure_and_the_Cloud.aspx

Posted by simonsabin | 1 comment(s)

It’s taken 30 days and I have managed, to my surprise, to grow a ginger Mo in support of Movember.

http://uk.movember.com/mospace/6154809

I didn’t quite reach my supposed lookalike, but I don’t think it was a bad effort. Next time I’ll grow my hair for a few months beforehand.


If you fancy donating then you can do so here https://www.movember.com/uk/donate/payment/member_id/6154809/ I only need £3 to reach £400 which would be great. The team have just passed £2000 which is awesome.

Posted by simonsabin | 1 comment(s)

If you are in Dublin then we have a meeting on tonight at the Microsoft office.

The subject of the talk is

“Almost all queries have to do two things, get data and join it together. In this session we will look at the aspects of these that most people think they know but in reality don’t. “

If you think you know SQL then you should come along and we'll see if you are right

http://www.mtug.ie/UserGroups/SQLServer/tabid/82/ctl/Details/Mid/413/ItemID/110/Default.aspx?ContainerSrc=%5BG%5DContainers%2F_default%2FNo+Container

Hope to see you there.

PS the Mo is growing nicely, if you fancy supporting Movember then you can do here https://www.movember.com/uk/donate/payment/member_id/6154809/


Posted by simonsabin | with no comments

I'll be over in Dublin doing a usergroup meeting on Thursday evening at the Microsoft office.

The subject of the talk is

“Almost all queries have to do two things, get data and join it together. In this session we will look at the aspects of these that most people think they know but in reality don’t. “

If you think you know SQL then you should come along and we'll see if you are right

http://www.mtug.ie/UserGroups/SQLServer/tabid/82/ctl/Details/Mid/413/ItemID/110/Default.aspx?ContainerSrc=%5BG%5DContainers%2F_default%2FNo+Container

Hope to see you there.

Posted by simonsabin | 1 comment(s)

I’m growing my first ever moustache for Movember http://uk.movember.com/

You can see my mo space and pictures of the thing growing on my lip here http://uk.movember.com/mospace/6154809

If you would like to support me then please do make a donation and make this worth it.

https://www.movember.com/uk/donate/payment/member_id/6154809/

We’re not even halfway through the month and I’m not sure how I’ll get to the end. Please contribute to keep me motivated not to slip with the razor.

 

If anyone knows where I can get a brown suede suit, I’m looking to do impressions of Donald Sutherland in Animal House.

Check out my page http://uk.movember.com/mospace/6154809 to see how I’m progressing.

Posted by simonsabin | 1 comment(s)