Data type conversions can often present problems in a database, specifically when converting from character to numerical data. I’m not talking about precision or rounding problems, not this post anyway, I’m talking about situations where your queries work fine and then one day without change to your query it just stops working with an error like this one...
Msg 245, Level 16, State 1, Line 1
Conversion failed when converting the varchar value 'T' to data type int.
What is going on? How can a query work fine and without any changes it suddenly stop working and give an error? Well, query plans, essentially your query plan is built based on statistics such as the amount of data in the table, cardinality of columns etc... As your data grows your query plans may change. Let’s see the behaviour I describe.
Firstly, our example setup; this creates a small table and populates it with 52 rows. The idea here is that the last character on the column named acharcolumn is actually suppose to be a number, in a properly designed database you would have probably a) broken it out into its own column or b) have some form of check expression to make sure the character is numeric.
create table #datatype_example (
cat int not null,
acharcolumn varchar(5) not null,
subcat int not null
)
set nocount on
declare @i int
set @i = 1
-- Bad data one...
insert #datatype_example ( cat, acharcolumn, subcat ) values( 0, '10cT', 1 )
while @i <= 50
begin
insert #datatype_example ( cat, acharcolumn, subcat ) values( @i, '20c2', @i % 5 )
set @i = @i + 1
end
go
create index nc1 on #datatype_example( cat, acharcolumn )
go
Now we have our table set up, let’s try our query; executing the query below yields one row.
select acharcolumn
from #datatype_example
where cat <= 1000
and right( acharcolumn, 1 ) = 1
and subcat = 3
go
Now, our table grows and let’s see what happens.
declare @i int
set @i = 51
while @i <= 50000
begin
insert #datatype_example ( cat, acharcolumn, subcat ) values( @i, '20c2', @i % 5 )
set @i = @i + 1
end
go
Run exactly the same query as before, this time we get an error!
select acharcolumn
from #datatype_example
where cat <= 1000
and right( acharcolumn, 1 ) = 1
and subcat = 3
go
Msg 245, Level 16, State 1, Line 1
Conversion failed when converting the varchar value 'T' to data type int.
We can ‘fix’ the problem by creating an index, but how long until we get the error again?
create index nc2 on #datatype_example ( subcat, cat, acharcolumn )
go
select acharcolumn
from #datatype_example
where cat <= 1000
and right( acharcolumn, 1 ) = 1
and subcat = 3
go
drop table #datatype_example
Ok, so what’s going on, what’s the recommendation on this?
It’s all down to access paths and which expressions in the WHERE clause are evaluated first; the WHERE clause is suppose to evaluate all at once which is why the position of expressions on the WHERE has no effect, also using brackets has no effect on the access paths chosen – using brackets does have effect on evaluation order though – but, the data will have been got by then. Note, two things to remember here – access paths is the means by which you get at the data you want (the index chosen for instance) and the evaluation order is like it says on the tin – the order the expressions (logic) on the WHERE clause is evaluated.
The evaluation order is fixed and can never change; however, the access paths can change every time the query plan is compiled.
Going back to our examples, the reason the first query worked is because a) the optimiser chose a table scan and b) rearranged our WHERE clause so that the highest cost items where done last, so it becomes...
where cat <= 1000
and subcat = 3
and right( acharcolumn, 1 ) = 1
Makes sense when you think about it, why evaluate the right() for each row, instead, find all the rows where cat <= 1000 and then the rows in that where subcat = 3 and for the small subset of rows do the right function.
Introducing more data changes the plan, this time the index nc1 is used, remember the index is on cat, acharcolumn so our where clause now looks somewhat different, it actually gets executed in this order...
where cat <= 1000
and right( acharcolumn, 1 ) = 1
and subcat = 3
Hence our problem! The rows where cat <= 1000 are found and then the right() applied, only – this is where one of the rows has a non-numeric character that fails, because converting that character ‘T’ to integer – which is what SQL Server implicitly does because ‘= 1’, the 1 is an integer then bang – failure.
What’s the answer?
Be very careful when doing data conversions from character to numeric, avoid them in queries if possible.
Our query can be rewritten..
select acharcolumn
from #datatype_example
where cat <= 1000
and right( acharcolumn, 1 ) = ‘1’
and subcat = 3
There are a whole raft of data conversion problems in SQL Server, especially around precision and rounding, truncation etc... Be aware of them and test your stuff!
One last thing, remember - these problems can be intermittent, I've got a half written article on parameter sniffing and compiled plans; basically - if your plan is cached the session that caused the compilation may have used parameters that give a query plan that does not give your error, however, that same query (stored proc for instance) could be executed if a different set of parameters creating a different plan which causes the error. Don't get too eat up on that, its for another post.