December 2007 - Posts

Using ...WHERE NOT IN (SELECT... with NULLs

I was asked to look at a query today and explain why using a NOT IN clause wasn't returning any rows. Switching the NOT IN to an IN resulted in 38 rows, and removing the clause completely returned 431 rows. So what was happening to the other 393? Surely if a row is not in IN, then it should be in NOT IN?

SELECT  C1
FROM    T1
WHERE   C1 NOT IN
       (
SELECT  C2
        FROM    T2)

T1 contained 431 rows, so without the WHERE clause these were all returned, as expected. T2 contained 38 rows where C2 = C1, so these were returned using the IN clause.

However, the query as written above returned no results. 

It turned out that there were NULL values for C2. Since NULL really means "unknown", SQL Server was unwilling to assume that any of the C1 values were NOT IN the sub-query results.

Although it makes sense when you think about it, I don't remember coming across the NULL = unknown issue in a NOT IN list before. Definitely something to consider when using NOT IN.

Louis Davison explains the problem in more detail.