in

SQL Server Blogs

Voices from the UK SQL Server Community

Atul Thakor

Multiple Phone Number Matching

Hi All,

So here goes my first post (Thank you Simon with resolving the initial issues I had!).

I will be eternally grateful to anyone who knows how to paste T-SQL directly into the blog without needing to manually remove the double line spacing!

I recently did some work involving matching records based on a telephone numbers to find duplicates!

Here's a brief of the problem with my solution but I would be really curious to know if you guys have any alternative solutions for this problem.

You start with a table which contains the following:

ID

PhoneNo1

PhoneNo2

PhoneNo3

1

00001

00002

00003

2

00004

00005

00006

3

00007

00001

00001

The aim is to compare all records to find matches on phone numbers where any phone number for a record could match any of the phone numbers on another record. So for example ID's 1 and 3 will match because Phone No '00001' is PhoneNo1 on ID 1 and PhoneNo2 on ID 3.

Initially I tried things like cross joins which didnt work out, the final solution used UnPivot to get the data into a single list which I was able to work with to find duplicates and remove them before identifying matches.

You should be able to paste the code and run it without any issues (skip the drop's on the first run),

Cheers

drop table Numbers
drop table Numbers_Unpivot

Create Table Numbers
(
ID int identity(1,1),
PhoneNo1 varchar(10),
PhoneNo2 varchar(10),
PhoneNo3 varchar(10)
)

insert into Numbers

select '00001','00002','00003'
UNION
select
'00004','00005','00006'
UNION
select
'00007','00001','00001'

Create Table Numbers_UnPivot(
ID int,
PhoneNo varchar(10),
PhoneType varchar(10)
)

--Unpivot Records

insert into Numbers_UnPivot
SELECT
ID,
Num,
ColName
FROM
(
SELECT
  ID,
  PhoneNo1,
  PhoneNo2,
  PhoneNo3
FROM
 
Numbers
) AS piv
UNPIVOT
(Num FOR ColName
IN (PhoneNo1,PhoneNo2,PhoneNo3)) AS s
go

--Remove Duplicates
with CTE_Duplicates
As
(
 
select
 
ID,
 
Row_Number() OVER(PARTITION BY ID,PhoneNo order by ID,PhoneNo) as RN
from
  Numbers_UnPivot
)

delete
from
  CTE_Duplicates
where
 
RN = 2
go

--Group up the remainder

with CTE_Multiples
As
(
 
select
 
PhoneNo
 
from
 
Numbers_UnPivot
 
group by
 
PhoneNo
 
having COUNT(*) > 1
)

select *
from
 
Numbers_UnPivot
where
 
PhoneNo in(
 
Select
 
PhoneNo
 
from
 
CTE_Multiples)

 

Comments

 

eeStimpson said:

select

ID = n2.ID

, PhoneNo = coalesce

(

case when n2.PhoneNo1 in (n1.PhoneNo1, n1.PhoneNo2, n1.PhoneNo3) then n2.PhoneNo1 end

, case when n2.PhoneNo2 in (n1.PhoneNo1, n1.PhoneNo2, n1.PhoneNo3) then n2.PhoneNo2 end

, case when n2.PhoneNo3 in (n1.PhoneNo1, n1.PhoneNo2, n1.PhoneNo3) then n2.PhoneNo3 end

)

, PhoneType = coalesce

(

case when n2.PhoneNo1 in (n1.PhoneNo1, n1.PhoneNo2, n1.PhoneNo3) then 'PhoneNo1' end

, case when n2.PhoneNo2 in (n1.PhoneNo1, n1.PhoneNo2, n1.PhoneNo3) then 'PhoneNo2' end

, case when n2.PhoneNo3 in (n1.PhoneNo1, n1.PhoneNo2, n1.PhoneNo3) then 'PhoneNo3' end

)

from

Numbers n1

join Numbers n2 on

n2.ID != n1.ID

and

(

n2.PhoneNo1 in (n1.PhoneNo1, n1.PhoneNo2, n1.PhoneNo3)

or

n2.PhoneNo2 in (n1.PhoneNo1, n1.PhoneNo2, n1.PhoneNo3)

or

n2.PhoneNo3 in (n1.PhoneNo1, n1.PhoneNo2, n1.PhoneNo3)

)

October 27, 2010 4:10 PM
 

eeStimpson said:

or:

select

IDa = n1.ID

, PhoneNoa = coalesce

(

case when n1.PhoneNo1 in (n2.PhoneNo1, n2.PhoneNo2, n2.PhoneNo3) then n1.PhoneNo1 end

, case when n1.PhoneNo2 in (n2.PhoneNo1, n2.PhoneNo2, n2.PhoneNo3) then n1.PhoneNo2 end

, case when n1.PhoneNo3 in (n2.PhoneNo1, n2.PhoneNo2, n2.PhoneNo3) then n1.PhoneNo3 end

)

, PhoneTypea = coalesce

(

case when n1.PhoneNo1 in (n2.PhoneNo1, n2.PhoneNo2, n2.PhoneNo3) then 'PhoneNo1' end

, case when n1.PhoneNo2 in (n2.PhoneNo1, n2.PhoneNo2, n2.PhoneNo3) then 'PhoneNo2' end

, case when n1.PhoneNo3 in (n2.PhoneNo1, n2.PhoneNo2, n2.PhoneNo3) then 'PhoneNo3' end

)

, IDb = n2.ID

, PhoneNob = coalesce

(

case when n2.PhoneNo1 in (n1.PhoneNo1, n1.PhoneNo2, n1.PhoneNo3) then n2.PhoneNo1 end

, case when n2.PhoneNo2 in (n1.PhoneNo1, n1.PhoneNo2, n1.PhoneNo3) then n2.PhoneNo2 end

, case when n2.PhoneNo3 in (n1.PhoneNo1, n1.PhoneNo2, n1.PhoneNo3) then n2.PhoneNo3 end

)

, PhoneTypeb = coalesce

(

case when n2.PhoneNo1 in (n1.PhoneNo1, n1.PhoneNo2, n1.PhoneNo3) then 'PhoneNo1' end

, case when n2.PhoneNo2 in (n1.PhoneNo1, n1.PhoneNo2, n1.PhoneNo3) then 'PhoneNo2' end

, case when n2.PhoneNo3 in (n1.PhoneNo1, n1.PhoneNo2, n1.PhoneNo3) then 'PhoneNo3' end

)

from

Numbers n1

join Numbers n2 on

n2.ID > n1.ID

and

(

n2.PhoneNo1 in (n1.PhoneNo1, n1.PhoneNo2, n1.PhoneNo3)

or

n2.PhoneNo2 in (n1.PhoneNo1, n1.PhoneNo2, n1.PhoneNo3)

or

n2.PhoneNo3 in (n1.PhoneNo1, n1.PhoneNo2, n1.PhoneNo3)

)

October 27, 2010 4:18 PM
 

AtulThakor said:

Thanks for the alternative solution eeStimpson, I like the second version where you can get the results on a single line.

The other thing.... you can run your version regardless of the sql version too!

October 27, 2010 6:17 PM
 

GrumpyOldDBA said:

I use livewriter with a plugin which pastes visual studio code -  it preserves formatting and colours correctly. Out of interest the annoying double spaces put in by office 2007 is resolved in office 2010.

You stand a much better chance of getting a post somewhere near how you want it with livewriter. I've not tried any other products

October 28, 2010 11:18 AM

About AtulThakor

Twitter:@AtulThakor
Powered by Community Server (Commercial Edition), by Telligent Systems