in

SQL Server Blogs

Voices from the UK SQL Server Community

Atul Thakor

October 2010 - Posts

  • Multiple Phone Number Matching

    Hi All,

    So here goes my first post (Thank you Simon with resolving the initial issues I had!).

    I will be eternally grateful to anyone who knows how to paste T-SQL directly into the blog without needing to manually remove the double line spacing!

    I recently did some work involving matching records based on a telephone numbers to find duplicates!

    Here's a brief of the problem with my solution but I would be really curious to know if you guys have any alternative solutions for this problem.

    You start with a table which contains the following:

    ID

    PhoneNo1

    PhoneNo2

    PhoneNo3

    1

    00001

    00002

    00003

    2

    00004

    00005

    00006

    3

    00007

    00001

    00001

    The aim is to compare all records to find matches on phone numbers where any phone number for a record could match any of the phone numbers on another record. So for example ID's 1 and 3 will match because Phone No '00001' is PhoneNo1 on ID 1 and PhoneNo2 on ID 3.

    Initially I tried things like cross joins which didnt work out, the final solution used UnPivot to get the data into a single list which I was able to work with to find duplicates and remove them before identifying matches.

    You should be able to paste the code and run it without any issues (skip the drop's on the first run),

    Cheers

    drop table Numbers
    drop table Numbers_Unpivot

    Create Table Numbers
    (
    ID int identity(1,1),
    PhoneNo1 varchar(10),
    PhoneNo2 varchar(10),
    PhoneNo3 varchar(10)
    )

    insert into Numbers

    select '00001','00002','00003'
    UNION
    select
    '00004','00005','00006'
    UNION
    select
    '00007','00001','00001'

    Create Table Numbers_UnPivot(
    ID int,
    PhoneNo varchar(10),
    PhoneType varchar(10)
    )

    --Unpivot Records

    insert into Numbers_UnPivot
    SELECT
    ID,
    Num,
    ColName
    FROM
    (
    SELECT
      ID,
      PhoneNo1,
      PhoneNo2,
      PhoneNo3
    FROM
     
    Numbers
    ) AS piv
    UNPIVOT
    (Num FOR ColName
    IN (PhoneNo1,PhoneNo2,PhoneNo3)) AS s
    go

    --Remove Duplicates
    with CTE_Duplicates
    As
    (
     
    select
     
    ID,
     
    Row_Number() OVER(PARTITION BY ID,PhoneNo order by ID,PhoneNo) as RN
    from
      Numbers_UnPivot
    )

    delete
    from
      CTE_Duplicates
    where
     
    RN = 2
    go

    --Group up the remainder

    with CTE_Multiples
    As
    (
     
    select
     
    PhoneNo
     
    from
     
    Numbers_UnPivot
     
    group by
     
    PhoneNo
     
    having COUNT(*) > 1
    )

    select *
    from
     
    Numbers_UnPivot
    where
     
    PhoneNo in(
     
    Select
     
    PhoneNo
     
    from
     
    CTE_Multiples)

     

Powered by Community Server (Commercial Edition), by Telligent Systems