in

SQL Server Blogs

Voices from the UK SQL Server Community

Atul Thakor

Introducing the Excel Fuzzy Matching Lookup Add-In

My job involves a fair amount of fuzzy matching, when working on the analysis for a project I’ll often use excel as it does everything I need but its also great as you can send spreadsheets to the business knowing they’ll be comfortable using it and fairly self sufficient.

This fuzzy matching Add-In from Microsoft Research really caught my eye, as it meant I could take sample records e.g.: Name1 and Name2 and run a rough estimation for how accurate the matching should be.

The setup

Firstly you need to have Excel 2010 installed, the add-in can be downloaded from here, unzip and click the usual next buttons.

Here is a simple example to get you started:

  1. Create a new spreadsheet and paste the following in A1
    New Customer Existing Customer
    Jimmy Paige Jimmy Page
    Ladee Gaga Lady Gaga
    Jimmi Hendriks Jimmy Hendricks
  2. Fuzzy lookup needs tables to compare values so…
  3. Highlight all the rows(not entire column) in column A and hit Control  + L this creates a table within excel for that column, ensure you tick the “My table has headers”
  4. Highlight the rows(not entire column) in column B and hit Control + L and tick the headers column, you now have the 2 tables which you are ready to compare.
  5. Your screen should look like this
    image
  6. Now click the fuzzy lookup link in the toolbar and then click the fuzzy logic button, a menu should appear on the right
    image
  7. At the top the 2 tables which are to be compared are chosen, left column and right column are pre-selected, in the match columns the columns to be compared have already been selected, if not click the compare button which will add them to the match columns list.
  8. You can then select the type of comparison, the default is text, other options are Zip code, Phone Number or Social Security, you can define your own by clicking configure (I’ve yet to do this)
  9. Hit ‘Go’ and you should see the following:
    image
  10. As you can see we have 2 matches with similarity ratings,

Issues

I’ve found the add-in to be pretty unstable ranging from running forever to not running at all, to run a new comparison I usually restart excel!

Don't let this put you off!

Comments

No Comments

About AtulThakor

Twitter:@AtulThakor
Powered by Community Server (Commercial Edition), by Telligent Systems