SQL Server 2008 – iFTS Performance – DocidMap

The next post in the series on iFTS (Integrated Full Text Search) covers one of the major improvements in SQL Server 2008 for Full Text and is related to the Integrated part or iFTS.

SQL Server 2005 introduced a huge performance improvement by moving the location of the docid map.

So what is a docid, you may ask?

Well we all know that to get best performance with indexes our keys should be as small as possible. When you create a Full Text index you specify a unique index to be used as the key for the row. However this might be anything, a uniqueidentifier, a large varchar, anything, so to ensure optimal performance uses a surrogate key to identify a row in a table. This is the docid (Document Id).

Note: In full text terms a document is a row in a table.

There is obviously a need to translate between the docid and the key of the table. Prior to SQL 2005 that was done in the full text engine, this meant two things. Large keys had to be passed through OLEDB to SQL Server and also unnecessary mapping may have undertaken in the Full Text service.

In SQL 2005 this mapping was moved into the database which meant the transfer over OLEDB was quicker and the engine could do more optimisations in terms of doing the mapping.

So what’s new in SQL 2008?

Firstly the index is stored in the engine and so there is no OLEDB stuff going on, but more importantly if you have an integer key for your Full Text index, Full Text will use that key as the docid and so you won’t have to map between two values.

This has huge benefit when you have full text and normal relational filters because the optimiser can push the relational filter into the querying of the full text index and thus read less data and so perform better.

The following query plans highlight the docid mapping. The first is using a uniqueidentifier as a key,

And this shows the query plan with an integer key,

The subtree cost of the first is 0.01 (which isn't high but I only have a 174 rows in this table. The subtree cost of the second is 0.000459, thats a 20 fold decrease. Just for using an integer key. All is not lost, if you have a non-integer key you still benefit from the performance changes and the ability to use all the relational join operators in the engine, its just you will have a few extra steps to perform and so performance will not be as good as with an integer key.

What about upgrade?

If you are upgrading from SQL 2005 you have a number of options. The first is to maintain the index as is (but held within SQL Server). This has the benefit of reduced downtime, but you still have a docidmap.

The second option is to rebuild the index. This will rebuild the index using the new structure AND remove the docidmap. So will give you the best performance but requires a full rebuild of the index (which should be quicker in 2008 as well).

The third option takes the index offline and can’t be used. This allows you to delay the rebuild until a time that is convenient for you and your server. (If you upgrade 10 databases it is probably not wie to have them all rebuilding their full text indexes at the same time).

What about new databases ?

One of the other aspects is that because the full text is in the database you no longer have to enable the database for full text. You simply need to create your full text catalog and away you go.

Be conscious of the docid issue when creating new tables. You will get best performance if the unique index you use for the full text index has an integer key.

The following are the other posts in the series

If you want to try iFTS you can download the SQL Server 2008 from here http://www.microsoft.com/sql/2008/prodinfo/download.mspx



-
Published Wednesday, February 20, 2008 1:06 PM by simonsabin

Comments

Sunday, February 24, 2008 1:30 PM by Barry

# re: SQL Server 2008 – iFTS Performance – DocidMap

If you have an upgraded database in CTP6 the rebuild of the index does not remove the docidmap. You have to DROP and CREATE the index again.

Thursday, March 6, 2008 3:38 PM by simonsabin

# re: SQL Server 2008 – iFTS Performance – DocidMap

Its taken me a while to get round and test that but in my CTP6 if I attach or restore a SQL2005 database the docid map is removed IF i have set the server property to rebuild or reset.

You set the server property either in properties|Advanced-> Full Text Upgrade option or using the sp

EXEC master.dbo.sp_fulltext_service @action=N'upgrade_option', @value=0

Monday, March 10, 2008 8:17 AM by Jafferyjee

# re: SQL Server 2008 – iFTS Performance – DocidMap

Recently we've been playing around with iFTS and following are observations regarding indexes:

1- Rebuilding catalog is slow as compare to SQLServer 2005.

2- In Management studio if we right click  and select rebuild option from the popup menu, it shows a message “Rebuild successful” whereas the process continues running at back-end.

3- After rebuilding indexes on the same table in SQLServer 2005 & 2008 we observed difference in record count when using the same containstable query. SQL Server 2008 is returning less rows as compare to 2005 on same query.

4- Catalog sizes are different in SQL Server 2005 & 2008. For example in SQL Server 2005, catalog size is 3.90 MB and in  SQL Server 2008 catalog size is 6.56 MB for same database.

Tuesday, October 7, 2008 8:51 AM by SQL Server, BizTalk Server, le 64 bits et au-delà !...

# SQL Server 2008 - Integrated Full Text Search (iFTS) ...

Voici une suite de 6 articles très intéressants sur les nouveautés de la recherche en

Sunday, January 18, 2009 5:09 PM by Indexing too slow? | keyongtech

# Indexing too slow? | keyongtech

Pingback from  Indexing too slow? | keyongtech

Wednesday, February 10, 2010 8:58 AM by Simon Sabin UK SQL Consultant's Blog

# Microsoft ditches Unix in the search market

Wednesday, February 10, 2010 9:43 AM by SimonS Blog on SQL Server Stuff

# Microsoft ditches Unix in the search market

Search is one of my interests, where ever you go, whatever business you are in search is the like a holy