Remove undesired indexed keywords from SQL Server FTS index

Posted by Scott on Stack Overflow
Published on 2010-04-09T19:17:52Z Indexed on 2010/04/09 19:23 UTC

Could anyone tell me if SQL Server 2008 has a way to keep keywords out of the full-text index when they aren't really relevant to the types of searches that will be performed?

For example, we have the IFilters for PDF and Word hooked in, and our documents are being indexed properly as far as I can tell. These documents, however, contain lots of numeric values that people won't really search for and that wouldn't bring back meaningful results anyway. These values are still being indexed and create lots of entries in the full-text catalog. Basically, we are trying to optimize our search engine in any way we can, and we assume all these unnecessary entries can't be helping performance. I want my catalog to consist of alphabetic keywords only. The current IFilters work better than anything I could write in the time I have, but they index more than I need.

This is an example of some of the terms from sys.dm_fts_index_keywords_by_document that I want out:

$1,000, $100, $250, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 129, 13.1, 14, 14.12, 145, 15, 16.2, 16.4, 18, 18.1, 18.2, 18.3, 18.4, 18.5

These are some examples from the same management view that I think are worth keeping and searching on:

above, accordingly, accounts, add, addition, additional, additive
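
To make the split between the two lists concrete, here is a minimal sketch in Python of the "alphabetic keywords only" rule. It assumes the terms have already been exported from the management view as plain strings; the name `split_terms` and the letters-only pattern are hypothetical choices for illustration, not anything SQL Server provides.

```python
import re

# Assumption: "alphabetic keyword" means ASCII letters only, nothing else.
ALPHABETIC = re.compile(r"[A-Za-z]+")

def split_terms(terms):
    """Split display terms into (keep, drop) lists, preserving input order."""
    keep, drop = [], []
    for term in terms:
        # fullmatch requires the whole term to be letters, so "$100" and "13.1" fail.
        if ALPHABETIC.fullmatch(term):
            keep.append(term)
        else:
            drop.append(term)
    return keep, drop

if __name__ == "__main__":
    # Sample terms copied from the two lists above.
    sample = ["$1,000", "$100", "100", "13.1", "14.12",
              "above", "accordingly", "accounts", "add"]
    keep, drop = split_terms(sample)
    print("keep:", keep)
    print("drop:", drop)
```

Running something like this over the exported terms would at least show how much of the catalog falls into the unwanted group before deciding how to exclude it.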

Any help would be greatly appreciated!

© Stack Overflow or respective owner
