SQL Server - Multi-Column substring matching

Posted by hamlin11 on Stack Overflow See other posts from Stack Overflow or by hamlin11
Published on 2010-03-12T23:28:36Z Indexed on 2010/03/13 0:07 UTC
Read the original article Hit count: 394

One of my clients is hooked on multi-column substring matching.

I understand that Contains and FreeText search for words (and at least in the case of Contains, word prefixes). However, based upon my understanding of this MSDN book, neither of these nor their variants are capable of searching substrings.

I have used LIKE rather extensively (Select * from A where A.B Like '%substr%')

Sample table A:

ID | Col1     | Col2     | Col3     |
-------------------------------------
1  | oklahoma | colorado | Utah     |
2  | arkansas | colorado | oklahoma |
3  | florida  | michigan | florida  |
-------------------------------------

The following code will give us row 1 and row 2:

 select * from A where Col1 like '%klah%' or Col2 like '%klah%' or Col3 like '%klah%'

This is rather ugly, probably slow, and I just don't like it very much. Probably because the implementations that I'm dealing with have 10+ columns that need searched.

The following may be a slight improvement as code readability goes, but as far as performance, we're still in the same ball park.

 select * from A where (Col1 + ' ' + Col2 + ' ' + Col3) like '%klah%'

I have thought about simply adding insert, update, and delete triggers that simply add the concatenated version of the above columns into a separate table that shadows this table.

Sample Shadow_Table:

ID | searchtext                 |
---------------------------------
1  | oklahoma colorado Utah     |
2  | arkansas colorado oklahoma |
3  | florida michigan florida   |
---------------------------------

This would allow us to perform the following query to search for '%klah%'

select * from Shadow_Table where searchtext like '%klah%'

I really don't like having to remember that this shadow table exists and that I'm supposed to use it when I am performing multi-column substring matching, but it probably yields pretty quick reads at the expense of write and storage space.

My gut feeling tells me there there is an existing solution built into SQL Server 2008. However, I don't seem to be able to find anything other than research papers on the subject.

Any help would be appreciated.

© Stack Overflow or respective owner

Related posts about sql

Related posts about sql-server-2008