Lucene Search Returning Extra, Undesired Records

Posted by Brandon on Stack Overflow
Published on 2010-04-29T13:31:14Z


I have a Lucene index that contains a field called 'Name'.

I escape all special characters before inserting a value into my index using QueryParser.Escape(value).
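To make the escaping step concrete, here is a minimal Python sketch of what an escape routine of this kind does. It is a toy stand-in, not Lucene code: the function name `escape_query_text` and the exact character set are assumptions for illustration, and the real set depends on the Lucene version in use.

```python
# Toy stand-in for QueryParser.Escape. The character set below is an
# assumption for illustration; check the Lucene version you use.
ASSUMED_SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&')


def escape_query_text(text):
    """Prefix every assumed special character with a backslash."""
    return "".join(
        "\\" + ch if ch in ASSUMED_SPECIAL_CHARS else ch
        for ch in text
    )


print(escape_query_text("Test (Test)"))  # Test \(Test\)
```

Under these assumptions, a value without special characters, such as 'Test', passes through unchanged.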

In my example I have two documents with the following names, respectively:

Test
Test (Test)

They get inserted into my index as such (I can confirm this using Luke):

[test]
[test] [\(test\)]

I index these values as TOKENIZED, using the StandardAnalyzer.

When I perform a search, I first escape special characters in the search string with QueryParser.Escape(searchString). I then parse the escaped string with a QueryParser configured with my 'Name' field and the StandardAnalyzer, and run the resulting query.

When I search for 'Test', I get back both documents in my index (as expected). However, when I search for 'Test (Test)', I still get back both documents.

I realize that both searches match on the 'test' term in the index, but I don't understand why the second search doesn't return only the document named 'Test (Test)', since my search string should produce two terms:

[test] and [\(test\)] 

I expected the two terms to be combined with a boolean operator requiring BOTH to match, so that only one record would come back.
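The gap between the expected and observed results can be illustrated with a toy model of OR versus AND matching. This is plain Python, not Lucene: the `INDEX` dictionary and the `search` function are hypothetical, and the term sets are simply the ones reported above from Luke. For context, Lucene's classic QueryParser combines terms that carry no explicit operator with OR unless its default operator is changed, which would be consistent with both documents coming back.

```python
# Terms per document, as reported from Luke in the post (taken at face value).
INDEX = {
    "Test": {"test"},
    "Test (Test)": {"test", r"\(test\)"},
}


def search(query_terms, require_all):
    """Return names of documents matching any (OR) or all (AND) query terms."""
    combine = all if require_all else any
    return [
        name
        for name, terms in INDEX.items()
        if combine(term in terms for term in query_terms)
    ]


# The two terms the post expects the search string to produce.
query = ["test", r"\(test\)"]

print(search(query, require_all=False))  # ['Test', 'Test (Test)']
print(search(query, require_all=True))   # ['Test (Test)']
```

In this model, OR semantics return both documents because each contains 'test', while AND semantics return only 'Test (Test)'. Whether the real index and query contain exactly these terms depends on what the analyzer emits, which the sketch does not attempt to reproduce.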

Is there something I am missing or a trick to make the search behave as desired?

