Posted to solr-user@lucene.apache.org by Tim Terlegård <ti...@gmail.com> on 2012/03/26 13:10:26 UTC

Querying field with parenthesis

I have created my own field type. I have indexed "Stephen King" and
get no hit when searching
author:(stephen king)

I get a hit when searching like this
author:(stephen* AND *king)

I also get a hit when searching like this
author:"stephen king"

So it seems like when querying with (...) it actually splits the
words. This is the type of the author field

    <fieldType name="string_lowercase" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

I expected that author:(stephen king) would do the same thing as
author:"stephen king". Why is this not the case?

Thanks,
Tim

Re: Querying field with parenthesis

Posted by Erick Erickson <er...@gmail.com>.
Your problem is the combination of KeywordTokenizerFactory and the
query parser. This often trips people up. When you use
author:(stephen king), the query parser splits the input on
whitespace before it reaches the analysis chain, so the chain sees
two separate terms, "stephen" and "king". But because you're using
KeywordTokenizer, the indexed field holds only a single token,
"stephen king", so neither term matches. When you put
"stephen king" (with quotes) in, the query parser does not break
the input up, and the analysis chain gets the whole phrase and
produces a single token rather than two.

Your wildcards are matching because stephen* and *king
both match the _single_ token "stephen king".

Two things that help a lot with this kind of problem are the
admin/analysis page and appending &debugQuery=on to your URL,
then looking at the parsed query in the results.

Using something like WhitespaceTokenizerFactory might
give you results closer to what you expect.
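
As a rough sketch of that suggestion, the field type could look
something like the following. The name "text_lowercase_ws" is made up
for illustration, and this is only one possible configuration, not a
tested one. With it, "Stephen King" would be indexed as the two tokens
"stephen" and "king", so each term of author:(stephen king) has
something to match.

    <fieldType name="text_lowercase_ws" class="solr.TextField">
      <analyzer>
        <!-- Split on whitespace so each word becomes its own token -->
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <!-- Lowercase each token so matching is case-insensitive -->
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

Note the trade-off: the field no longer matches only on the exact
whole value. Whether author:(stephen king) then requires both terms
or just one depends on the default operator in effect (for example
q.op=AND), so it still will not behave exactly like the quoted phrase.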

Best
Erick
