Posted to solr-user@lucene.apache.org by Peyman Faratin <pe...@robustlinks.com> on 2012/06/17 00:09:00 UTC

KeywordTokenizerFactory with SynonymFilterFactory

Hi

I have the following two field types:

<fieldType name="tokenizer1" class="solr.TextField" sortMissingLast="true" autoGeneratePhraseQueries="true">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="false" expand="true"/> 
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
</fieldType>


<fieldType name="tokenizer2" class="solr.TextField" sortMissingLast="true" autoGeneratePhraseQueries="true">
      <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="false" expand="true"/> 
      </analyzer>
</fieldType>	

The problem I am seeing is that, if I have an entry like this in the "synonyms.txt" file

helping hand => assistance

then issuing a "helping hand" query (with dismax) against the field analyzed with tokenizer1 returns the expected query ("assistance"), whereas no synonym mapping is applied for tokenizer2 (confirmed in the Solr admin panel).

Am I doing something wrong?

thank you



Re: KeywordTokenizerFactory with SynonymFilterFactory

Posted by Peyman Faratin <pe...@robustlinks.com>.
Thank you, Michael.

On Jun 16, 2012, at 6:40 PM, Michael Ryan wrote:

> Try changing the tokenizer2 SynonymFilterFactory filter to this:
> 
> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="false" expand="true" tokenizerFactory="solr.KeywordTokenizerFactory"/>
> 
> By default, it seems that it uses WhitespaceTokenizer.
> 
> -Michael


RE: KeywordTokenizerFactory with SynonymFilterFactory

Posted by Michael Ryan <mr...@moreover.com>.
Try changing the tokenizer2 SynonymFilterFactory filter to this:

<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="false" expand="true" tokenizerFactory="solr.KeywordTokenizerFactory"/>

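By default, it seems that SynonymFilterFactory uses WhitespaceTokenizer to parse the entries in synonyms.txt. That would split "helping hand" into two tokens, which can never match the single "helping hand" token that KeywordTokenizerFactory produces.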

-Michael
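
For reference, here is a sketch of what the complete tokenizer2 field type might look like with that change applied. It is untested and assumes a Solr version whose SynonymFilterFactory accepts the tokenizerFactory attribute; everything else is copied unchanged from the original post.

```xml
<fieldType name="tokenizer2" class="solr.TextField" sortMissingLast="true" autoGeneratePhraseQueries="true">
      <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <!-- tokenizerFactory controls how the entries in synonyms.txt are split into tokens.
             Using KeywordTokenizerFactory keeps "helping hand" as one token, matching the
             single token that the field's own tokenizer emits. -->
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="false" expand="true" tokenizerFactory="solr.KeywordTokenizerFactory"/>
      </analyzer>
</fieldType>
```

The index analyzer is left as it was, since the synonym mapping in the original post is applied only at query time.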