Posted to solr-user@lucene.apache.org by Peyman Faratin <pe...@robustlinks.com> on 2012/06/17 00:09:00 UTC
KeywordTokenizerFactory with SynonymFilterFactory
Hi
I have the following two field types:
<fieldType name="tokenizer1" class="solr.TextField" sortMissingLast="true" autoGeneratePhraseQueries="true">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="false" expand="true"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<fieldType name="tokenizer2" class="solr.TextField" sortMissingLast="true" autoGeneratePhraseQueries="true">
<analyzer type="index">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="false" expand="true"/>
</analyzer>
</fieldType>
The problem I am seeing is that if I have an entry like this in the "synonyms.txt" file:
helping hand => assistance
then issuing the query "helping hand" (with dismax) against the field analyzed with tokenizer1 returns the correct rewritten query ("assistance"), whereas no synonym mapping is applied for tokenizer2 (confirmed in the Solr admin panel).
Am I doing something wrong?
Thank you
Re: KeywordTokenizerFactory with SynonymFilterFactory
Posted by Peyman Faratin <pe...@robustlinks.com>.
Thank you, Michael.
On Jun 16, 2012, at 6:40 PM, Michael Ryan wrote:
> Try changing the tokenizer2 SynonymFilterFactory filter to this:
>
> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="false" expand="true" tokenizerFactory="solr.KeywordTokenizerFactory"/>
>
> By default, it seems that it uses WhitespaceTokenizer.
>
> -Michael
RE: KeywordTokenizerFactory with SynonymFilterFactory
Posted by Michael Ryan <mr...@moreover.com>.
Try changing the tokenizer2 SynonymFilterFactory filter to this:
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="false" expand="true" tokenizerFactory="solr.KeywordTokenizerFactory"/>
By default, it seems that it uses WhitespaceTokenizer.
-Michael
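
For reference, here is a sketch of what the complete tokenizer2 field type might look like with the suggested change applied. It only combines the original definition with the tokenizerFactory attribute from the reply and has not been tested against a particular Solr version. The likely reason it helps, assuming the whole phrase reaches the analyzer as one string (as it does in the admin analysis page): with the default whitespace parsing, the left-hand side of the synonyms.txt entry is stored as two tokens ("helping", "hand"), while KeywordTokenizerFactory emits the single token "helping hand", so the rule never matches.

```xml
<fieldType name="tokenizer2" class="solr.TextField" sortMissingLast="true" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <!-- Keep the whole field value as a single token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- tokenizerFactory tells the filter how to parse entries in synonyms.txt,
         so "helping hand" is read as one token, matching what KeywordTokenizerFactory emits -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="false" expand="true" tokenizerFactory="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
```

Since the LowerCaseFilterFactory runs before the synonym filter and ignoreCase is "false", the entries in synonyms.txt would presumably need to be written in lowercase to match.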