Posted to solr-user@lucene.apache.org by Robert Brown <ro...@intelcompute.com> on 2012/01/30 15:02:24 UTC
"sage 200" not matching "... sage 200."
Text ending in "sage 200." (with the trailing full stop) is not matched when
searching for "sage 200" with the field type below.
Do I need the WordDelimiterFilterFactory for this to work as expected?
I don't see any mention of periods in the docs.
<fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="textgen-synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="textgen-synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
Thanks,
Rob
--
IntelCompute
Web Design & Local Online Marketing
http://www.intelcompute.com
Re: "sage 200" not matching "... sage 200."
Posted by Ahmet Arslan <io...@yahoo.com>.
> The trailing full-stop above is not
> being matched when searching for "sage 200" for the below
> field type...
>
> Do I need the WordDelimiterFilterFactory for this to work as
> expected? I don't see any mention of periods being discussed
> in the docs.
>
>
> <fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="textgen-synonyms.txt" ignoreCase="true" expand="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="textgen-synonyms.txt" ignoreCase="true" expand="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
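WhitespaceTokenizer splits only on whitespace, so the trailing period stays attached to the token: the text is indexed as "200." and the query term "200" does not match it. Either use StandardTokenizer or include WordDelimiterFilter.
The Analysis page in the Solr admin UI visualizes the tokens that are created, which is useful when testing or trying to understand tokenizer/filter behavior.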
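For illustration, here is a rough sketch of the first option applied to the field type from the question. It only swaps the tokenizer in both analyzers and leaves everything else as posted. It has not been tested against the original poster's Solr version or synonyms file, so treat it as a starting point. The commented-out filter shows roughly what the second option could look like if the whitespace tokenizer is kept; the attribute values there are an assumption, not a recommendation.

```xml
<fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- StandardTokenizer drops trailing punctuation, so "200." is indexed as "200" -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Alternative (assumed settings): keep solr.WhitespaceTokenizerFactory above
         and add this filter to strip the delimiter from "200."
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"/>
    -->
    <filter class="solr.SynonymFilterFactory" synonyms="textgen-synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <!-- Use the same tokenizer at query time so both sides produce matching tokens -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="textgen-synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Changing the index-time analyzer only affects documents indexed afterwards, so existing documents need to be reindexed before the trailing-period case starts matching. Pasting "... sage 200." into the Analysis page before and after the change should show whether the token comes out as "200." or "200".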