Posted to solr-user@lucene.apache.org by vit <bu...@yahoo.com> on 2014/11/24 19:46:18 UTC
matching shingles issue
I am running Solr 4.2.1.
I am using the following analyzer:
<fieldType name="text_shingle" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="5"
            outputUnigrams="true" outputUnigramsIfNoShingles="false"
            tokenSeparator=" "/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="5"
            outputUnigrams="false" outputUnigramsIfNoShingles="true"
            tokenSeparator=" "/>
  </analyzer>
</fieldType>
For the query:
description_shingle:Highest quality
I am getting this result:
<arr name="description_shingle">
<str>Highest standards of quality installations!</str>
</arr>
So the result does not contain the shingle "Highest quality".
Instead it contains
"Highest standards of quality"
Why am I getting this match?
--
View this message in context: http://lucene.472066.n3.nabble.com/matching-shingles-issue-tp4170685.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: matching shingles issue
Posted by Michael Sokolov <ms...@safaribooksonline.com>.
Maybe try:
description_shingle:(Highest quality)
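A possible explanation for the original match, sketched in Python rather than in Solr itself. Without parentheses or quotes, the field prefix applies only to the first word, so only "Highest" is searched in description_shingle and "quality" goes to the default field. In Solr 4.x the query parser also splits on whitespace before the analyzer runs, so the query analyzer sees "Highest" alone, finds no shingle, and falls back to the unigram because outputUnigramsIfNoShingles="true". The index analyzer emits unigrams, so that single term matches. The functions shingles, index_tokens and query_tokens below are hypothetical helpers that only approximate WhitespaceTokenizerFactory plus ShingleFilterFactory; they ignore token positions and are not Solr code.

```python
# Toy model of the text_shingle field type. NOT Solr/Lucene code: it only
# approximates WhitespaceTokenizer + ShingleFilter and ignores positions.

def shingles(text, min_size=2, max_size=5, output_unigrams=True,
             unigrams_if_no_shingles=False, sep=" "):
    """Return unigrams and/or word n-grams for whitespace-separated text."""
    words = text.split()
    out = []
    for i in range(len(words)):
        if output_unigrams:
            out.append(words[i])
        for n in range(min_size, max_size + 1):
            if i + n <= len(words):
                out.append(sep.join(words[i:i + n]))
    # Fallback: a lone word produces no shingles, so emit the word itself.
    if not out and unigrams_if_no_shingles:
        out = list(words)
    return out


def index_tokens(text):
    # Mirrors the index analyzer: outputUnigrams="true".
    return set(shingles(text, output_unigrams=True,
                        unigrams_if_no_shingles=False))


def query_tokens(text):
    # Mirrors the query analyzer: outputUnigrams="false",
    # outputUnigramsIfNoShingles="true".
    return shingles(text, output_unigrams=False,
                    unigrams_if_no_shingles=True)


DOC = "Highest standards of quality installations!"
indexed = index_tokens(DOC)

# description_shingle:Highest quality
# Assumption: only "Highest" reaches this field, analyzed on its own.
unquoted = query_tokens("Highest")

# description_shingle:"Highest quality"
# Assumption: a quoted phrase reaches the analyzer as one piece of text.
quoted = query_tokens("Highest quality")

print(unquoted, any(t in indexed for t in unquoted))
print(quoted, any(t in indexed for t in quoted))
```

Under these assumptions the unquoted form matches on the unigram "Highest", while the quoted form produces the single shingle "Highest quality", which is not among the indexed tokens of that document. Note that the parenthesized form still analyzes each word separately, so it would match on unigrams as well; if only true shingle matches are wanted, the quoted form may be worth trying.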