Posted to solr-user@lucene.apache.org by vit <bu...@yahoo.com> on 2014/11/24 19:46:18 UTC

matching shingles issue

I am running Solr 4.2.1 and using the following analyzer:
<fieldType name="text_shingle" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="5"
            outputUnigrams="true" outputUnigramsIfNoShingles="false" tokenSeparator=" "/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="5"
            outputUnigrams="false" outputUnigramsIfNoShingles="true" tokenSeparator=" "/>
  </analyzer>
</fieldType>
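
To see which terms this field type would put in the index, here is a minimal Python sketch of the two analyzers. It is a simplified model, not Solr's implementation: the function names `whitespace_tokenize` and `shingle_filter` are made up for illustration, token positions are ignored, and the keyword arguments only mirror the attributes from the config above.

```python
def whitespace_tokenize(text):
    # Mirrors WhitespaceTokenizerFactory: split on whitespace only,
    # so punctuation stays attached to the word ("installations!").
    return text.split()


def shingle_filter(tokens, min_size=2, max_size=5, output_unigrams=True,
                   output_unigrams_if_no_shingles=False, separator=" "):
    # Simplified stand-in for ShingleFilterFactory (assumption: positions
    # and offsets do not matter for this illustration).
    out = []
    for start in range(len(tokens)):
        if output_unigrams:
            out.append(tokens[start])
        for size in range(min_size, max_size + 1):
            if start + size > len(tokens):
                break
            out.append(separator.join(tokens[start:start + size]))
    # Fall back to the plain tokens when nothing at all was produced.
    if not out and output_unigrams_if_no_shingles:
        out.extend(tokens)
    return out


doc = "Highest standards of quality installations!"
index_terms = shingle_filter(whitespace_tokenize(doc))

# Query-side settings from the config: no unigrams unless there are no shingles.
query_kwargs = dict(output_unigrams=False, output_unigrams_if_no_shingles=True)
single_word = shingle_filter(whitespace_tokenize("Highest"), **query_kwargs)
two_words = shingle_filter(whitespace_tokenize("Highest quality"), **query_kwargs)

print(index_terms)
print(single_word, single_word[0] in index_terms)
print(two_words, two_words[0] in index_terms)
```

In this model the indexed terms include the unigram "Highest" (because outputUnigrams="true" at index time) but not the shingle "Highest quality". So a query that reaches the field as the single word "Highest" can match the document, while one that reaches it as the two-word shingle cannot.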



For the query:
description_shingle:Highest quality

I get this result:
<arr name="description_shingle">
      <str>Highest standards of quality installations!</str>
</arr>

So the result does not contain the shingle "Highest quality".
Instead it contains
"Highest standards of quality"

Why am I getting this match?



--
View this message in context: http://lucene.472066.n3.nabble.com/matching-shingles-issue-tp4170685.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: matching shingles issue

Posted by Michael Sokolov <ms...@safaribooksonline.com>.
Maybe try:

description_shingle:(Highest quality)
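
The reason parentheses matter: in the standard Lucene/Solr query syntax a field prefix applies only to the term (or group) that directly follows it, so in the unparenthesized query only "Highest" goes to description_shingle and "quality" goes to the default field. The toy splitter below sketches that rule for three query shapes. It is not Solr's parser; `split_clauses` and the default field name "text" are assumptions made for illustration, and the quoted-phrase form is not something proposed in this thread.

```python
import re

DEFAULT_FIELD = "text"  # hypothetical; the thread does not name the default field

# Optional "field:" prefix, then a parenthesized group, a quoted phrase, or a bare term.
CLAUSE = re.compile(r'(?:(\w+):)?(?:\(([^)]*)\)|"([^"]*)"|(\S+))')


def split_clauses(query, default_field=DEFAULT_FIELD):
    """Return (field, text, is_phrase) tuples for a very small query subset."""
    clauses = []
    for m in CLAUSE.finditer(query):
        field = m.group(1) or default_field
        group, phrase, term = m.group(2), m.group(3), m.group(4)
        if group is not None:
            # Every term inside the parentheses inherits the field.
            clauses.extend((field, t, False) for t in group.split())
        elif phrase is not None:
            # A quoted phrase stays together as one clause.
            clauses.append((field, phrase, True))
        else:
            clauses.append((field, term, False))
    return clauses


for q in ('description_shingle:Highest quality',
          'description_shingle:(Highest quality)',
          'description_shingle:"Highest quality"'):
    print(q, '->', split_clauses(q))
```

One caveat, as far as I know: in Solr 4.x the query parser splits on whitespace before analysis, so even inside parentheses each word is analyzed on its own, and with outputUnigramsIfNoShingles="true" each would come out as a unigram. Under that assumption only the quoted phrase hands both words to the ShingleFilter together; the Analysis page in the Solr admin UI is a good place to check what your version actually produces.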

