Posted to dev@lucene.apache.org by "msreddy.hi" <ms...@gmail.com> on 2013/02/04 19:31:11 UTC

Wild card support when stemmers are added

Hi,

I am using Solr 3.6.1. I am facing an issue with wildcard search when
stemmers (KStem / Snowball) are added.

I have a field type called "field_search_1".

<fieldType name="field_search_1" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords/protwords.txt"/>
    <filter class="solr.KStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms/synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords/protwords.txt"/>
    <filter class="solr.KStemFilterFactory"/>
  </analyzer>
</fieldType>

When I index "accessorising" into the field, it is stemmed to
“accessorise” and indexed in that form.

So if I search for:

1 -> “accessorising” (no wildcard): query-time stemming converts it to
“accessorise”, so the document is found.
2 -> “acc?ssorising” (with the wildcard ?): the term is not stemmed and
stays “acc?ssorising”, so the document is not found.
3 -> “access?rise” (with the wildcard ?): the term is not stemmed and
stays “access?rise”, and the document is found because the indexed
stemmed word is “accessorise”.

So, does wildcard search not give the expected results when stemming is
in use? Please suggest a solution that makes scenario 2 match in the
same way as scenario 1.

Thanks in advance.

Saida Reddy.




--
View this message in context: http://lucene.472066.n3.nabble.com/Wild-card-support-when-stemmers-are-added-tp4038402.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: Wild card support when stemmers are added

Posted by "msreddy.hi" <ms...@gmail.com>.
@Mikhail Khludnev - Yes, same thing. 




--
View this message in context: http://lucene.472066.n3.nabble.com/Wild-card-support-when-stemmers-are-added-tp4038402p4038979.html


Re: Wild card support when stemmers are added

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Guys,

I'm a little bit out of context, but aren't you talking about
http://searchhub.org/2011/11/29/whats-with-lowercasing-wildcard-multiterm-queries-in-solr/?


On Tue, Feb 5, 2013 at 1:40 PM, msreddy.hi <ms...@gmail.com> wrote:

> Thanks Jack.
>
> I will look at the option of implementing work around.
>
> --Saida Reddy.


-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
 <mk...@griddynamics.com>

Re: Wild card support when stemmers are added

Posted by Ahmet Arslan <io...@yahoo.com>.
Hi,

You have two separate options, both of which keep "accessorising" and "accessorise" in the index.

1) https://issues.apache.org/jira/browse/SOLR-3231

2) Create an unstemmed field and run wildcard queries against it too.
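
Option 2 could look roughly like the schema.xml fragment below. This is only a sketch under assumptions: the names text_unstemmed, content and content_exact are hypothetical and do not come from the original configuration, and it assumes the stemmed field uses the field_search_1 type shown in the question. The unstemmed type keeps the same tokenizer and lowercasing but drops the stemmer, so the full surface form "accessorising" is indexed alongside the stemmed "accessorise".

```xml
<!-- Hypothetical names: text_unstemmed, content, content_exact -->
<!-- Same tokenizer and lowercasing as field_search_1, but no stemmer -->
<fieldType name="text_unstemmed" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Stemmed field (assumed to use the type from the question) -->
<field name="content" type="field_search_1" indexed="true" stored="true"/>
<!-- Unstemmed copy, used as an extra target for wildcard queries -->
<field name="content_exact" type="text_unstemmed" indexed="true" stored="false"/>
<copyField source="content" dest="content_exact"/>
```

With something like this in place, a wildcard query such as content_exact:acc?ssorising is compared against the unstemmed term "accessorising", while content:access?rise still matches the stemmed term as before. Both fields need to be queried to cover scenarios 2 and 3, and the documents have to be reindexed after the schema change.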



--- On Tue, 2/5/13, msreddy.hi <ms...@gmail.com> wrote:

> From: msreddy.hi <ms...@gmail.com>
> Subject: Re: Wild card support when stemmers are added
> To: dev@lucene.apache.org
> Date: Tuesday, February 5, 2013, 11:40 AM
> Thanks Jack.
> 
> I will look at the option of implementing work around.
> 
> --Saida Reddy.


Re: Wild card support when stemmers are added

Posted by "msreddy.hi" <ms...@gmail.com>.
Thanks Jack.

I will look at the option of implementing work around.

--Saida Reddy.



--
View this message in context: http://lucene.472066.n3.nabble.com/Wild-card-support-when-stemmers-are-added-tp4038402p4038511.html


Re: Wild card support when stemmers are added

Posted by Jack Krupansky <ja...@basetechnology.com>.
Yes, as you have discovered, use of a wildcard suppresses a large portion of 
the analyzer filter chain, including but not limited to stemming. In short, 
you must manually transform the source query term before adding any 
wildcard. This is a known issue and limitation, with no known fix other than 
the suggested workaround.

-- Jack Krupansky
