Posted to dev@lucene.apache.org by "msreddy.hi" <ms...@gmail.com> on 2013/02/04 19:31:11 UTC
Wild card support when stemmers are added
Hi,
I am using Solr 3.6.1 and am facing an issue with wildcard search when
stemmers (KStem / Snowball) are added.
I have a field type called "field_search_1":
<fieldType name="field_search_1" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords/protwords.txt"/>
    <filter class="solr.KStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory"
            synonyms="synonyms/synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords/protwords.txt"/>
    <filter class="solr.KStemFilterFactory"/>
  </analyzer>
</fieldType>
When I index "accessorising" into the field, it is stemmed to
“accessorise” and indexed in that form.
So if I search for:
1 -> “accessorising” (no wildcard): query-time stemming converts it to
“accessorise”, so the document is found.
2 -> “acc?ssorising” (with the wildcard ?): query-time stemming leaves it as
“acc?ssorising” (no change), so the document is not found.
3 -> “access?rise” (with the wildcard ?): query-time stemming leaves it as
“access?rise” (no change), and the document is found because the indexed
stemmed word is “accessorise”.
So, do wildcards combined with stemming not give the expected results? Please
suggest a solution so that scenarios 1 and 2 both return the result.
Thanks in advance.
Saida Reddy.
--
View this message in context: http://lucene.472066.n3.nabble.com/Wild-card-support-when-stemmers-are-added-tp4038402.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
Re: Wild card support when stemmers are added
Posted by "msreddy.hi" <ms...@gmail.com>.
@Mikhail Khludnev - Yes, same thing.
Re: Wild card support when stemmers are added
Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Guys,
I'm a little bit out of context, but aren't you talking about
http://searchhub.org/2011/11/29/whats-with-lowercasing-wildcard-multiterm-queries-in-solr/?
On Tue, Feb 5, 2013 at 1:40 PM, msreddy.hi <ms...@gmail.com> wrote:
> Thanks Jack.
>
> I will look at the option of implementing work around.
>
> --Saida Reddy.
--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics
<http://www.griddynamics.com>
<mk...@griddynamics.com>
Re: Wild card support when stemmers are added
Posted by Ahmet Arslan <io...@yahoo.com>.
Hi,
You have two separate options, both of which keep "accessorising" as well as "accessorise" in the index:
1) https://issues.apache.org/jira/browse/SOLR-3231
2) Create an unstemmed field and run wildcard queries against it too.
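Option 2 could look roughly like the schema fragment below. This is only a sketch under assumptions: the field type name "field_search_1_unstemmed" and the field names "title" and "title_unstemmed" are hypothetical and do not come from the original schema. The idea is to copy the same source text into a second field whose analyzer lowercases but does not stem, so the indexed term stays "accessorising".

```xml
<!-- Hypothetical unstemmed counterpart of field_search_1:
     same tokenizer and lowercasing, but no stemmer. -->
<fieldType name="field_search_1_unstemmed" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field names; adapt to the real schema. -->
<field name="title" type="field_search_1" indexed="true" stored="true"/>
<field name="title_unstemmed" type="field_search_1_unstemmed"
       indexed="true" stored="false"/>

<!-- Feed the unstemmed field from the same source text. -->
<copyField source="title" dest="title_unstemmed"/>
```

With a setup like this, a wildcard query would be sent to both fields, for example title:acc?ssorising OR title_unstemmed:acc?ssorising, so that a pattern can match either the stemmed or the original form. A full reindex would be needed after changing the schema.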
--- On Tue, 2/5/13, msreddy.hi <ms...@gmail.com> wrote:
> From: msreddy.hi <ms...@gmail.com>
> Subject: Re: Wild card support when stemmers are added
> To: dev@lucene.apache.org
> Date: Tuesday, February 5, 2013, 11:40 AM
> Thanks Jack.
>
> I will look at the option of implementing work around.
>
> --Saida Reddy.
Re: Wild card support when stemmers are added
Posted by "msreddy.hi" <ms...@gmail.com>.
Thanks Jack.
I will look at the option of implementing the workaround.
--Saida Reddy.
Re: Wild card support when stemmers are added
Posted by Jack Krupansky <ja...@basetechnology.com>.
Yes, as you have discovered, use of a wildcard suppresses a large portion of
the analyzer filter chain, including but not limited to stemming. In short,
you must manually transform the source query term before adding any
wildcard. This is a known issue and limitation, with no known fix other than
the suggested workaround.
-- Jack Krupansky