Posted to solr-user@lucene.apache.org by Rohan Thakur <ro...@gmail.com> on 2013/04/09 08:32:49 UTC

spell suggestions help

Hi all,

For every other query I get correct suggestions, but in these two cases I
am not getting the suggestions I expect:

1) I have the words kettle (doc frequency = 5) and cable (doc frequency = 1)
indexed in the direct Solr spell checker. When I query for cattle, I get
cable as the only suggestion and not kettle. Why is this happening? I want
kettle in the suggestions as well. I am using Jaro-Winkler distance,
according to which the score for cattle => cable comes out to 0.857 and the
score for cattle => kettle comes out to 0.777, so kettle should also appear
in the suggestions, but it does not. How can I correct this?

2) How do I query for a phrase like "hand blandar & chopper"? The & acts as
a delimiter in the Solr query, so this query returns an error.

Thanks in advance.
Regards,
Rohan
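
The two scores quoted above can be reproduced with a textbook Jaro-Winkler implementation. The sketch below is an illustration only, not Lucene's code; the function names, the 0.1 scaling factor and the 4-character prefix cap are the usual textbook choices and are assumptions here.

```python
def jaro(s1: str, s2: str) -> float:
    """Plain Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    # Characters count as matching only within this sliding window.
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    matched1 = [False] * len(s1)
    matched2 = [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len(s2), i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len(s1)
            + matches / len(s2)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, scaling: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted by the length of the common prefix."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * scaling * (1 - j)


print(f"{jaro_winkler('cattle', 'cable'):.4f}")   # 0.8578
print(f"{jaro_winkler('cattle', 'kettle'):.4f}")  # 0.7778
```

These agree with the 0.857 and 0.777 reported in the message (up to truncation). Note that cable gets its higher score from the shared "ca" prefix, while kettle shares no prefix with cattle at all. The thread does not confirm the cause, but since the score for kettle is not especially low, it may be worth checking the settings that control which candidate terms are generated in the first place, such as the `minPrefix` parameter of DirectSolrSpellChecker, which requires candidates to share leading characters with the query term.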

Re: spell suggestions help

Posted by Jack Krupansky <ja...@basetechnology.com>.
Be sure to use the Solr Admin UI Analysis page to verify what is happening 
at each stage of analysis. For BOTH "index" and "query".

You only showed us your "query" analyzer... show us the "index" analyzer as 
well.

Did you make sure to delete the index data and completely reindex after 
changing the "index" analyzer?

Or maybe your "index" and "query" analyzers are not in sync or not compatible.

Do you have anything in your stopwords file? "and" is usually considered a 
stop word - so the stop filter would remove it.

-- Jack Krupansky
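
The stop word point can be illustrated with a rough simulation of the query analyzer from this thread. This is a sketch under assumptions, not Solr code: `analyze` is a hypothetical helper, Python's `str.split` stands in for the whitespace tokenizer, the synonym filter is left out, and the stop list is assumed to contain "and" (the contents of lang/stopwords_en.txt are not shown in the thread).

```python
import re

# Assumption: the stop word file contains "and".
STOPWORDS = frozenset({"and"})


def analyze(text: str, stopwords: frozenset = STOPWORDS) -> list:
    """Mimic: pattern-replace char filter -> whitespace tokenizer -> stop filter."""
    # Char filter: replace every "&" with "and" before tokenizing.
    text = re.sub("&", "and", text)
    # Whitespace tokenizer: split on runs of whitespace only.
    tokens = text.split()
    # Stop filter with ignoreCase="true": drop tokens found in the stop list.
    return [t for t in tokens if t.lower() not in stopwords]


print(analyze("hand blandar & chopper"))
# ['hand', 'blandar', 'chopper']
print(analyze("hand blandar & chopper", stopwords=frozenset()))
# ['hand', 'blandar', 'and', 'chopper']
```

With "and" in the stop list, the replacement produced by the char filter is removed again by the stop filter, so the final tokens look the same as if the & had simply been dropped. The Analysis page shows the real output of each stage.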



Re: spell suggestions help

Posted by Rohan Thakur <ro...@gmail.com>.
Hi Jack,

I am using only the whitespace tokenizer, and before it I am using a pattern
replace char filter to replace &amp; with "and", but I guess it is not working.

My query analyzer:
</analyzer>
<analyzer type="query">
  <charFilter class="solr.PatternReplaceCharFilterFactory"
              pattern="&amp;" replacement="and"/>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
          ignoreCase="true" expand="true"/>
  <filter class="solr.StopFilterFactory"
          ignoreCase="true"
          words="lang/stopwords_en.txt"
          enablePositionIncrements="true"




Re: spell suggestions help

Posted by Jack Krupansky <ja...@basetechnology.com>.
Try replacing the standard tokenizer with the whitespace tokenizer in your
field types. And make sure not to use any other token filters that might
discard special characters (or provide a character map if they support one).

Also, be sure to try your test terms in the Solr Admin UI Analysis page to
see whether the "&" is preserved, or at which stage of term analysis it gets
discarded.

-- Jack Krupansky



Re: spell suggestions help

Posted by Rohan Thakur <ro...@gmail.com>.
With URL encoding, the & is effectively replaced with a space, so the results
include documents that match only a single term. For example, mobile &
accessories becomes mobile accessories, and the results include documents
that contain only accessories, which I don't want. How can I tackle this? I
tried a pattern replace filter at query time to replace & with "and", using
&amp; as the pattern and "and" as the replacement, but it did not work. Any
guesses or help?

Thanks,
Regards,
Rohan



Re: spell suggestions help

Posted by Rohan Thakur <ro...@gmail.com>.
Hi Erick,

Do we have to do the URL encoding on the PHP side, or does Solr support
URL encoding itself?



Re: spell suggestions help

Posted by Erick Erickson <er...@gmail.com>.
Try URL encoding it and/or escaping the &
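
As a minimal sketch of the URL-encoding suggestion, the example below builds the query string on the client side, so that the & inside the search phrase is sent as %26 instead of being read as a parameter separator. It is written in Python for illustration; `build_query_string` is a hypothetical helper, and only the `q` parameter from this thread is used.

```python
from urllib.parse import parse_qs, urlencode


def build_query_string(user_query: str) -> str:
    """Percent-encode the user's text so "&" stays part of the q value."""
    return urlencode({"q": user_query})


encoded = build_query_string("hand blandar & chopper")
print(encoded)
# q=hand+blandar+%26+chopper

# The server-side decoding recovers the full phrase, "&" included.
print(parse_qs(encoded)["q"])
# ['hand blandar & chopper']

# Without encoding, the "&" splits the string and the q value is cut short.
print(parse_qs("q=hand blandar & chopper")["q"][0].strip())
# hand blandar
```

The encoding has to happen in the client before the request is sent; in PHP, the built-in `urlencode()` or `http_build_query()` functions serve the same purpose. Encoding only gets the & to Solr intact. Whether it survives after that depends on the field's analyzer, which is a separate question.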
