Posted to solr-user@lucene.apache.org by Dhanesh Radhakrishnan <dh...@hifx.co.in> on 2020/01/09 15:19:53 UTC

Solr suggester : duplicate suggestions

Dear all,
I'm facing two issues with the Solr suggester component.

*First*
If I type "Fire and safety", I get suggestions. But if I type "Fire &
safety", the suggester returns nothing.

*Second*
I'm getting duplicate suggestions from the suggester:

 "suggest": {
        "categorySuggester": {
            "software": {
                "numFound": 100,
                "suggestions": [
                    {
                        "term": "Software And Web Development||6070",
                        "weight": 0,
                        "payload": ""
                    },
                    {
                        "term": "Software And Web Development||6070",
                        "weight": 0,
                        "payload": ""
                    },
                    {
                        "term": "Software And Web Development||6070",
                        "weight": 0,
                        "payload": ""
                    }
                    ........
                    ........
                    ........

                ]
            }
        }
    }
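
A possible stopgap for the duplicates, sketched under assumptions rather than taken from this thread: if the repeated entries come from several documents carrying the same categoryAutoComplete value (DocumentDictionaryFactory reads its entries from documents, so this seems plausible but is not confirmed here), the client can collapse identical terms after the response arrives. The function name dedupe_suggestions and the shortened sample response are hypothetical; only the JSON shape is copied from the output above.

```python
import json


def dedupe_suggestions(response, dictionary, query):
    """Return the suggestions for `query`, keeping the first entry per term."""
    entries = response["suggest"][dictionary][query]["suggestions"]
    seen = set()
    unique = []
    for entry in entries:
        if entry["term"] not in seen:
            seen.add(entry["term"])
            unique.append(entry)
    return unique


# Hypothetical sample: same shape as the response above, cut to three entries.
sample = json.loads("""
{
  "suggest": {
    "categorySuggester": {
      "software": {
        "numFound": 3,
        "suggestions": [
          {"term": "Software And Web Development||6070", "weight": 0, "payload": ""},
          {"term": "Software And Web Development||6070", "weight": 0, "payload": ""},
          {"term": "Software And Web Development||6070", "weight": 0, "payload": ""}
        ]
      }
    }
  }
}
""")

unique = dedupe_suggestions(sample, "categorySuggester", "software")
print([s["term"] for s in unique])  # ['Software And Web Development||6070']
```

This only hides the repetition on the client side. It does not change what the suggester index contains, and with a fixed suggest.count the duplicates may still crowd out other terms.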



Here is my configuration

In solrconfig.xml


<searchComponent name="suggest" class="solr.SuggestComponent">
    <lst name="suggester">
        <str name="name">categorySuggester</str>
        <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
        <str name="suggestAnalyzerFieldType">text_suggest</str>
        <str name="dictionaryImpl">DocumentDictionaryFactory</str>
        <str name="field">categoryAutoComplete</str>
        <str name="weightField">categoryRank</str>
        <str name="buildOnStartup">false</str>
        <str name="buildOnCommit">false</str>
        <str name="indexPath">/dictionary/category</str>
        <bool name="exactMatchFirst">true</bool>
        <str name="highlight">false</str>
    </lst>
</searchComponent>



In schema.xml

<field name="categoryAutoComplete" type="text_suggest" indexed="true"
       stored="true" multiValued="true" />

<fieldType class="solr.TextField" name="text_suggest"
           positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1" catenateWords="1"
                catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.StandardFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
                ignoreCase="true" expand="true"
                tokenizerFactory="solr.KeywordTokenizerFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        <!-- <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="(&amp;amp;)" replacement="&amp;"/> -->
    </analyzer>
</fieldType>

http://localhost:8983/solr/core-name/suggest?suggest=true&suggest.q=software&suggest.build=false&suggest.dictionary=categorySuggester&wt=json

Please help.

Thanks & Regards,
dhanesh s r


Dhanesh S.R
Senior Technical Lead
e : dhanesh@hifx.co.in | w : www.hifx.in
t : (+91) 484 4011750
m : (+91) 994 666 6703

-- 
IMPORTANT: This is an e-mail from HiFX IT Media Services Pvt. Ltd. Its 
content are confidential to the intended recipient. If you are not the 
intended recipient, be advised that you have received this e-mail in error 
and that any use, dissemination, forwarding, printing or copying of this 
e-mail is strictly prohibited. It may not be disclosed to or used by anyone 
other than its intended recipient, nor may it be copied in any way. If 
received in error, please email a reply to the sender, then delete it from 
your system. 

Although this e-mail has been scanned for viruses, HiFX cannot ultimately
accept any responsibility for viruses and it is your responsibility to scan
attachments (if any).

Before you print this email or attachments, please consider the negative
environmental impacts associated with printing.

Re: Solr suggester : duplicate suggestions

Posted by Dhanesh Radhakrishnan <dh...@hifx.co.in>.
@Paras Lehana, thanks for the reply.
Yes, "and" is present in the stopwords list.





On Fri, Jan 10, 2020 at 3:07 PM Paras Lehana <pa...@indiamart.com>
wrote:

> Hi Dhanesh,
>
> Although I work on Auto-Suggest, I have only a little experience with the
> Suggester component. The Suggester provides results as you type. Do you
> really need it?
>
> Also, I don't know if I'm correct, but where have you configured '&' to be
> replaced with 'and'? Is 'and' present in your stopwords list?
>
> I think posting the query and results for both cases of the first problem
> will help us more.
>


Re: Solr suggester : duplicate suggestions

Posted by Paras Lehana <pa...@indiamart.com>.
Hi Dhanesh,

Although I work on Auto-Suggest, I have only a little experience with the
Suggester component. The Suggester provides results as you type. Do you
really need it?

Also, I don't know if I'm correct, but where have you configured '&' to be
replaced with 'and'? Is 'and' present in your stopwords list?

I think posting the query and results for both cases of the first problem
will help us more.
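
One thing worth ruling out for the first problem, offered as an assumption and not a diagnosis: if "Fire & safety" is put into the request URL unencoded, the '&' is read as a parameter separator and suggest.q arrives as just "Fire ". The sketch below builds the request with Python's urllib.parse so the '&' is sent as %26. The helper name suggest_url is hypothetical; the base URL and parameters are the ones from the original post.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Base URL as given in the original post.
BASE = "http://localhost:8983/solr/core-name/suggest"


def suggest_url(text, dictionary="categorySuggester"):
    """Build a suggest request URL with every parameter value percent-encoded."""
    params = {
        "suggest": "true",
        "suggest.q": text,
        "suggest.build": "false",
        "suggest.dictionary": dictionary,
        "wt": "json",
    }
    return BASE + "?" + urlencode(params)


# What a server-side parser sees when the '&' is left unencoded:
raw_query = "suggest=true&suggest.q=Fire & safety&wt=json"
truncated = parse_qs(raw_query)["suggest.q"]

# What it sees when the value is encoded first:
url = suggest_url("Fire & safety")
round_trip = parse_qs(urlsplit(url).query)["suggest.q"]

print(truncated)   # ['Fire ']
print(url)
print(round_trip)  # ['Fire & safety']
```

If the encoded request still returns nothing, the cause is more likely in the analysis chain, and the query and results for both cases would help narrow it down.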



-- 
Regards,

*Paras Lehana* [65871]
Development Engineer, Auto-Suggest,
IndiaMART Intermesh Ltd.

8th Floor, Tower A, Advant-Navis Business Park, Sector 142,
Noida, UP, IN - 201303

Mob.: +91-9560911996
Work: 01203916600 | Extn:  *8173*
