Posted to solr-user@lucene.apache.org by jon kerling <jo...@yahoo.com.INVALID> on 2015/06/18 11:48:37 UTC

Duplicate suggestions

Hi,
I am using Solr 5.1 and I'm getting duplicate suggestions from my Solr suggester. I'm using AnalyzingInfixLookupFactory and DocumentDictionaryFactory. Can I configure it to return only distinct suggestions?

Here are the details of my configuration:

From solrconfig.xml:

  <searchComponent name="suggest" class="solr.SuggestComponent">
    <lst name="suggester">
      <str name="name">mySuggester1a</str>
      <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
      <str name="indexPath">suggester_infix_dir1a</str>
      <str name="allTermsRequired">true</str>
      <str name="dictionaryImpl">DocumentDictionaryFactory</str>
      <str name="field">f1</str>
      <str name="weightField">weightField</str>
      <str name="suggestAnalyzerFieldType">text_general</str>
      <str name="buildOnStartup">false</str>
    </lst>

    <lst name="suggester">
      <str name="name">mySuggester2a</str>
      <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
      <str name="indexPath">suggester_infix_dir2a</str>
      <str name="allTermsRequired">true</str>
      <str name="dictionaryImpl">DocumentDictionaryFactory</str>
      <str name="field">f2</str>
      <str name="weightField">weightField</str>
      <str name="suggestAnalyzerFieldType">text_general</str>
      <str name="buildOnStartup">false</str>
    </lst>
  </searchComponent>

  <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
      <str name="suggest">true</str>
      <str name="suggest.count">6</str>
      <str name="suggest.dictionary">mySuggester1a</str>
      <str name="suggest.dictionary">mySuggester2a</str>
    </lst>
    <arr name="components">
      <str>suggest</str>
    </arr>
  </requestHandler> 

From schema.xml:

<field name="f1" type="string" indexed="true" stored="true" required="false" multiValued="false" />
<field name="f2" type="string" indexed="true" stored="true" required="false" multiValued="false" />
<field name="weightField" type="float" indexed="true" stored="true"/>

** I am not using weightField; I never add any values to it.

Document example:

<doc>
    <str name="f1">2015-04-01</str>
    <str name="f2">12:06:00</str>
    <str name="f3">BOOO</str>
    <str name="f4"/>
    <str name="f5">7.52.11.212</str>
    <str name="f6">7.52.11.213</str>
    <str name="OID">52358424</str>
</doc>

After I build the suggester, I request suggestions like this:
http://localhost/solr/core1/suggest?suggest=true&suggest.q=12

<?xml version="1.0" encoding="UTF-8"?>
<response>
   <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">62</int>
   </lst>
   <lst name="suggest">
      <lst name="mySuggester2a">
         <lst name="12">
            <int name="numFound">6</int>
            <arr name="suggestions">
               <lst>
                  <str name="term">18:34:&lt;b&gt;12&lt;/b&gt;</str>
                  <long name="weight">0</long>
                  <str name="payload" />
               </lst>
               <lst>
                  <str name="term">18:34:&lt;b&gt;12&lt;/b&gt;</str>
                  <long name="weight">0</long>
                  <str name="payload" />
               </lst>
               <lst>
                  <str name="term">18:35:&lt;b&gt;12&lt;/b&gt;</str>
                  <long name="weight">0</long>
                  <str name="payload" />
               </lst>
               <lst>
                  <str name="term">18:35:&lt;b&gt;12&lt;/b&gt;</str>
                  <long name="weight">0</long>
                  <str name="payload" />
               </lst>
               <lst>
                  <str name="term">18:35:&lt;b&gt;12&lt;/b&gt;</str>
                  <long name="weight">0</long>
                  <str name="payload" />
               </lst>
               <lst>
                  <str name="term">&lt;b&gt;12&lt;/b&gt;:06:02</str>
                  <long name="weight">0</long>
                  <str name="payload" />
               </lst>
            </arr>
         </lst>
      </lst>
      <lst name="mySuggester1a">
         <lst name="12">
            <int name="numFound">0</int>
            <arr name="suggestions" />
         </lst>
      </lst>
   </lst>
</response>

I would like to get this kind of suggester response (no duplicates):

<?xml version="1.0" encoding="UTF-8"?>
<response>
   <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">62</int>
   </lst>
   <lst name="suggest">
      <lst name="mySuggester2a">
         <lst name="12">
            <int name="numFound">3</int>
            <arr name="suggestions">
               <lst>
                  <str name="term">18:34:&lt;b&gt;12&lt;/b&gt;</str>
                  <long name="weight">0</long>
                  <str name="payload" />
               </lst>
               <lst>
                  <str name="term">18:35:&lt;b&gt;12&lt;/b&gt;</str>
                  <long name="weight">0</long>
                  <str name="payload" />
               </lst>
               <lst>
                  <str name="term">&lt;b&gt;12&lt;/b&gt;:06:02</str>
                  <long name="weight">0</long>
                  <str name="payload" />
               </lst>
            </arr>
         </lst>
      </lst>
      <lst name="mySuggester1a">
         <lst name="12">
            <int name="numFound">0</int>
            <arr name="suggestions" />
         </lst>
      </lst>
   </lst>
</response>

Thank you.

Re: Duplicate suggestions

Posted by jon kerling <jo...@yahoo.com.INVALID>.
I do have an intermediate API, so I'll change the collection type as you suggested. Thank you for your reply.
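
For an intermediate layer that reads the XML response directly, a rough
sketch of pulling out the distinct terms might look like the following. The
class and method names are made up for illustration, and it assumes the
response has the same shape as the one posted above (every suggestion is a
<str name="term"> element). Note that it merges terms from all suggesters in
the response into one set, which may or may not be what you want.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Solr or SolrJ.
public class SuggestResponseTerms {

    /**
     * Collects the text of every <str name="term"> element in a suggest
     * response. A LinkedHashSet drops repeats and keeps the order in which
     * the terms first appeared.
     */
    public static Set<String> distinctTerms(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        Set<String> terms = new LinkedHashSet<>();
        NodeList strs = doc.getElementsByTagName("str");
        for (int i = 0; i < strs.getLength(); i++) {
            Element e = (Element) strs.item(i);
            // Skip other <str> elements such as the empty "payload" entries.
            if ("term".equals(e.getAttribute("name"))) {
                // getTextContent() un-escapes &lt;b&gt; back to <b>.
                terms.add(e.getTextContent());
            }
        }
        return terms;
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<response><lst name=\"suggest\"><arr name=\"suggestions\">"
            + "<lst><str name=\"term\">18:34:&lt;b&gt;12&lt;/b&gt;</str><str name=\"payload\"/></lst>"
            + "<lst><str name=\"term\">18:34:&lt;b&gt;12&lt;/b&gt;</str><str name=\"payload\"/></lst>"
            + "<lst><str name=\"term\">18:35:&lt;b&gt;12&lt;/b&gt;</str><str name=\"payload\"/></lst>"
            + "</arr></lst></response>";
        System.out.println(distinctTerms(xml));
    }
}
```

Because the highlighted strings for duplicate suggestions are identical,
comparing the raw term text is enough here; no need to strip the <b> tags
first.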



   

Re: Duplicate suggestions

Posted by Alessandro Benedetti <be...@gmail.com>.
I had the very same issue, because I had some documents with a redundant
field, and I was using the Infix Suggester as well.

Because the Infix Suggester returns the whole field content, if the same
field value is duplicated across your docs, you will see duplicate
suggestions.

Do you have any intermediate API in your application? If so, you can
modify the API to hold and return the suggestions in a Collection that
prevents duplicates.
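
As a minimal sketch of that idea (the class and method names here are
invented for the example, and it assumes the API already has the suggestion
terms as a list of strings), a LinkedHashSet removes repeats while keeping
the order Solr returned them in:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for the intermediate API, not part of Solr.
public class SuggestionDedup {

    /**
     * Returns the distinct terms in first-seen order, capped at limit.
     */
    public static List<String> distinctTerms(List<String> terms, int limit) {
        // LinkedHashSet drops repeats but preserves insertion order.
        Set<String> seen = new LinkedHashSet<>(terms);
        List<String> result = new ArrayList<>(seen);
        if (result.size() > limit) {
            return new ArrayList<>(result.subList(0, limit));
        }
        return result;
    }

    public static void main(String[] args) {
        // The six terms from the response in the original post (highlighting omitted).
        List<String> fromSolr = Arrays.asList(
            "18:34:12", "18:34:12", "18:35:12", "18:35:12", "18:35:12", "12:06:02");
        System.out.println(distinctTerms(fromSolr, 6));
        // prints [18:34:12, 18:35:12, 12:06:02]
    }
}
```

One thing to keep in mind: the duplicates are removed after Solr has already
applied suggest.count, so you can end up with fewer suggestions than you
asked for (3 instead of 6 in the example). Requesting a larger suggest.count
and trimming to the size you want on the API side is one way around that.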

If you want it directly from Solr, I assume it is a "bug".
I think the suggestions should contain no duplicates by default (because
the only information returned is the field value and not the document id).
Anyway, it could be a nice parameter for getting better suggestions
(sending an avoidDuplicate parameter to the suggester).

Cheers

