Posted to solr-user@lucene.apache.org by "Audrey Lorberfeld - Audrey.Lorberfeld@ibm.com" <Au...@ibm.com> on 2020/03/04 21:22:39 UTC

exactMatchFirst Solr Suggestion Component

Hi All,

Would anyone be able to help me debug my suggestion component? Right now, our config looks like this: 

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">FileDictionaryFactory</str>
    <str name="sourceLocation">./conf/queries_list_with_weights.txt</str>
    <str name="fieldDelimiter">,</str>
    <str name="storeDir">conf</str>
    <str name="suggestAnalyzerFieldType">keywords_w3_en</str>
    <str name="buildOnStartup">false</str>
  </lst>
</searchComponent>
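
For context, here is a minimal Python sketch of how a request against this component could be built. The host, core name, and the "/suggest" handler path are assumptions, since the request handler is not shown in this post. Because buildOnStartup is false, the sketch lets the caller pass suggest.build=true to trigger a build explicitly.

```python
from urllib.parse import urlencode

# Hypothetical values: the host, core name, and handler path are
# assumptions and do not come from the config above.
SOLR_BASE = "http://localhost:8983/solr/mycore"
HANDLER = "/suggest"


def suggest_url(prefix, dictionary="mySuggester", build=False):
    """Return a suggest request URL for the given prefix."""
    params = {
        "suggest": "true",
        "suggest.dictionary": dictionary,  # matches <str name="name"> in the config
        "suggest.q": prefix,
        "wt": "json",
    }
    if build:
        # buildOnStartup is false, so the suggester has to be built on request.
        params["suggest.build"] = "true"
    return f"{SOLR_BASE}{HANDLER}?{urlencode(params)}"


if __name__ == "__main__":
    print(suggest_url("box", build=True))
```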

We like FuzzyLookupFactory because of how it handles misspelled prefixes. However, the exactMatchFirst parameter, which is supposed to default to true in the code, does not appear to be returning exact-match prefixes first. I suspect this is because of the weights attached to each term, but the documentation specifically states that exactMatchFirst is meant to ignore weights (https://builds.apache.org/view/L/view/Lucene/job/Solr-reference-guide-8.x/javadoc/suggester.html#fuzzylookupfactory).

For the prefix "box", this is what our suggestions list looks like. Note that "bond" ranks above results I would expect to precede it, such as "box@ibm":

{
  "responseHeader":{
    "status":0,
    "QTime":112},
  "command":"build",
  "suggest":{"mySuggester":{
      "box":{
        "numFound":8,
        "suggestions":[{
            "term":"box",
            "weight":1799,
            "payload":""},
          {
            "term":"bond",
            "weight":805,
            "payload":""},
          {
            "term":"box@ibm",
            "weight":202,
            "payload":""},
          {
            "term":"box at ibm",
            "weight":54,
            "payload":""},
          {
            "term":"books",
            "weight":45,
            "payload":""},
          {
            "term":"box drive",
            "weight":34,
            "payload":""},
          {
            "term":"books 24x7",
            "weight":31,
            "payload":""},
          {
            "term":"box sync",
            "weight":31,
            "payload":""}]}}}}

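To make the expected ordering concrete, here is a small Python sketch (not Solr code) that re-sorts the suggestions above so that terms starting with the literal typed prefix come first, ordered by weight within each group. The function name and the grouping rule are illustrative assumptions about the behavior I am hoping for, not a description of how FuzzyLookupFactory ranks results internally.

```python
# Terms and weights copied from the response above.
suggestions = [
    {"term": "box", "weight": 1799},
    {"term": "bond", "weight": 805},
    {"term": "box@ibm", "weight": 202},
    {"term": "box at ibm", "weight": 54},
    {"term": "books", "weight": 45},
    {"term": "box drive", "weight": 34},
    {"term": "books 24x7", "weight": 31},
    {"term": "box sync", "weight": 31},
]


def prefix_first(items, prefix):
    """Hypothetical helper: literal prefix matches first, then by weight."""
    return sorted(
        items,
        key=lambda s: (not s["term"].startswith(prefix), -s["weight"]),
    )


if __name__ == "__main__":
    for s in prefix_first(suggestions, "box"):
        print(s["weight"], s["term"])
```

With this rule, all five "box..." terms are listed before "bond", "books", and "books 24x7", which are only fuzzy matches for the prefix.
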
Any help is greatly appreciated!

Best,
Audrey