Posted to dev@lucene.apache.org by "Marius Grama (JIRA)" <ji...@apache.org> on 2015/05/15 15:15:01 UTC

[jira] [Comment Edited] (SOLR-7086) Suggester dictionary empty if payload field does not exist

    [ https://issues.apache.org/jira/browse/SOLR-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545459#comment-14545459 ] 

Marius Grama edited comment on SOLR-7086 at 5/15/15 1:14 PM:
-------------------------------------------------------------

I could reproduce the issue by using the sample suggester from the example techproducts collection.
{code:title=solrconfig.xml configuration}
<searchComponent name="suggest" class="solr.SuggestComponent" 
                   enable="${solr.suggester.enabled:false}"     >
    <lst name="suggester">
      <str name="name">mySuggester</str>
      <str name="lookupImpl">FuzzyLookupFactory</str>      
      <str name="dictionaryImpl">DocumentDictionaryFactory</str>
      <str name="field">cat</str>
      <str name="weightField">price</str>
      <!-- <str name="payloadField">not_exist_s</str> -->
      <str name="suggestAnalyzerFieldType">string</str>
    </lst>
  </searchComponent>

  <requestHandler name="/suggest" class="solr.SearchHandler" 
                  startup="lazy" enable="${solr.suggester.enabled:false}" >
    <lst name="defaults">
      <str name="suggest">true</str>
      <str name="suggest.count">10</str>
    </lst>
    <arr name="components">
      <str>suggest</str>
    </arr>
  </requestHandler>
{code}

To reproduce the issue I used the request http://localhost:8983/solr/techproducts/suggest?suggest=true&suggest.build=true&suggest.dictionary=mySuggester&wt=json&suggest.q=elec

When the payload field is not specified (as in the configuration example above), the result is:
{code}
{
    "responseHeader": {
        "status": 0,
        "QTime": 21
    },
    "command": "build",
    "suggest": {
        "mySuggester": {
            "elec": {
                "numFound": 3,
                "suggestions": [
                    {
                        "term": "electronics and computer1",
                        "weight": 2199,
                        "payload": ""
                    },
                    {
                        "term": "electronics",
                        "weight": 649,
                        "payload": ""
                    },
                    {
                        "term": "electronics and stuff2",
                        "weight": 279,
                        "payload": ""
                    }
                ]
            }
        }
    }
}
{code}
On the other hand, when the nonexistent payload field is specified (uncommented in the configuration example above), the result is empty:

{code}
{
    "responseHeader": {
        "status": 0,
        "QTime": 7
    },
    "command": "build",
    "suggest": {
        "mySuggester": {
            "elec": {
                "numFound": 0,
                "suggestions": []
            }
        }
    }
}
{code}

[~janhoy] the issue that you reported is caused by the class DocumentDictionary.DocumentInputIterator, which simply skips documents that have a null payload.
{code:language=java,title=DocumentDictionary.java}
        if (hasPayloads) {
          StorableField payload = doc.getField(payloadField);
          if (payload == null) {
            continue;
          } else if (payload.binaryValue() != null) {
            tempPayload =  payload.binaryValue();
          } else if (payload.stringValue() != null) {
            tempPayload = new BytesRef(payload.stringValue());
          } else {
            continue;
          }
        } else {
          tempPayload = null;
        } 
{code}

Since the payload is an optional attribute, I agree with [~janhoy] that this null-check breaks otherwise valid suggest scenarios: documents that merely lack a payload are dropped from the dictionary.

[~mikemccand], you wrote the first version of the class DocumentDictionary, and the null-check on the payload has been kept in every version up to the latest one. Can the null-check for the payload be modified, without breaking any existing functionality, so that null payloads are also accepted when the _payload_ field is specified for the suggester?
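
To make the question concrete, here is a minimal, self-contained sketch of the two behaviors. It is not the Lucene code: documents are modeled as plain maps, payloads as byte arrays instead of BytesRef, and the class and method names (PayloadCheckSketch, buildStrict, buildLenient) are hypothetical. It only illustrates the idea of substituting an empty payload instead of skipping the document, assuming that is an acceptable change.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Simplified model of the payload check; NOT the actual DocumentDictionary code. */
public class PayloadCheckSketch {

    /** One dictionary entry: the suggestion term plus its payload bytes. */
    public static final class Entry {
        public final String term;
        public final byte[] payload;

        Entry(String term, byte[] payload) {
            this.term = term;
            this.payload = payload;
        }
    }

    /** Mirrors the quoted snippet: a document without the payload field is skipped. */
    public static List<Entry> buildStrict(List<Map<String, String>> docs,
                                          String field, String payloadField) {
        List<Entry> entries = new ArrayList<>();
        for (Map<String, String> doc : docs) {
            String payload = doc.get(payloadField);
            if (payload == null) {
                continue; // document is dropped from the dictionary
            }
            entries.add(new Entry(doc.get(field), payload.getBytes(StandardCharsets.UTF_8)));
        }
        return entries;
    }

    /** Hypothetical relaxed variant: a missing payload becomes an empty payload. */
    public static List<Entry> buildLenient(List<Map<String, String>> docs,
                                           String field, String payloadField) {
        List<Entry> entries = new ArrayList<>();
        for (Map<String, String> doc : docs) {
            String payload = doc.get(payloadField);
            byte[] bytes = (payload == null)
                    ? new byte[0]
                    : payload.getBytes(StandardCharsets.UTF_8);
            entries.add(new Entry(doc.get(field), bytes));
        }
        return entries;
    }

    public static void main(String[] args) {
        // Three documents with a "cat" value and no "not_exist_s" field.
        List<Map<String, String>> docs = List.of(
                Map.of("cat", "electronics"),
                Map.of("cat", "electronics and computer1"),
                Map.of("cat", "electronics and stuff2"));

        System.out.println("strict:  " + buildStrict(docs, "cat", "not_exist_s").size());  // prints "strict:  0"
        System.out.println("lenient: " + buildLenient(docs, "cat", "not_exist_s").size()); // prints "lenient: 3"
    }
}
```

With the strict check the dictionary ends up empty, which matches the numFound 0 response above. With the lenient check all three terms are kept and each carries an empty payload.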




> Suggester dictionary empty if payload field does not exist
> ----------------------------------------------------------
>
>                 Key: SOLR-7086
>                 URL: https://issues.apache.org/jira/browse/SOLR-7086
>             Project: Solr
>          Issue Type: Bug
>          Components: Suggester
>    Affects Versions: 4.10.3
>            Reporter: Jan Høydahl
>             Fix For: Trunk, 5.2
>
>
> Setting up a suggester for authors using {{DocumentDictionaryFactory}} and {{payloadField}}. If no documents have data in the payload field, the dictionary will be empty.
> {code:xml}
>    <lst name="suggester">
>       <str name="name">authors</str>
>       <str name="lookupImpl">BlendedInfixLookupFactory</str>     
>       <str name="suggestAnalyzerFieldType">text_folded</str>
>       <str name="dictionaryImpl">DocumentDictionaryFactory</str>     
>       <str name="field">authors_s</str>
>       <str name="payloadField">not_exist_s</str>
>    </lst>
> {code}
> It should use an empty payload instead.


