Posted to issues@lucene.apache.org by "Cassandra Targett (Jira)" <ji...@apache.org> on 2019/11/07 19:29:00 UTC

[jira] [Updated] (SOLR-13838) igain query parser generating invalid output

     [ https://issues.apache.org/jira/browse/SOLR-13838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cassandra Targett updated SOLR-13838:
-------------------------------------
    Fix Version/s:     (was: 8.3)

> igain query parser generating invalid output
> --------------------------------------------
>
>                 Key: SOLR-13838
>                 URL: https://issues.apache.org/jira/browse/SOLR-13838
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: query parsers
>    Affects Versions: 8.2
>         Environment: The issue is a generic Java defect and therefore will be independent of the operating system or software platform.
>            Reporter: Peter Davie
>            Priority: Major
>         Attachments: IGainTermsQParserPlugin.java.patch
>
>
> Investigating the output from the "features()" stream source, terms are being returned with NaN for the score_f field:
>     "docs": [
>       {
>         "featureSet_s": "business",
>         "score_f": "NaN",
>         "term_s": "1,011.15",
>         "idf_d": "-Infinity",
>         "index_i": 1,
>         "id": "business_1"
>       },
>       {
>         "featureSet_s": "business",
>         "score_f": "NaN",
>         "term_s": "10.3m",
>         "idf_d": "-Infinity",
>         "index_i": 2,
>         "id": "business_2"
>       },
>       {
>         "featureSet_s": "business",
>         "score_f": "NaN",
>         "term_s": "01",
>         "idf_d": "-Infinity",
>         "index_i": 3,
>         "id": "business_3"
>       },...
> Looking into {{org/apache/solr/search/IGainTermsQParserPlugin.java}}, it seems that when a term occurs in neither the positive nor the negative documents, the docFreq calculation (docFreq = xc + nc) yields 0, so the subsequent calculations result in NaN (division by 0).
> Attached is a patch which skips terms for which docFreq is 0 in the finish() method of IGainTermsQParserPlugin. This resolves the issue with NaN scores in the features() output.
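
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not the code from IGainTermsQParserPlugin or from the attached patch: the class IGainSketch, its methods, and the simplified information-gain formula are assumptions made for illustration. Only the names xc, nc and docFreq come from the report. It shows how docFreq = 0 turns into NaN through a floating-point 0/0, and how skipping such terms keeps NaN out of the result.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: names and formula are illustrative assumptions,
// not the actual Solr implementation.
class IGainSketch {

    // x * log2(x), with the usual convention that 0 * log2(0) = 0.
    // A NaN input is not caught by the x == 0.0 check and propagates.
    static double xlog2(double x) {
        return x == 0.0 ? 0.0 : x * Math.log(x) / Math.log(2);
    }

    // Binary entropy of a set with `positives` positive docs out of `total`.
    // When total == 0 the ratio is 0.0 / 0, which is NaN in Java.
    static double binaryEntropy(int total, int positives) {
        double p = (double) positives / total;
        return -(xlog2(p) + xlog2(1.0 - p));
    }

    // Simplified information gain of a term.
    // xc = positive docs containing the term, nc = negative docs containing it.
    static double score(int xc, int nc, int numPositiveDocs, int numDocs) {
        int docFreq = xc + nc;
        double pTerm = (double) docFreq / numDocs;
        double hClass = binaryEntropy(numDocs, numPositiveDocs);
        double hWith = binaryEntropy(docFreq, xc); // NaN when docFreq == 0
        double hWithout = binaryEntropy(numDocs - docFreq, numPositiveDocs - xc);
        return hClass - (pTerm * hWith + (1.0 - pTerm) * hWithout);
    }

    // Scores every term, skipping those with docFreq == 0, in the spirit
    // of the fix described in the report. counts maps term -> {xc, nc}.
    static Map<String, Double> scoreTerms(Map<String, int[]> counts,
                                          int numPositiveDocs, int numDocs) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (Map.Entry<String, int[]> e : counts.entrySet()) {
            int xc = e.getValue()[0];
            int nc = e.getValue()[1];
            if (xc + nc == 0) {
                continue; // term occurs in no positive or negative document
            }
            scores.put(e.getKey(), score(xc, nc, numPositiveDocs, numDocs));
        }
        return scores;
    }
}
```

In this sketch, calling score(0, 0, 5, 10) yields NaN because 0 * NaN is still NaN, whereas scoreTerms() simply omits that term from its output. The guard covers only the docFreq = 0 case from the report.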


