Posted to issues@lucene.apache.org by "Cassandra Targett (Jira)" <ji...@apache.org> on 2019/11/07 19:29:00 UTC
[jira] [Updated] (SOLR-13838) igain query parser generating invalid output
[ https://issues.apache.org/jira/browse/SOLR-13838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Cassandra Targett updated SOLR-13838:
-------------------------------------
Fix Version/s: (was: 8.3)
> igain query parser generating invalid output
> --------------------------------------------
>
> Key: SOLR-13838
> URL: https://issues.apache.org/jira/browse/SOLR-13838
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: query parsers
> Affects Versions: 8.2
> Environment: The issue is a generic Java defect and therefore will be independent of the operating system or software platform.
> Reporter: Peter Davie
> Priority: Major
> Attachments: IGainTermsQParserPlugin.java.patch
>
>
> While investigating the output from the "features()" stream source, I found that terms are being returned with NaN for the score_f field:
> "docs": [
>   {
>     "featureSet_s": "business",
>     "score_f": "NaN",
>     "term_s": "1,011.15",
>     "idf_d": "-Infinity",
>     "index_i": 1,
>     "id": "business_1"
>   },
>   {
>     "featureSet_s": "business",
>     "score_f": "NaN",
>     "term_s": "10.3m",
>     "idf_d": "-Infinity",
>     "index_i": 2,
>     "id": "business_2"
>   },
>   {
>     "featureSet_s": "business",
>     "score_f": "NaN",
>     "term_s": "01",
>     "idf_d": "-Infinity",
>     "index_i": 3,
>     "id": "business_3"
>   },...
> Looking into org/apache/solr/search/IGainTermsQParserPlugin.java, it seems that when a term is not included in either the positive or the negative documents, the docFreq calculation (docFreq = xc + nc) is 0, which means that subsequent calculations result in NaN (division by 0).
> Attached is a patch that skips terms for which docFreq is 0 in the finish() method of IGainTermsQParserPlugin, which resolves the NaN scores in the features() output.
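
The failure mode described above can be reproduced outside Solr. The following is a minimal, self-contained sketch, not the actual IGainTermsQParserPlugin code or the attached patch: the class IGainNaNSketch and the methods binaryEntropy, infoGain and scoreTerms are hypothetical names, and the information-gain formula is a generic textbook form assumed for illustration. Only the names xc, nc and docFreq come from the report. It shows how docFreq = 0 turns into a 0.0/0.0 division that propagates NaN, and how skipping such terms avoids it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; NOT the Solr implementation.
public class IGainNaNSketch {

    public static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    // Binary entropy (in bits) of k positives out of n.
    // When n == 0, k / n is 0.0 / 0.0, which is NaN in IEEE 754 arithmetic.
    public static double binaryEntropy(double k, double n) {
        double p = k / n;
        if (p == 0.0 || p == 1.0) {
            return 0.0; // NaN fails both comparisons, so it falls through
        }
        return -(p * log2(p) + (1 - p) * log2(1 - p));
    }

    // Assumed textbook information gain for one term.
    // xc = positive docs containing the term, nc = negative docs containing it.
    // Only the docFreq == 0 edge case from the report is examined here.
    public static double infoGain(int xc, int nc, int numPositiveDocs, int numDocs) {
        int docFreq = xc + nc;
        double entropyC = binaryEntropy(numPositiveDocs, numDocs);
        double pContains = (double) docFreq / numDocs;
        double entropyContains = binaryEntropy(xc, docFreq); // NaN when docFreq == 0
        double entropyNotContains = binaryEntropy(numPositiveDocs - xc, numDocs - docFreq);
        // 0.0 * NaN is still NaN, so the bad term poisons the whole score.
        return entropyC - (pContains * entropyContains + (1 - pContains) * entropyNotContains);
    }

    // Guard in the spirit of the patch: skip terms whose docFreq is 0.
    // Each row of counts is {xc, nc} for one term.
    public static List<Double> scoreTerms(int[][] counts, int numPositiveDocs, int numDocs) {
        List<Double> scores = new ArrayList<>();
        for (int[] c : counts) {
            int docFreq = c[0] + c[1];
            if (docFreq == 0) {
                continue; // term occurs in neither positive nor negative docs
            }
            scores.add(infoGain(c[0], c[1], numPositiveDocs, numDocs));
        }
        return scores;
    }

    public static void main(String[] args) {
        System.out.println(infoGain(0, 0, 5, 10)); // prints NaN
        System.out.println(scoreTerms(new int[][] {{3, 1}, {0, 0}}, 5, 10).size()); // prints 1
    }
}
```

Skipping the term, rather than substituting a score of 0, keeps meaningless entries out of the ranked feature list altogether, which matches the behaviour the report attributes to the patch.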
--
This message was sent by Atlassian Jira
(v8.3.4#803005)