Posted to issues@lucene.apache.org by "Joel Bernstein (Jira)" <ji...@apache.org> on 2019/10/11 13:07:00 UTC
[jira] [Resolved] (SOLR-13829) RecursiveEvaluator casts Continuous
numbers to Discrete Numbers, causing mismatch
[ https://issues.apache.org/jira/browse/SOLR-13829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joel Bernstein resolved SOLR-13829.
-----------------------------------
Fix Version/s: 8.3
Resolution: Resolved
> RecursiveEvaluator casts Continuous numbers to Discrete Numbers, causing mismatch
> ---------------------------------------------------------------------------------
>
> Key: SOLR-13829
> URL: https://issues.apache.org/jira/browse/SOLR-13829
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Trey Grainger
> Priority: Major
> Fix For: 8.3
>
> Attachments: SOLR-13829.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When using the "sort" streaming expression on values computed from a float field (pfloat), I get casting errors. Whether the error occurs depends on which values are calculated from the underlying field values.
> Example:
> *Docs:* (paste each into "Documents" pane in Solr Admin UI as type:"json")
>
> {code:java}
> {"id": "1", "name":"donut","vector_fs":[5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0]}
> {"id": "2", "name":"cheese pizza","vector_fs":[5.0,0.0,4.0,4.0,0.0,1.0,5.0,2.0]}{code}
>
> *Streaming Expression:*
>
> {code:java}
> sort(select(search(food_collection, q="*:*", fl="id,vector_fs", sort="id asc"), cosineSimilarity(vector_fs, array(5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0)) as sim, id), by="sim desc"){code}
>
> *Response:*
>
> {code:java}
> {
>   "result-set": {
>     "docs": [
>       {
>         "EXCEPTION": "class java.lang.Double cannot be cast to class java.lang.Long (java.lang.Double and java.lang.Long are in module java.base of loader 'bootstrap')",
>         "EOF": true,
>         "RESPONSE_TIME": 13
>       }
>     ]
>   }
> }{code}
>
>
> This is because in org.apache.solr.client.solrj.io.eval.RecursiveEvaluator, there is a line which examines a numeric (BigDecimal) value and - regardless of the type of the field the value originated from - converts it to a Long if it looks like a whole number. This is the code in question from that class:
> {code:java}
> protected Object normalizeOutputType(Object value) {
>   if(null == value){
>     return null;
>   } else if (value instanceof VectorFunction) {
>     return value;
>   } else if(value instanceof BigDecimal){
>     BigDecimal bd = (BigDecimal)value;
>     if(bd.signum() == 0 || bd.scale() <= 0 || bd.stripTrailingZeros().scale() <= 0){
>       try{
>         return bd.longValueExact();
>       }
>       catch(ArithmeticException e){
>         // value was too big for a long, so use a double which can handle scientific notation
>       }
>     }
>
>     return bd.doubleValue();
>   }
>   ... [other type conversions]
> {code}
> Because of the *return bd.longValueExact();* line, the calculated value for "sim" in doc 1 is "Long(1)", whereas the calculated value for "sim" in doc 2 is "Double(0.88938313)". These come back as incompatible data types, even though the source data is all of the same type and should be comparable.
> Thus, when the *sort* streaming expression (and probably others) runs on these calculated values, the list should contain [0.88938313, 1.0], but an exception is thrown because it is trying to compare incompatible data types [Double(0.88938313), Long(1)].
> This bug is occurring on master currently, but has probably existed in the codebase since at least August 2017.
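
The mismatch described above can be reproduced outside Solr with a minimal, self-contained sketch. It assumes the two "sim" values reach the normalization step as BigDecimal("1.0") and BigDecimal("0.88938313"). The class name NormalizeOutputTypeDemo and both compare helpers are hypothetical and not part of Solr; only the BigDecimal branch is copied from the method quoted in the report. compareAsDouble is one possible type-agnostic comparison, not necessarily what the attached patch does.

```java
import java.math.BigDecimal;

// Hypothetical demo class for illustration only; not part of Solr.
class NormalizeOutputTypeDemo {

    // Mirrors only the BigDecimal branch of normalizeOutputType quoted in the report.
    static Object normalize(BigDecimal bd) {
        if (bd.signum() == 0 || bd.scale() <= 0 || bd.stripTrailingZeros().scale() <= 0) {
            try {
                return bd.longValueExact(); // whole numbers are boxed as Long
            } catch (ArithmeticException e) {
                // value was too big for a long, fall through to double
            }
        }
        return bd.doubleValue(); // everything else is boxed as Double
    }

    // Hypothetical naive comparison: assumes both values share the first value's type.
    static int compareAssumingSameType(Object a, Object b) {
        if (a instanceof Long) {
            return ((Long) a).compareTo((Long) b); // fails if b is a Double
        }
        return ((Double) a).compareTo((Double) b); // fails if b is a Long
    }

    // Hypothetical type-agnostic comparison: widen both values to double first.
    static int compareAsDouble(Object a, Object b) {
        return Double.compare(((Number) a).doubleValue(), ((Number) b).doubleValue());
    }

    public static void main(String[] args) {
        Object sim1 = normalize(new BigDecimal("1.0"));        // assumed value for doc 1
        Object sim2 = normalize(new BigDecimal("0.88938313")); // assumed value for doc 2

        System.out.println(sim1.getClass().getSimpleName() + " " + sim1);
        System.out.println(sim2.getClass().getSimpleName() + " " + sim2);

        try {
            compareAssumingSameType(sim1, sim2);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: " + e.getMessage());
        }

        // Positive result: sim1 sorts after sim2 in ascending order.
        System.out.println(compareAsDouble(sim1, sim2));
    }
}
```

In this sketch the two normalized values end up with different boxed types, so any comparison that casts one to the other's type throws, while widening through Number.doubleValue() keeps them comparable.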
--
This message was sent by Atlassian Jira
(v8.3.4#803005)