Posted to issues@lucene.apache.org by "Trey Grainger (Jira)" <ji...@apache.org> on 2019/10/08 19:47:00 UTC

[jira] [Created] (SOLR-13829) RecursiveEvaluator casts Continuous numbers to Discrete Numbers, causing mismatch

Trey Grainger created SOLR-13829:
------------------------------------

             Summary: RecursiveEvaluator casts Continuous numbers to Discrete Numbers, causing mismatch
                 Key: SOLR-13829
                 URL: https://issues.apache.org/jira/browse/SOLR-13829
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Trey Grainger


When using the "sort" streaming expression on values computed from a float field (pfloat), I get casting errors. Whether the error occurs depends on which values are calculated from the underlying values in the field.

Example:

*Docs:* (paste each into "Documents" pane in Solr Admin UI as type:"json")

 
{code:java}
{"id": "1", "name":"donut","vector_fs":[5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0]}

{"id": "2", "name":"cheese pizza","vector_fs":[5.0,0.0,4.0,4.0,0.0,1.0,5.0,2.0]}{code}
 

*Streaming Expression:*

 
{code:java}
sort(select(search(food_collection, q="*:*", fl="id,vector_fs", sort="id asc"), cosineSimilarity(vector_fs, array(5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0)) as sim, id), by="sim desc"){code}
 

*Response:*

 
{code:java}
{ 
  "result-set": {
    "docs": [
      {
        "EXCEPTION": "class java.lang.Double cannot be cast to class java.lang.Long (java.lang.Double and java.lang.Long are in module java.base of loader 'bootstrap')",
        "EOF": true,
        "RESPONSE_TIME": 13
      }
    ]
  }
}{code}
 

 

This is because in org.apache.solr.client.solrj.io.eval.RecursiveEvaluator, there is a line which examines a numeric (BigDecimal) value and - regardless of the type of the field the value originated from - converts it to a Long if it looks like a whole number. This is the code in question from that class:
{code:java}
protected Object normalizeOutputType(Object value) {
    if(null == value){
      return null;
    } else if (value instanceof VectorFunction) {
      return value;
    } else if(value instanceof BigDecimal){
      BigDecimal bd = (BigDecimal)value;
      if(bd.signum() == 0 || bd.scale() <= 0 || bd.stripTrailingZeros().scale() <= 0){
        try{
          return bd.longValueExact();
        }
        catch(ArithmeticException e){
          // value was too big for a long, so use a double which can handle scientific notation
        }
      }
      
      return bd.doubleValue();
    }
... [other type conversions]
{code}
Because of the *return bd.longValueExact();* line, the calculated value for "sim" in doc 1 is "Long(1)", whereas the calculated value for "sim" in doc 2 is "Double(0.88938313)". These come back as incompatible data types, even though the source data is all of the same type and should be comparable.

Thus, when the *sort* streaming expression (and probably others) runs on these calculated values, the list should contain [0.88938313, 1.0], but an exception is thrown because it is trying to compare incompatible data types [Double(0.88938313), Long(1)].
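
The mismatch can be reproduced outside Solr with a minimal standalone sketch. It assumes the comparison behaves like a plain Comparable.compareTo call; the actual comparison code used by the sort expression is not shown in this issue. The class name NormalizeDemo and the helpers normalize and sortFails are hypothetical, and normalize copies only the BigDecimal branch quoted above.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of Solr.
class NormalizeDemo {

    // Copy of the BigDecimal branch of normalizeOutputType quoted above.
    static Object normalize(BigDecimal bd) {
        if (bd.signum() == 0 || bd.scale() <= 0 || bd.stripTrailingZeros().scale() <= 0) {
            try {
                return bd.longValueExact(); // whole numbers become Long
            } catch (ArithmeticException e) {
                // value was too big for a long, fall through to double
            }
        }
        return bd.doubleValue(); // everything else becomes Double
    }

    // Assumption: sorting compares elements through raw Comparable.compareTo.
    // Returns true if sorting the values throws a ClassCastException.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean sortFails(List<Object> values) {
        try {
            List<Object> copy = new ArrayList<>(values);
            copy.sort((a, b) -> ((Comparable) a).compareTo(b));
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object sim1 = normalize(new BigDecimal("1.0"));        // doc 1
        Object sim2 = normalize(new BigDecimal("0.88938313")); // doc 2

        System.out.println(sim1.getClass().getSimpleName()); // Long
        System.out.println(sim2.getClass().getSimpleName()); // Double
        System.out.println(sortFails(Arrays.<Object>asList(sim2, sim1))); // true
    }
}
```

In this sketch both inputs start as BigDecimal values of the same kind, yet only the one that happens to be a whole number is converted to Long, which is enough to make a type-strict comparison fail.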

This bug is occurring on master currently, but has probably existed in the codebase since at least August 2017.


