Posted to issues@lucene.apache.org by "Trey Grainger (Jira)" <ji...@apache.org> on 2019/10/08 19:47:00 UTC
[jira] [Created] (SOLR-13829) RecursiveEvaluator casts Continuous
numbers to Discrete Numbers, causing mismatch
Trey Grainger created SOLR-13829:
------------------------------------
Summary: RecursiveEvaluator casts Continuous numbers to Discrete Numbers, causing mismatch
Key: SOLR-13829
URL: https://issues.apache.org/jira/browse/SOLR-13829
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Reporter: Trey Grainger
When using the "sort" streaming evaluator on a float field (pfloat), I get casting errors back, and whether they occur depends on which values are calculated from the underlying values in the field.
Example:
*Docs:* (paste each into "Documents" pane in Solr Admin UI as type:"json")
{code:java}
{"id": "1", "name":"donut","vector_fs":[5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0]}
{"id": "2", "name":"cheese pizza","vector_fs":[5.0,0.0,4.0,4.0,0.0,1.0,5.0,2.0]}{code}
*Streaming Expression:*
{code:java}
sort(select(search(food_collection, q="*:*", fl="id,vector_fs", sort="id asc"), cosineSimilarity(vector_fs, array(5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0)) as sim, id), by="sim desc"){code}
*Response:*
{code:java}
{
  "result-set": {
    "docs": [
      {
        "EXCEPTION": "class java.lang.Double cannot be cast to class java.lang.Long (java.lang.Double and java.lang.Long are in module java.base of loader 'bootstrap')",
        "EOF": true,
        "RESPONSE_TIME": 13
      }
    ]
  }
}{code}
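For reference, the two similarity values the expression would be expected to produce can be checked by hand. The following is only a sketch: the class and method names (CosineCheck, cosine) are made up for illustration, and it uses the textbook cosine formula rather than whatever Solr's cosineSimilarity evaluator does internally.

```java
public class CosineCheck {
    // Textbook cosine similarity: dot(a, b) / sqrt(|a|^2 * |b|^2).
    // Hypothetical helper for illustration; not Solr's implementation.
    public static double cosine(double[] a, double[] b) {
        double dot = 0.0;
        double normA2 = 0.0;
        double normB2 = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA2 += a[i] * a[i];
            normB2 += b[i] * b[i];
        }
        return dot / Math.sqrt(normA2 * normB2);
    }

    public static void main(String[] args) {
        double[] query = {5.0, 0.0, 1.0, 5.0, 0.0, 4.0, 5.0, 1.0};
        double[] doc1  = {5.0, 0.0, 1.0, 5.0, 0.0, 4.0, 5.0, 1.0}; // "donut"
        double[] doc2  = {5.0, 0.0, 4.0, 4.0, 0.0, 1.0, 5.0, 2.0}; // "cheese pizza"

        // doc 1 is identical to the query vector, so its similarity is a whole number.
        System.out.println("doc 1: " + cosine(doc1, query));
        // doc 2 differs from the query, so its similarity is fractional (80 / sqrt(93 * 87)).
        System.out.println("doc 2: " + cosine(doc2, query));
    }
}
```

Doc 1 matches the query vector exactly and so yields a whole-number similarity, while doc 2 yields a fractional one.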
This is because in org.apache.solr.client.solrj.io.eval.RecursiveEvaluator, there is a line which examines a numeric (BigDecimal) value and - regardless of the type of the field the value originated from - converts it to a Long if it looks like a whole number. This is the code in question from that class:
{code:java}
protected Object normalizeOutputType(Object value) {
  if(null == value){
    return null;
  } else if (value instanceof VectorFunction) {
    return value;
  } else if(value instanceof BigDecimal){
    BigDecimal bd = (BigDecimal)value;
    if(bd.signum() == 0 || bd.scale() <= 0 || bd.stripTrailingZeros().scale() <= 0){
      try{
        return bd.longValueExact();
      }
      catch(ArithmeticException e){
        // value was too big for a long, so use a double which can handle scientific notation
      }
    }
    return bd.doubleValue();
  }
  ... [other type conversions]
{code}
Because of the *return bd.longValueExact();* line, the calculated value for "sim" in doc 1 is "Long(1)", whereas the calculated value for "sim" in doc 2 is "Double(0.88938313)". These come back as incompatible data types, even though the source data is all of the same type and should be comparable.
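The type split can be reproduced in isolation. The sketch below copies only the BigDecimal branch quoted above into a standalone class; the class and method names (NormalizeDemo, normalizeBigDecimal) are hypothetical, and the rest of the Solr method is left out.

```java
import java.math.BigDecimal;

public class NormalizeDemo {
    // Standalone copy of the BigDecimal branch only; hypothetical wrapper, not the full Solr method.
    public static Object normalizeBigDecimal(BigDecimal bd) {
        if (bd.signum() == 0 || bd.scale() <= 0 || bd.stripTrailingZeros().scale() <= 0) {
            try {
                return bd.longValueExact(); // whole numbers are boxed as Long
            } catch (ArithmeticException e) {
                // value was too big for a long; fall through to the double conversion
            }
        }
        return bd.doubleValue(); // everything else is boxed as Double
    }

    public static void main(String[] args) {
        Object whole = normalizeBigDecimal(new BigDecimal("1.0"));
        Object fractional = normalizeBigDecimal(new BigDecimal("0.88938313"));

        // Two values that started out as the same kind of number end up as different classes.
        System.out.println(whole.getClass().getName());
        System.out.println(fractional.getClass().getName());
    }
}
```

Here the whole-number input comes back as a java.lang.Long and the fractional one as a java.lang.Double, regardless of the type of field they were computed from.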
Thus, when the *sort* streaming evaluator (and probably others) runs on these calculated values, the list should contain ["0.88938313", "1.0"], but instead an exception is thrown because it is trying to compare incompatible data types [Double(0.88938313), Long(1)].
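The same failure mode can be shown in plain Java. This sketch does not use Solr's actual sort code (which is not shown here); it only assumes that the values are compared through their natural ordering (Comparable), and the class and method names (MixedSortDemo, sortFails) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class MixedSortDemo {
    // Returns true if sorting by natural ordering throws a ClassCastException.
    // Hypothetical helper; it stands in for any comparison that relies on Comparable.
    @SuppressWarnings({"unchecked", "rawtypes"})
    public static boolean sortFails(List<Object> values) {
        try {
            Collections.sort((List) values); // raw call, so compareTo is invoked across classes
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Mixed types, as produced by the normalization: Double for doc 2, Long for doc 1.
        List<Object> mixed = new ArrayList<>(Arrays.<Object>asList(0.88938313, 1L));
        // Same values, but both kept as Double.
        List<Object> uniform = new ArrayList<>(Arrays.<Object>asList(0.88938313, 1.0));

        System.out.println("mixed list fails to sort: " + sortFails(mixed));
        System.out.println("uniform list fails to sort: " + sortFails(uniform));
    }
}
```

A list holding a Double and a Long cannot be ordered this way, while the same values kept as Doubles sort without error.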
This bug is occurring on master currently, but has probably existed in the codebase since at least August 2017.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)