You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "abhioncbr (via GitHub)" <gi...@apache.org> on 2023/04/25 13:37:09 UTC

[GitHub] [pinot] abhioncbr opened a new issue, #10686: Incorrect float/double column comparison.

abhioncbr opened a new issue, #10686:
URL: https://github.com/apache/pinot/issues/10686

   We are ignoring a couple of [numeric comparison queries](https://github.com/apache/pinot/blob/master/pinot-query-runtime/src/test/resources/queries/Comparisons.json#L346) between `float` and `double` based columns, for example
   
   ```
   SELECT i1 != i2, i2 != i1 FROM {tbl}
   ```
   
   It's failing because of the `cast` function from `double` to `float`, hence losing precision and resulting incorrect output. The Apache calcite adds the `cast` transformation function during query parsing. This issue tracks the implementation for correctly handling float/double comparison. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] abhioncbr commented on issue #10686: Incorrect float/double column comparison.

Posted by "abhioncbr (via GitHub)" <gi...@apache.org>.
abhioncbr commented on issue #10686:
URL: https://github.com/apache/pinot/issues/10686#issuecomment-1524278410

   Yes. Apache Calcite supports [implicit type conversion](https://calcite.apache.org/docs/reference.html#implicit-type-conversion) between float and double and hence depends on the left operand data type; the right operand is getting cast. 
   
   So if the LHS is `float` and RHS is `double`; the `cast` is happening from `double -> float`, hence losing precision and resulting in the wrong result.
   <img width="869" alt="Screen Shot 2023-04-26 at 8 20 24 PM" src="https://user-images.githubusercontent.com/1076944/234729152-a10c523e-235d-4dab-9250-a105b4b220d0.png">
   
   
   My initial solution approach is to remove/skip this casting function in case of comparison of float/double. What do you think about it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on issue #10686: Incorrect float/double column comparison.

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on issue #10686:
URL: https://github.com/apache/pinot/issues/10686#issuecomment-1524115541

   Also related to #10637 cc @elonazoulay 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on issue #10686: Incorrect float/double column comparison.

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on issue #10686:
URL: https://github.com/apache/pinot/issues/10686#issuecomment-1590189525

   > We can close this one. Thanks
   
   FYI if you include `this fixes #xxxx` in your PR description it will automatically close the issue when PR merges


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] abhioncbr commented on issue #10686: Incorrect float/double column comparison.

Posted by "abhioncbr (via GitHub)" <gi...@apache.org>.
abhioncbr commented on issue #10686:
URL: https://github.com/apache/pinot/issues/10686#issuecomment-1589880858

   We can close this one. Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] abhioncbr closed issue #10686: Incorrect float/double column comparison.

Posted by "abhioncbr (via GitHub)" <gi...@apache.org>.
abhioncbr closed issue #10686: Incorrect float/double column comparison.
URL: https://github.com/apache/pinot/issues/10686


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] abhioncbr commented on issue #10686: Incorrect float/double column comparison.

Posted by "abhioncbr (via GitHub)" <gi...@apache.org>.
abhioncbr commented on issue #10686:
URL: https://github.com/apache/pinot/issues/10686#issuecomment-1526229372

   > Losing precision might not be the root cause though. Even if we do float to double casting, the number won't match.
   > 
   > ```
   > System.out.println((double) 1.1f); --> 1.100000023841858
   > ```
   > 
   > Which query is giving different result comparing to H2?
   
   Well, you are following the wrong order of casting. The problematic comparison involves casting from `double -> float`.
   ```
   System.out.println((double) 1.1f); --> 1.100000023841858 (correct)
   System.out.println((float) 1.1d);  --> 1.1 (incorrect)
   
   // incorrect float/double comparison because of casting
   float f = 1.1f;
   double d = 1.1d;
   System.out.println(Float.compare(f, (float)d)); --> 0 (Such comparison in Pinot resulting incorrect output)
   ```
   
   We are ignoring the [following queries in our test cases](https://github.com/apache/pinot/blob/master/pinot-query-runtime/src/test/resources/queries/Comparisons.json#L346)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on issue #10686: Incorrect float/double column comparison.

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on issue #10686:
URL: https://github.com/apache/pinot/issues/10686#issuecomment-1524227971

   So seems all the DBs are doing exact match on `=` and `!=`.
   
   Do you see the cast from double to float getting added during the query parsing? Why would it cast value from double to float, but not hoisting up to double?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] abhioncbr commented on issue #10686: Incorrect float/double column comparison.

Posted by "abhioncbr (via GitHub)" <gi...@apache.org>.
abhioncbr commented on issue #10686:
URL: https://github.com/apache/pinot/issues/10686#issuecomment-1524201991

   I did a bit of analysis while going over the problem itself.
   
   1. H2 database follows the exact comparison between double and float columns. Here are the results of queries in the H2 database
   ```
   $ CREATE TABLE float_x_double_comp_tbl (i1 double,i2 real);
   $ INSERT INTO float_x_double_comp_tbl VALUES(1.1,1.1);
   $ INSERT INTO float_x_double_comp_tbl VALUES(1.5,1.5);
   
   $ SELECT i1, i2, i1!=i2, i2!=i1 FROM FLOAT_X_DOUBLE_COMP_TBL
   
   
   I1  | I2  | I1 <> I2 | I2 <> I1
   ----|-----|----------|---------
   1.1 | 1.1 | TRUE     | TRUE
   1.5 | 1.5 | FALSE    | FALSE
   
   ```
   Actually, initially I thought it's a bug in H2 also and created an [issue](https://github.com/h2database/h2database/issues/3784) for this. Got nice explanation by the developer. 
   
   2. I have also checked the Postgress, and the results are the same as of the H2 database.  
   <img width="1634" alt="Screen Shot 2023-04-26 at 7 33 02 PM" src="https://user-images.githubusercontent.com/1076944/234724368-23df152d-cc52-4710-971e-c37314faa017.png">
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on issue #10686: Incorrect float/double column comparison.

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on issue #10686:
URL: https://github.com/apache/pinot/issues/10686#issuecomment-1526601364

   I see, so we actually expect it to cast from float to double and since we are converting from low precision to high precision it will return not equals.
   I feel the difference is from the behavior of always casting the RHS value vs hoisting up to the same type. Can you also try the comparison for other data types (int, long, float, double, big_decimal) and see what is the expected behavior from H2 and Postgres?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] abhioncbr commented on issue #10686: Incorrect float/double column comparison.

Posted by "abhioncbr (via GitHub)" <gi...@apache.org>.
abhioncbr commented on issue #10686:
URL: https://github.com/apache/pinot/issues/10686#issuecomment-1534014194

   Comparison results on H2 DB 
   ```
   $ CREATE TABLE Numeric_tbl (int_col INTEGER, long_col BIGINT, double_col Double, float_col REAL, bigDecimal_col DECFLOAT);
   $ INSERT INTO Numeric_tbl VALUES(1,1,1.1,1.1,1.1);
   $ INSERT INTO Numeric_tbl VALUES(1,1,1,1,1);
   
   $ select int_col = long_col, long_col =int_col from Numeric_tbl;
   INT_COL = LONG_COL | LONG_COL = INT_COL
   ------------------ | -------------------
   TRUE               | TRUE
   TRUE               | TRUE
   
   $ select int_col = bigDecimal_col, bigDecimal_col = int_col,long_col = bigDecimal_col, bigDecimal_col = long_col from Numeric_tbl;
   INT_COL = BIGDECIMAL_COL | BIGDECIMAL_COL = INT_COL | LONG_COL = BIGDECIMAL_COL | BIGDECIMAL_COL = LONG_COL
   ------------------------ | ------------------------ | ------------------------- | ------------------------ 
   FALSE                    | FALSE                    | FALSE                     | FALSE
   TRUE                     | TRUE                     | TRUE                      | TRUE
   
   $ select float_col = double_col, double_col = float_col from Numeric_tbl;
   FLOAT_COL = DOUBLE_COL | DOUBLE_COL = FLOAT_COL
   ---------------------- | ------------------------
   FALSE                  | FALSE
   TRUE                   | TRUE
   
   $ select float_col =  bigDecimal_col, bigDecimal_col = float_col,  double_col =  bigDecimal_col, bigDecimal_col = double_col from Numeric_tbl;
   FLOAT_COL = BIGDECIMAL_COL | BIGDECIMAL_COL = FLOAT_COL | DOUBLE_COL = BIGDECIMAL_COL | BIGDECIMAL_COL = DOUBLE_COL
   -------------------------- | -------------------------- | --------------------------- | ---------------------------
   TRUE                       | TRUE                       | TRUE                        | TRUE
   TRUE                       | TRUE                       | TRUE                        | TRUE
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on issue #10686: Incorrect float/double column comparison.

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on issue #10686:
URL: https://github.com/apache/pinot/issues/10686#issuecomment-1560231048

   Thanks for taking time testing these comparisons, very interesting results! So Postgres and H2 have different behavior on float - big-decimal conversion, but double - big-decimal conversion is lossless in both DBs.
   For the float - double conversion issue, I feel we should always hoist up to the higher precision type, which aligns with the H2 and Postgres behavior.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] abhioncbr commented on issue #10686: Incorrect float/double column comparison.

Posted by "abhioncbr (via GitHub)" <gi...@apache.org>.
abhioncbr commented on issue #10686:
URL: https://github.com/apache/pinot/issues/10686#issuecomment-1521806262

   I am planning to work on this. Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on issue #10686: Incorrect float/double column comparison.

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on issue #10686:
URL: https://github.com/apache/pinot/issues/10686#issuecomment-1524114507

   We should discuss if we want to do exact comparison between floating point numbers or fuzzy comparison. I'd suggest doing some research to see what is the behavior of the commonly used DBs.
   cc @richardstartin I remember you have some suggestions to this topic
   cc @xiangfu0 @siddharthteotia @yupeng9 @walterddr @61yao @atris for some input


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on issue #10686: Incorrect float/double column comparison.

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on issue #10686:
URL: https://github.com/apache/pinot/issues/10686#issuecomment-1526115660

   Losing precision might not be the root cause though. Even if we do float to double casting, the number won't match.
   ```
   System.out.println((double) 1.1f); --> 1.100000023841858
   ```
   
   Which query is giving different result comparing to H2?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] abhioncbr commented on issue #10686: Incorrect float/double column comparison.

Posted by "abhioncbr (via GitHub)" <gi...@apache.org>.
abhioncbr commented on issue #10686:
URL: https://github.com/apache/pinot/issues/10686#issuecomment-1562143499

   
   
   
   > Thanks for taking time testing these comparisons, very interesting results! So Postgres and H2 have different behavior on float - big-decimal conversion, but double - big-decimal conversion is lossless in both DBs. For the float - double conversion issue, I feel we should always hoist up to the higher precision type, which aligns with the H2 and Postgres behavior.
   
   +1; I'll try to implement this. Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org