You are viewing a plain text version of this content. The canonical link for it is here.

Posted to github@arrow.apache.org by "jonahgao (via GitHub)" <gi...@apache.org> on 2023/09/09 17:30:14 UTC

[GitHub] [arrow-datafusion] jonahgao opened a new pull request, #7515: Fix some simplification rules for floating-point arithmetic operations

jonahgao opened a new pull request, #7515:
URL: https://github.com/apache/arrow-datafusion/pull/7515

   ## Which issue does this PR close?
   
   N/A
   
   ## Rationale for this change
   Similar to #7503. 
   
   Because of the presence of `NaN`, some simplification rules will no longer be applicable to floating-point types.
   
   
   ## What changes are included in this PR?
   
   <!--
   There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR.
   -->
   
   ## Are these changes tested?
   Yes
   
   ## Are there any user-facing changes?
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #7515: Fix some simplification rules for floating-point arithmetic operations

Posted by "alamb (via GitHub)" <gi...@apache.org>.

alamb commented on code in PR #7515:
URL: https://github.com/apache/arrow-datafusion/pull/7515#discussion_r1321893464


##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -734,19 +744,33 @@ impl<'a, S: SimplifyInfo> TreeNodeRewriter for Simplifier<'a, S> {
                 op: Modulo,
                 right: _,
             }) if is_null(&left) => *left,
-            // A % 1 --> 0
+            // A % 1 --> 0 (if A is not nullable and not floating, since NAN % 1 --> NAN)
             Expr::BinaryExpr(BinaryExpr {
                 left,
                 op: Modulo,
                 right,
-            }) if !info.nullable(&left)? && is_one(&right) => lit(0),
-            // A % 0 --> DivideByZero Error
+            }) if !info.nullable(&left)?
+                && !info.get_data_type(&left)?.is_floating()
+                && is_one(&right) =>
+            {
+                lit(0)
+            }
+            // A % 0 --> DivideByZero Error (if A is not floating and not null)
+            // A % 0 --> NAN (if A is floating and not null)
             Expr::BinaryExpr(BinaryExpr {
                 left,
                 op: Modulo,
                 right,
             }) if !info.nullable(&left)? && is_zero(&right) => {
-                return Err(DataFusionError::ArrowError(ArrowError::DivideByZero));
+                match info.get_data_type(&left)? {

Review Comment:
   I don't really understand the rationale for `float % float` --> `NaN` (rather than error)
   
   
   But on the other hand, postgres doesn't seem to support `%` on floating point values:
   
   ```
   select 1.0::float % 0::float;
   
   operator does not exist: double precision % double precision LINE 1: select 1.0::float % 0::float; ^ HINT: No operator matches the given name and argument types. You might need to add explicit type casts.
   
   ```
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow-datafusion] alamb commented on pull request #7515: Fix some simplification rules for floating-point arithmetic operations

Posted by "alamb (via GitHub)" <gi...@apache.org>.

alamb commented on PR #7515:
URL: https://github.com/apache/arrow-datafusion/pull/7515#issuecomment-1715771591

   Thanks for the clarification @jonahgao 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow-datafusion] jonahgao commented on a diff in pull request #7515: Fix some simplification rules for floating-point arithmetic operations

Posted by "jonahgao (via GitHub)" <gi...@apache.org>.

jonahgao commented on code in PR #7515:
URL: https://github.com/apache/arrow-datafusion/pull/7515#discussion_r1322313872


##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -734,19 +744,33 @@ impl<'a, S: SimplifyInfo> TreeNodeRewriter for Simplifier<'a, S> {
                 op: Modulo,
                 right: _,
             }) if is_null(&left) => *left,
-            // A % 1 --> 0
+            // A % 1 --> 0 (if A is not nullable and not floating, since NAN % 1 --> NAN)
             Expr::BinaryExpr(BinaryExpr {
                 left,
                 op: Modulo,
                 right,
-            }) if !info.nullable(&left)? && is_one(&right) => lit(0),
-            // A % 0 --> DivideByZero Error
+            }) if !info.nullable(&left)?
+                && !info.get_data_type(&left)?.is_floating()
+                && is_one(&right) =>
+            {
+                lit(0)
+            }
+            // A % 0 --> DivideByZero Error (if A is not floating and not null)
+            // A % 0 --> NAN (if A is floating and not null)
             Expr::BinaryExpr(BinaryExpr {
                 left,
                 op: Modulo,
                 right,
             }) if !info.nullable(&left)? && is_zero(&right) => {
-                return Err(DataFusionError::ArrowError(ArrowError::DivideByZero));
+                match info.get_data_type(&left)? {

Review Comment:
   @alamb 
   DataFusion utilizes the `rem()` function from the arrow-rs to perform the `Modulo` operation.
   
   The modification here ensures that it behaves the same as the `rem()` function in arrow-rs, i.e., `float % 0.` --> `NAN`.
   https://github.com/apache/arrow-rs/blob/77455d48cd6609045a4728ba908123de9d0b62fd/arrow-arith/src/numeric.rs#L71-L77
   
   And in the IEEE 754-2008 standard：
   >7.2 Invalid operation 7.2.0
   For operations producing results in floating-point format, the default result of an operation that signals the
   invalid operation exception shall be a quiet NaN...
   These operations are:
   ...
   f) remainder: remainder(x, y), when y is zero or x is infinite...
   ...
   
   Ref: https://en.wikipedia.org/wiki/NaN#Operations_generating_NaN



##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -734,19 +744,33 @@ impl<'a, S: SimplifyInfo> TreeNodeRewriter for Simplifier<'a, S> {
                 op: Modulo,
                 right: _,
             }) if is_null(&left) => *left,
-            // A % 1 --> 0
+            // A % 1 --> 0 (if A is not nullable and not floating, since NAN % 1 --> NAN)
             Expr::BinaryExpr(BinaryExpr {
                 left,
                 op: Modulo,
                 right,
-            }) if !info.nullable(&left)? && is_one(&right) => lit(0),
-            // A % 0 --> DivideByZero Error
+            }) if !info.nullable(&left)?
+                && !info.get_data_type(&left)?.is_floating()
+                && is_one(&right) =>
+            {
+                lit(0)
+            }
+            // A % 0 --> DivideByZero Error (if A is not floating and not null)
+            // A % 0 --> NAN (if A is floating and not null)
             Expr::BinaryExpr(BinaryExpr {
                 left,
                 op: Modulo,
                 right,
             }) if !info.nullable(&left)? && is_zero(&right) => {
-                return Err(DataFusionError::ArrowError(ArrowError::DivideByZero));
+                match info.get_data_type(&left)? {

Review Comment:
   @alamb 
   DataFusion utilizes the `rem()` function from the arrow-rs to perform the `Modulo` operation.
   
   The modification here ensures that it behaves the same as the `rem()` function in arrow-rs, i.e., `float % 0.` --> `NAN`.
   https://github.com/apache/arrow-rs/blob/77455d48cd6609045a4728ba908123de9d0b62fd/arrow-arith/src/numeric.rs#L71-L77
   
   And in the IEEE 754-2008 standard：
   >7.2 Invalid operation 7.2.0
   For operations producing results in floating-point format, the default result of an operation that signals the
   invalid operation exception shall be a quiet NaN...
   These operations are:
   ...
   f) remainder: remainder(x, y), when y is zero or x is infinite...
   ...
   
   Ref: https://en.wikipedia.org/wiki/NaN#Operations_generating_NaN



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow-datafusion] alamb merged pull request #7515: Fix some simplification rules for floating-point arithmetic operations

Posted by "alamb (via GitHub)" <gi...@apache.org>.

alamb merged PR #7515:
URL: https://github.com/apache/arrow-datafusion/pull/7515


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org