You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "izveigor (via GitHub)" <gi...@apache.org> on 2023/02/27 18:08:43 UTC

[GitHub] [arrow-datafusion] izveigor opened a new pull request, #5423: feat: add optimization rules for bitwise operations

izveigor opened a new pull request, #5423:
URL: https://github.com/apache/arrow-datafusion/pull/5423

   Closes #5422


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5423: feat: add optimization rules for bitwise operations

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5423:
URL: https://github.com/apache/arrow-datafusion/pull/5423#discussion_r1128500389


##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -1346,6 +1643,522 @@ mod tests {
         assert_eq!(simplify(expr), expected);
     }
 
+    #[test]
+    fn test_simplify_bitwise_xor_by_null() {
+        let null = Expr::Literal(ScalarValue::Null);
+        // A ^ null --> null
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseXor, null.clone());

Review Comment:
   fair enough



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] izveigor commented on a diff in pull request #5423: feat: add optimization rules for bitwise operations

Posted by "izveigor (via GitHub)" <gi...@apache.org>.
izveigor commented on code in PR #5423:
URL: https://github.com/apache/arrow-datafusion/pull/5423#discussion_r1125499059


##########
datafusion/optimizer/src/simplify_expressions/utils.rs:
##########
@@ -154,11 +210,31 @@ pub fn is_op_with(target_op: Operator, haystack: &Expr, needle: &Expr) -> bool {
     matches!(haystack, Expr::BinaryExpr(BinaryExpr { left, op, right }) if op == &target_op && (needle == left.as_ref() || needle == right.as_ref()))
 }
 
-/// returns true if `not_expr` is !`expr`
+/// returns true if `not_expr` is !`expr` (not)
 pub fn is_not_of(not_expr: &Expr, expr: &Expr) -> bool {
     matches!(not_expr, Expr::Not(inner) if expr == inner.as_ref())
 }
 
+/// returns true if `not_expr` is !`expr` (bitwise not)
+pub fn is_negative_of(not_expr: &Expr, expr: &Expr) -> bool {
+    matches!(not_expr, Expr::Negative(inner) if expr == inner.as_ref())
+}
+
+/// Returns a new int type based on data type of an expression
+pub fn new_int_by_expr_data_type(val: i64, expr_data_type: &DataType) -> Expr {

Review Comment:
   The behavior of the function is affected by the following PR: https://github.com/apache/arrow-datafusion/pull/5476



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5423: feat: add optimization rules for bitwise operations

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5423:
URL: https://github.com/apache/arrow-datafusion/pull/5423#discussion_r1128528759


##########
datafusion/core/tests/simplification.rs:
##########
@@ -49,6 +49,13 @@ impl SimplifyInfo for MyInfo {
     fn execution_props(&self) -> &ExecutionProps {
         &self.execution_props
     }
+
+    fn get_data_type(&self, expr: &Expr) -> Result<DataType> {
+        match expr.get_type(&self.schema) {

Review Comment:
   I think this can be expressed slightly more concisely -- see https://github.com/apache/arrow-datafusion/pull/5510



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] izveigor commented on a diff in pull request #5423: feat: add optimization rules for bitwise operations

Posted by "izveigor (via GitHub)" <gi...@apache.org>.
izveigor commented on code in PR #5423:
URL: https://github.com/apache/arrow-datafusion/pull/5423#discussion_r1128533485


##########
datafusion/core/tests/simplification.rs:
##########
@@ -49,6 +49,13 @@ impl SimplifyInfo for MyInfo {
     fn execution_props(&self) -> &ExecutionProps {
         &self.execution_props
     }
+
+    fn get_data_type(&self, expr: &Expr) -> Result<DataType> {
+        match expr.get_type(&self.schema) {

Review Comment:
   Agree, this code is redundant.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on pull request #5423: feat: add optimization rules for bitwise operations

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #5423:
URL: https://github.com/apache/arrow-datafusion/pull/5423#issuecomment-1454193467

   I plan to review this PR tomorrow


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5423: feat: add optimization rules for bitwise operations

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5423:
URL: https://github.com/apache/arrow-datafusion/pull/5423#discussion_r1128588329


##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -1346,6 +1643,522 @@ mod tests {
         assert_eq!(simplify(expr), expected);
     }
 
+    #[test]
+    fn test_simplify_bitwise_xor_by_null() {
+        let null = Expr::Literal(ScalarValue::Null);
+        // A ^ null --> null
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseXor, null.clone());

Review Comment:
   Here is a PR to refactor most of the tests in this module to use the less verbose style. Could you please review it when you have time? https://github.com/apache/arrow-datafusion/pull/5511 Thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5423: feat: add optimization rules for bitwise operations

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5423:
URL: https://github.com/apache/arrow-datafusion/pull/5423#discussion_r1120767458


##########
datafusion/expr/src/expr.rs:
##########
@@ -639,6 +639,31 @@ impl Expr {
         binary_expr(self, Operator::Or, other)
     }
 
+    /// Return `self & other`

Review Comment:
   👍 



##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -700,6 +701,288 @@ impl<'a, S: SimplifyInfo> ExprRewriter for Simplifier<'a, S> {
                 return Err(DataFusionError::ArrowError(ArrowError::DivideByZero));
             }
 
+            //
+            // Rules for BitwiseAnd
+            //
+
+            // A & null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseAnd,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null & A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A & 0 -> 0
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseAnd,
+                right,
+            }) if is_zero(&right) => *right,

Review Comment:
   I think this rule also has to check that `A` is non nullable as `null & 0` is `null` (not `0`, which is what this code does)
   
   ```sql
   postgres=# select NULL & 0 ;
    ?column?
   ----------
   
   (1 row)
   ```
   
   The same comment applies to `0 & A` as well



##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -1346,6 +1629,361 @@ mod tests {
         assert_eq!(simplify(expr), expected);
     }
 
+    #[test]
+    fn test_simplify_bitwise_xor_by_null() {
+        let null = Expr::Literal(ScalarValue::Null);
+        // A ^ null --> null
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseXor, null.clone());
+            assert_eq!(simplify(expr), null);
+        }
+        // null ^ A --> null
+        {
+            let expr = binary_expr(null.clone(), Operator::BitwiseXor, col("c2"));
+            assert_eq!(simplify(expr), null);
+        }
+    }
+
+    #[test]
+    fn test_simplify_bitwise_shift_right_by_null() {
+        let null = Expr::Literal(ScalarValue::Null);
+        // A >> null --> null
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseShiftRight, null.clone());
+            assert_eq!(simplify(expr), null);
+        }
+        // null >> A --> null
+        {
+            let expr = binary_expr(null.clone(), Operator::BitwiseShiftRight, col("c2"));
+            assert_eq!(simplify(expr), null);
+        }
+    }
+
+    #[test]
+    fn test_simplify_bitwise_shift_left_by_null() {
+        let null = Expr::Literal(ScalarValue::Null);
+        // A << null --> null
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseShiftLeft, null.clone());
+            assert_eq!(simplify(expr), null);
+        }
+        // null << A --> null
+        {
+            let expr = binary_expr(null.clone(), Operator::BitwiseShiftLeft, col("c2"));
+            assert_eq!(simplify(expr), null);
+        }
+    }
+
+    #[test]
+    fn test_simplify_bitwise_and_by_zero() {
+        // A & 0 --> 0
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseAnd, lit(0));
+            assert_eq!(simplify(expr), lit(0));
+        }
+        // 0 & A --> 0
+        {
+            let expr = binary_expr(lit(0), Operator::BitwiseAnd, col("c2"));
+            assert_eq!(simplify(expr), lit(0));
+        }
+    }
+
+    #[test]
+    fn test_simplify_bitwise_or_by_zero() {
+        // A | 0 --> A
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseOr, lit(0));
+            assert_eq!(simplify(expr), col("c2"));
+        }
+        // 0 | A --> A
+        {
+            let expr = binary_expr(lit(0), Operator::BitwiseOr, col("c2"));
+            assert_eq!(simplify(expr), col("c2"));
+        }
+    }
+
+    #[test]
+    fn test_simplify_bitwise_xor_by_zero() {
+        // A ^ 0 --> A
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseXor, lit(0));
+            assert_eq!(simplify(expr), col("c2"));
+        }
+        // 0 ^ A --> A
+        {
+            let expr = binary_expr(lit(0), Operator::BitwiseXor, col("c2"));
+            assert_eq!(simplify(expr), col("c2"));
+        }
+    }
+
+    #[test]
+    fn test_simplify_bitwise_bitwise_shift_right_by_zero() {
+        // A >> 0 --> A
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseShiftRight, lit(0));
+            assert_eq!(simplify(expr), col("c2"));
+        }
+    }
+
+    #[test]
+    fn test_simplify_bitwise_bitwise_shift_left_by_zero() {
+        // A << 0 --> A
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseShiftLeft, lit(0));
+            assert_eq!(simplify(expr), col("c2"));
+        }
+    }
+
+    #[test]
+    fn test_simplify_bitwise_and_by_null() {
+        let null = Expr::Literal(ScalarValue::Null);
+        // A & null --> null
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseAnd, null.clone());
+            assert_eq!(simplify(expr), null);
+        }
+        // null & A --> null
+        {
+            let expr = binary_expr(null.clone(), Operator::BitwiseAnd, col("c2"));
+            assert_eq!(simplify(expr), null);
+        }
+    }
+
+    #[test]
+    fn test_simplify_composed_bitwise_and() {
+        // ((c > 5) & (c1 < 6)) & (c > 5) --> ((c > 5) & c1)
+
+        let expr = binary_expr(
+            binary_expr(
+                col("c2").gt(lit(5)),
+                Operator::BitwiseAnd,
+                col("c1").lt(lit(6)),
+            ),
+            Operator::BitwiseAnd,
+            col("c2").gt(lit(5)),
+        );
+        let expected = binary_expr(
+            col("c2").gt(lit(5)),
+            Operator::BitwiseAnd,
+            col("c1").lt(lit(6)),
+        );
+
+        assert_eq!(simplify(expr), expected);
+    }
+
+    #[test]
+    fn test_simplify_composed_bitwise_or() {
+        // ((c > 5) | (c1 < 6)) | (c > 5) --> ((c > 5) | c1)

Review Comment:
   ```suggestion
           // ((c2 > 5) | (c1 < 6)) | (c2 > 5) --> ((c2 > 5) | c1)
   ```



##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -700,6 +701,288 @@ impl<'a, S: SimplifyInfo> ExprRewriter for Simplifier<'a, S> {
                 return Err(DataFusionError::ArrowError(ArrowError::DivideByZero));
             }
 
+            //
+            // Rules for BitwiseAnd
+            //
+
+            // A & null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseAnd,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null & A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A & 0 -> 0
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseAnd,
+                right,
+            }) if is_zero(&right) => *right,
+
+            // 0 & A -> 0
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right: _,
+            }) if is_zero(&left) => *left,
+
+            // !A & A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if is_negative_of(&left, &right) && !info.nullable(&right)? => {
+                Expr::Literal(ScalarValue::Int32(Some(0)))
+            }
+
+            // A & !A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if is_negative_of(&right, &left) && !info.nullable(&left)? => {
+                Expr::Literal(ScalarValue::Int32(Some(0)))
+            }
+
+            // (..A..) & A --> (..A..)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if expr_contains(&left, &right, BitwiseAnd) => *left,
+
+            // A & (..A..) --> (..A..)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if expr_contains(&right, &left, BitwiseAnd) => *right,
+
+            // A & (A | B) --> A (if B not null)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&right)? && is_op_with(BitwiseOr, &right, &left) => {
+                *left
+            }
+
+            // (A | B) & A --> A (if B not null)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&left)? && is_op_with(BitwiseOr, &left, &right) => {
+                *right
+            }
+
+            //
+            // Rules for BitwiseOr
+            //
+
+            // A | null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseOr,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null | A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A | 0 -> A
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if is_zero(&right) => *left,
+
+            // 0 | A -> A
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if is_zero(&left) => *right,
+
+            // !A | A -> -1 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if is_negative_of(&left, &right) && !info.nullable(&right)? => {
+                Expr::Literal(ScalarValue::Int32(Some(-1)))
+            }
+
+            // A | !A -> -1 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if is_negative_of(&right, &left) && !info.nullable(&left)? => {
+                Expr::Literal(ScalarValue::Int32(Some(-1)))
+            }
+
+            // (..A..) | A --> (..A..)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if expr_contains(&left, &right, BitwiseOr) => *left,
+
+            // A | (..A..) --> (..A..)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if expr_contains(&left, &right, BitwiseOr) => *left,
+
+            // A | (A & B) --> A (if B not null)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if !info.nullable(&right)? && is_op_with(BitwiseAnd, &right, &left) => {
+                *left
+            }
+
+            // (A & B) | A --> A (if B not null)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if !info.nullable(&left)? && is_op_with(BitwiseAnd, &left, &right) => {
+                *right
+            }
+
+            //
+            // Rules for BitwiseXor
+            //
+
+            // A ^ null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseXor,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null ^ A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseXor,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A ^ 0 -> A
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseXor,
+                right,
+            }) if is_zero(&right) => *left,
+
+            // 0 ^ A -> A
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseXor,
+                right,
+            }) if is_zero(&left) => *right,
+
+            // !A ^ A -> -1 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseXor,
+                right,
+            }) if is_negative_of(&left, &right) && !info.nullable(&right)? => {
+                Expr::Literal(ScalarValue::Int32(Some(-1)))
+            }
+
+            // A ^ !A -> -1 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseXor,
+                right,
+            }) if is_negative_of(&right, &left) && !info.nullable(&left)? => {
+                Expr::Literal(ScalarValue::Int32(Some(-1)))

Review Comment:
   needs the right `ScalarValue` variant



##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -700,6 +701,288 @@ impl<'a, S: SimplifyInfo> ExprRewriter for Simplifier<'a, S> {
                 return Err(DataFusionError::ArrowError(ArrowError::DivideByZero));
             }
 
+            //
+            // Rules for BitwiseAnd
+            //
+
+            // A & null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseAnd,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null & A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A & 0 -> 0
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseAnd,
+                right,
+            }) if is_zero(&right) => *right,
+
+            // 0 & A -> 0
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right: _,
+            }) if is_zero(&left) => *left,
+
+            // !A & A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if is_negative_of(&left, &right) && !info.nullable(&right)? => {
+                Expr::Literal(ScalarValue::Int32(Some(0)))
+            }
+
+            // A & !A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if is_negative_of(&right, &left) && !info.nullable(&left)? => {
+                Expr::Literal(ScalarValue::Int32(Some(0)))
+            }
+
+            // (..A..) & A --> (..A..)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if expr_contains(&left, &right, BitwiseAnd) => *left,
+
+            // A & (..A..) --> (..A..)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if expr_contains(&right, &left, BitwiseAnd) => *right,
+
+            // A & (A | B) --> A (if B not null)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&right)? && is_op_with(BitwiseOr, &right, &left) => {
+                *left
+            }
+
+            // (A | B) & A --> A (if B not null)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&left)? && is_op_with(BitwiseOr, &left, &right) => {
+                *right
+            }
+
+            //
+            // Rules for BitwiseOr
+            //
+
+            // A | null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseOr,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null | A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A | 0 -> A
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if is_zero(&right) => *left,
+
+            // 0 | A -> A
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if is_zero(&left) => *right,
+
+            // !A | A -> -1 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if is_negative_of(&left, &right) && !info.nullable(&right)? => {
+                Expr::Literal(ScalarValue::Int32(Some(-1)))
+            }
+
+            // A | !A -> -1 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if is_negative_of(&right, &left) && !info.nullable(&left)? => {
+                Expr::Literal(ScalarValue::Int32(Some(-1)))
+            }
+
+            // (..A..) | A --> (..A..)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if expr_contains(&left, &right, BitwiseOr) => *left,
+
+            // A | (..A..) --> (..A..)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if expr_contains(&left, &right, BitwiseOr) => *left,
+
+            // A | (A & B) --> A (if B not null)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if !info.nullable(&right)? && is_op_with(BitwiseAnd, &right, &left) => {
+                *left
+            }
+
+            // (A & B) | A --> A (if B not null)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if !info.nullable(&left)? && is_op_with(BitwiseAnd, &left, &right) => {
+                *right
+            }
+
+            //
+            // Rules for BitwiseXor
+            //
+
+            // A ^ null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseXor,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null ^ A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseXor,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A ^ 0 -> A
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseXor,
+                right,
+            }) if is_zero(&right) => *left,
+
+            // 0 ^ A -> A
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseXor,
+                right,
+            }) if is_zero(&left) => *right,
+
+            // !A ^ A -> -1 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseXor,
+                right,
+            }) if is_negative_of(&left, &right) && !info.nullable(&right)? => {
+                Expr::Literal(ScalarValue::Int32(Some(-1)))

Review Comment:
   needs the right `ScalarValue` variant



##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -700,6 +701,288 @@ impl<'a, S: SimplifyInfo> ExprRewriter for Simplifier<'a, S> {
                 return Err(DataFusionError::ArrowError(ArrowError::DivideByZero));
             }
 
+            //
+            // Rules for BitwiseAnd
+            //
+
+            // A & null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseAnd,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null & A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A & 0 -> 0
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseAnd,
+                right,
+            }) if is_zero(&right) => *right,
+
+            // 0 & A -> 0
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right: _,
+            }) if is_zero(&left) => *left,
+
+            // !A & A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if is_negative_of(&left, &right) && !info.nullable(&right)? => {
+                Expr::Literal(ScalarValue::Int32(Some(0)))
+            }
+
+            // A & !A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if is_negative_of(&right, &left) && !info.nullable(&left)? => {
+                Expr::Literal(ScalarValue::Int32(Some(0)))
+            }
+
+            // (..A..) & A --> (..A..)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if expr_contains(&left, &right, BitwiseAnd) => *left,
+
+            // A & (..A..) --> (..A..)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if expr_contains(&right, &left, BitwiseAnd) => *right,
+
+            // A & (A | B) --> A (if B not null)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&right)? && is_op_with(BitwiseOr, &right, &left) => {
+                *left
+            }
+
+            // (A | B) & A --> A (if B not null)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&left)? && is_op_with(BitwiseOr, &left, &right) => {
+                *right
+            }
+
+            //
+            // Rules for BitwiseOr
+            //
+
+            // A | null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseOr,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null | A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A | 0 -> A
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if is_zero(&right) => *left,
+
+            // 0 | A -> A
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if is_zero(&left) => *right,
+
+            // !A | A -> -1 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if is_negative_of(&left, &right) && !info.nullable(&right)? => {
+                Expr::Literal(ScalarValue::Int32(Some(-1)))

Review Comment:
   Likewise I think can't always use `ScalarValue::Int32` -- it needs to use the appropriate variant of `ScalarValue` for the type of left/right



##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -700,6 +701,288 @@ impl<'a, S: SimplifyInfo> ExprRewriter for Simplifier<'a, S> {
                 return Err(DataFusionError::ArrowError(ArrowError::DivideByZero));
             }
 
+            //
+            // Rules for BitwiseAnd
+            //
+
+            // A & null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseAnd,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null & A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A & 0 -> 0
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseAnd,
+                right,
+            }) if is_zero(&right) => *right,
+
+            // 0 & A -> 0
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right: _,
+            }) if is_zero(&left) => *left,
+
+            // !A & A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if is_negative_of(&left, &right) && !info.nullable(&right)? => {
+                Expr::Literal(ScalarValue::Int32(Some(0)))

Review Comment:
   It seems like this should not always be `ScalarValue::Int32` -- if `left` and `right` are `DataType::Int64` for example, I think this should be `ScalarValue::Int64(Some(0))`. This applies to the other rules as well.
   
   Perhaps we could add a method like `ScalarValue::new_zero(data_type: DataType)` to create such a constant
   
   
   
   



##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -700,6 +701,288 @@ impl<'a, S: SimplifyInfo> ExprRewriter for Simplifier<'a, S> {
                 return Err(DataFusionError::ArrowError(ArrowError::DivideByZero));
             }
 
+            //
+            // Rules for BitwiseAnd
+            //
+
+            // A & null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseAnd,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null & A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A & 0 -> 0
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseAnd,
+                right,
+            }) if is_zero(&right) => *right,
+
+            // 0 & A -> 0
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right: _,
+            }) if is_zero(&left) => *left,
+
+            // !A & A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if is_negative_of(&left, &right) && !info.nullable(&right)? => {
+                Expr::Literal(ScalarValue::Int32(Some(0)))
+            }
+
+            // A & !A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if is_negative_of(&right, &left) && !info.nullable(&left)? => {
+                Expr::Literal(ScalarValue::Int32(Some(0)))
+            }
+
+            // (..A..) & A --> (..A..)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if expr_contains(&left, &right, BitwiseAnd) => *left,
+
+            // A & (..A..) --> (..A..)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if expr_contains(&right, &left, BitwiseAnd) => *right,
+
+            // A & (A | B) --> A (if B not null)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&right)? && is_op_with(BitwiseOr, &right, &left) => {
+                *left
+            }
+
+            // (A | B) & A --> A (if B not null)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&left)? && is_op_with(BitwiseOr, &left, &right) => {
+                *right
+            }
+
+            //
+            // Rules for BitwiseOr
+            //
+
+            // A | null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseOr,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null | A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A | 0 -> A
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if is_zero(&right) => *left,

Review Comment:
   same thing here -- I think it only applies if `right` is not `NULL`



##########
datafusion/optimizer/src/simplify_expressions/utils.rs:
##########
@@ -77,6 +78,80 @@ pub fn expr_contains(expr: &Expr, needle: &Expr, search_op: Operator) -> bool {
     }
 }
 
+/// This enum is needed for the function "delete_expressions_in_complex_expr" for

Review Comment:
   ```suggestion
   /// This enum is needed for the function "delete_xor_in_complex_expr" for
   ```



##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -1346,6 +1629,361 @@ mod tests {
         assert_eq!(simplify(expr), expected);
     }
 
+    #[test]
+    fn test_simplify_bitwise_xor_by_null() {
+        let null = Expr::Literal(ScalarValue::Null);
+        // A ^ null --> null
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseXor, null.clone());
+            assert_eq!(simplify(expr), null);
+        }
+        // null ^ A --> null
+        {
+            let expr = binary_expr(null.clone(), Operator::BitwiseXor, col("c2"));
+            assert_eq!(simplify(expr), null);
+        }
+    }
+
+    #[test]
+    fn test_simplify_bitwise_shift_right_by_null() {
+        let null = Expr::Literal(ScalarValue::Null);
+        // A >> null --> null
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseShiftRight, null.clone());
+            assert_eq!(simplify(expr), null);
+        }
+        // null >> A --> null
+        {
+            let expr = binary_expr(null.clone(), Operator::BitwiseShiftRight, col("c2"));
+            assert_eq!(simplify(expr), null);
+        }
+    }
+
+    #[test]
+    fn test_simplify_bitwise_shift_left_by_null() {
+        let null = Expr::Literal(ScalarValue::Null);
+        // A << null --> null
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseShiftLeft, null.clone());
+            assert_eq!(simplify(expr), null);
+        }
+        // null << A --> null
+        {
+            let expr = binary_expr(null.clone(), Operator::BitwiseShiftLeft, col("c2"));
+            assert_eq!(simplify(expr), null);
+        }
+    }
+
+    #[test]
+    fn test_simplify_bitwise_and_by_zero() {
+        // A & 0 --> 0
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseAnd, lit(0));
+            assert_eq!(simplify(expr), lit(0));
+        }
+        // 0 & A --> 0
+        {
+            let expr = binary_expr(lit(0), Operator::BitwiseAnd, col("c2"));
+            assert_eq!(simplify(expr), lit(0));
+        }
+    }
+
+    #[test]
+    fn test_simplify_bitwise_or_by_zero() {
+        // A | 0 --> A
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseOr, lit(0));
+            assert_eq!(simplify(expr), col("c2"));
+        }
+        // 0 | A --> A
+        {
+            let expr = binary_expr(lit(0), Operator::BitwiseOr, col("c2"));
+            assert_eq!(simplify(expr), col("c2"));
+        }
+    }
+
+    #[test]
+    fn test_simplify_bitwise_xor_by_zero() {
+        // A ^ 0 --> A
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseXor, lit(0));
+            assert_eq!(simplify(expr), col("c2"));
+        }
+        // 0 ^ A --> A
+        {
+            let expr = binary_expr(lit(0), Operator::BitwiseXor, col("c2"));
+            assert_eq!(simplify(expr), col("c2"));
+        }
+    }
+
+    #[test]
+    fn test_simplify_bitwise_bitwise_shift_right_by_zero() {
+        // A >> 0 --> A
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseShiftRight, lit(0));
+            assert_eq!(simplify(expr), col("c2"));
+        }
+    }
+
+    #[test]
+    fn test_simplify_bitwise_bitwise_shift_left_by_zero() {
+        // A << 0 --> A
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseShiftLeft, lit(0));
+            assert_eq!(simplify(expr), col("c2"));
+        }
+    }
+
+    #[test]
+    fn test_simplify_bitwise_and_by_null() {
+        let null = Expr::Literal(ScalarValue::Null);
+        // A & null --> null
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseAnd, null.clone());
+            assert_eq!(simplify(expr), null);
+        }
+        // null & A --> null
+        {
+            let expr = binary_expr(null.clone(), Operator::BitwiseAnd, col("c2"));
+            assert_eq!(simplify(expr), null);
+        }
+    }
+
+    #[test]
+    fn test_simplify_composed_bitwise_and() {
+        // ((c > 5) & (c1 < 6)) & (c > 5) --> ((c > 5) & c1)

Review Comment:
   I think this should refer to c1 and c2 (rather than c and c1)
   ```
           // ((c2 > 5) & (c1 < 6)) & (c2 > 5) --> ((c2 > 5) & c1)
   ```
   
   It might be valuable to add a test for the other grouping as well:
   
   ```
           // (c > 5) & ((c1 < 6) & (c > 5)) --> c1 & (c > 5)
   ```



##########
datafusion/optimizer/src/simplify_expressions/utils.rs:
##########
@@ -154,11 +229,16 @@ pub fn is_op_with(target_op: Operator, haystack: &Expr, needle: &Expr) -> bool {
     matches!(haystack, Expr::BinaryExpr(BinaryExpr { left, op, right }) if op == &target_op && (needle == left.as_ref() || needle == right.as_ref()))
 }
 
-/// returns true if `not_expr` is !`expr`
+/// returns true if `not_expr` is !`expr` (not)
 pub fn is_not_of(not_expr: &Expr, expr: &Expr) -> bool {
     matches!(not_expr, Expr::Not(inner) if expr == inner.as_ref())
 }
 
+/// returns true if `not_expr` is !`expr` (bitwise not)
+pub fn is_negative_of(not_expr: &Expr, expr: &Expr) -> bool {
+    matches!(not_expr, Expr::Negative(inner) if expr == inner.as_ref())

Review Comment:
   Shouldn't this be checking for `Expr::Not` rather than `Expr::Negative`?
   
   ```suggestion
       matches!(not_expr, Expr::Not(inner) if expr == inner.as_ref())
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] izveigor commented on a diff in pull request #5423: feat: add optimization rules for bitwise operations

Posted by "izveigor (via GitHub)" <gi...@apache.org>.
izveigor commented on code in PR #5423:
URL: https://github.com/apache/arrow-datafusion/pull/5423#discussion_r1125686357


##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -1346,6 +1643,522 @@ mod tests {
         assert_eq!(simplify(expr), expected);
     }
 
+    #[test]
+    fn test_simplify_bitwise_xor_by_null() {
+        let null = Expr::Literal(ScalarValue::Null);
+        // A ^ null --> null
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseXor, null.clone());

Review Comment:
   All 'nullable' tests in expr_simplifier have the same style, I think it can break convention.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] izveigor commented on a diff in pull request #5423: feat: add optimization rules for bitwise operations

Posted by "izveigor (via GitHub)" <gi...@apache.org>.
izveigor commented on code in PR #5423:
URL: https://github.com/apache/arrow-datafusion/pull/5423#discussion_r1125504313


##########
datafusion/optimizer/src/simplify_expressions/utils.rs:
##########
@@ -154,11 +229,16 @@ pub fn is_op_with(target_op: Operator, haystack: &Expr, needle: &Expr) -> bool {
     matches!(haystack, Expr::BinaryExpr(BinaryExpr { left, op, right }) if op == &target_op && (needle == left.as_ref() || needle == right.as_ref()))
 }
 
-/// returns true if `not_expr` is !`expr`
+/// returns true if `not_expr` is !`expr` (not)
 pub fn is_not_of(not_expr: &Expr, expr: &Expr) -> bool {
     matches!(not_expr, Expr::Not(inner) if expr == inner.as_ref())
 }
 
+/// returns true if `not_expr` is !`expr` (bitwise not)
+pub fn is_negative_of(not_expr: &Expr, expr: &Expr) -> bool {
+    matches!(not_expr, Expr::Negative(inner) if expr == inner.as_ref())

Review Comment:
   Tests with ‘Expr::Not’ do not pass.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] izveigor commented on a diff in pull request #5423: feat: add optimization rules for bitwise operations

Posted by "izveigor (via GitHub)" <gi...@apache.org>.
izveigor commented on code in PR #5423:
URL: https://github.com/apache/arrow-datafusion/pull/5423#discussion_r1125499284


##########
datafusion/optimizer/src/simplify_expressions/context.rs:
##########
@@ -123,6 +126,18 @@ impl<'a> SimplifyInfo for SimplifyContext<'a> {
             })
     }
 
+    /// Returns data type of this expr needed for determining optimized int type of a value
+    fn get_data_type(&self, expr: &Expr) -> DataType {

Review Comment:
   Should Int32 be the default value?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5423: feat: add optimization rules for bitwise operations

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5423:
URL: https://github.com/apache/arrow-datafusion/pull/5423#discussion_r1125500622


##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -700,6 +705,298 @@ impl<'a, S: SimplifyInfo> ExprRewriter for Simplifier<'a, S> {
                 return Err(DataFusionError::ArrowError(ArrowError::DivideByZero));
             }
 
+            //
+            // Rules for BitwiseAnd
+            //
+
+            // A & null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseAnd,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null & A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A & 0 -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&left)? && is_zero(&right) => *right,
+
+            // 0 & A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&right)? && is_zero(&left) => *left,
+
+            // !A & A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if is_negative_of(&left, &right) && !info.nullable(&right)? => {
+                new_int_by_expr_data_type(0, &info.get_data_type(&left))

Review Comment:
   I think `ScalarValue::new_zero` might be better here
   
   https://github.com/apache/arrow-datafusion/blob/e9852074bacd8c891d84eba38b3417aa16a2d18c/datafusion/common/src/scalar.rs#L1023-L1043
   
   



##########
datafusion/optimizer/src/simplify_expressions/utils.rs:
##########
@@ -77,6 +78,61 @@ pub fn expr_contains(expr: &Expr, needle: &Expr, search_op: Operator) -> bool {
     }
 }
 
+/// Deletes all 'needles' or remains one 'needle' that are found in a chain of xor
+/// expressions. Such as: A ^ (A ^ (B ^ A))
+pub fn delete_xor_in_complex_expr(expr: &Expr, needle: &Expr, is_left: bool) -> Expr {

Review Comment:
   I wonder if you could implement this using `ExprRewriter`: https://github.com/apache/arrow-datafusion/blob/e9852074bacd8c891d84eba38b3417aa16a2d18c/datafusion/expr/src/utils.rs#L519
   
   🤔 



##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -1346,6 +1643,522 @@ mod tests {
         assert_eq!(simplify(expr), expected);
     }
 
+    #[test]
+    fn test_simplify_bitwise_xor_by_null() {
+        let null = Expr::Literal(ScalarValue::Null);
+        // A ^ null --> null
+        {
+            let expr = binary_expr(col("c2"), Operator::BitwiseXor, null.clone());

Review Comment:
   Is there a reason you didn't use the new expr_fn's like `bitwise_xor` that were added in datafusion/expr/src/expr.rs ? I think these tessts would be significantly more concise if you did



##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -700,6 +705,298 @@ impl<'a, S: SimplifyInfo> ExprRewriter for Simplifier<'a, S> {
                 return Err(DataFusionError::ArrowError(ArrowError::DivideByZero));
             }
 
+            //
+            // Rules for BitwiseAnd
+            //
+
+            // A & null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseAnd,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null & A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A & 0 -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&left)? && is_zero(&right) => *right,
+
+            // 0 & A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&right)? && is_zero(&left) => *left,
+
+            // !A & A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if is_negative_of(&left, &right) && !info.nullable(&right)? => {
+                new_int_by_expr_data_type(0, &info.get_data_type(&left))

Review Comment:
   I think you could perhaps add a either `new_negative_one` or  `new_int(val: i64, data_type: &DataType) -> Result<ScalarValue>` to handle the -1 case



##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -700,6 +705,298 @@ impl<'a, S: SimplifyInfo> ExprRewriter for Simplifier<'a, S> {
                 return Err(DataFusionError::ArrowError(ArrowError::DivideByZero));
             }
 
+            //
+            // Rules for BitwiseAnd
+            //
+
+            // A & null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseAnd,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null & A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A & 0 -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&left)? && is_zero(&right) => *right,
+
+            // 0 & A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&right)? && is_zero(&left) => *left,
+
+            // !A & A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if is_negative_of(&left, &right) && !info.nullable(&right)? => {
+                new_int_by_expr_data_type(0, &info.get_data_type(&left))
+            }
+
+            // A & !A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if is_negative_of(&right, &left) && !info.nullable(&left)? => {
+                new_int_by_expr_data_type(0, &info.get_data_type(&left))
+            }
+
+            // (..A..) & A --> (..A..)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if expr_contains(&left, &right, BitwiseAnd) => *left,
+
+            // A & (..A..) --> (..A..)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if expr_contains(&right, &left, BitwiseAnd) => *right,
+
+            // A & (A | B) --> A (if B not null)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&right)? && is_op_with(BitwiseOr, &right, &left) => {
+                *left
+            }
+
+            // (A | B) & A --> A (if B not null)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&left)? && is_op_with(BitwiseOr, &left, &right) => {
+                *right
+            }
+
+            //
+            // Rules for BitwiseOr
+            //
+
+            // A | null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseOr,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null | A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A | 0 -> A (if A not nullable)

Review Comment:
   I think `A | 0 -> A` works even if A is nullable



##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -700,6 +705,298 @@ impl<'a, S: SimplifyInfo> ExprRewriter for Simplifier<'a, S> {
                 return Err(DataFusionError::ArrowError(ArrowError::DivideByZero));
             }
 
+            //
+            // Rules for BitwiseAnd
+            //
+
+            // A & null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseAnd,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null & A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A & 0 -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&left)? && is_zero(&right) => *right,
+
+            // 0 & A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&right)? && is_zero(&left) => *left,
+
+            // !A & A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if is_negative_of(&left, &right) && !info.nullable(&right)? => {
+                new_int_by_expr_data_type(0, &info.get_data_type(&left))
+            }
+
+            // A & !A -> 0 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if is_negative_of(&right, &left) && !info.nullable(&left)? => {
+                new_int_by_expr_data_type(0, &info.get_data_type(&left))
+            }
+
+            // (..A..) & A --> (..A..)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if expr_contains(&left, &right, BitwiseAnd) => *left,
+
+            // A & (..A..) --> (..A..)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if expr_contains(&right, &left, BitwiseAnd) => *right,
+
+            // A & (A | B) --> A (if B not null)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&right)? && is_op_with(BitwiseOr, &right, &left) => {
+                *left
+            }
+
+            // (A | B) & A --> A (if B not null)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseAnd,
+                right,
+            }) if !info.nullable(&left)? && is_op_with(BitwiseOr, &left, &right) => {
+                *right
+            }
+
+            //
+            // Rules for BitwiseOr
+            //
+
+            // A | null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseOr,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null | A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A | 0 -> A (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if !info.nullable(&left)? && is_zero(&right) => *left,
+
+            // 0 | A -> A (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if !info.nullable(&right)? && is_zero(&left) => *right,
+
+            // !A | A -> -1 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if is_negative_of(&left, &right) && !info.nullable(&right)? => {
+                new_int_by_expr_data_type(-1, &info.get_data_type(&left))
+            }
+
+            // A | !A -> -1 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if is_negative_of(&right, &left) && !info.nullable(&left)? => {
+                new_int_by_expr_data_type(-1, &info.get_data_type(&left))
+            }
+
+            // (..A..) | A --> (..A..)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if expr_contains(&left, &right, BitwiseOr) => *left,
+
+            // A | (..A..) --> (..A..)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if expr_contains(&right, &left, BitwiseOr) => *right,
+
+            // A | (A & B) --> A (if B not null)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if !info.nullable(&right)? && is_op_with(BitwiseAnd, &right, &left) => {
+                *left
+            }
+
+            // (A & B) | A --> A (if B not null)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseOr,
+                right,
+            }) if !info.nullable(&left)? && is_op_with(BitwiseAnd, &left, &right) => {
+                *right
+            }
+
+            //
+            // Rules for BitwiseXor
+            //
+
+            // A ^ null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseXor,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null ^ A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseXor,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A ^ 0 -> A (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseXor,
+                right,
+            }) if !info.nullable(&left)? && is_zero(&right) => *left,
+
+            // 0 ^ A -> A (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseXor,
+                right,
+            }) if !info.nullable(&right)? && is_zero(&left) => *right,
+
+            // !A ^ A -> -1 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseXor,
+                right,
+            }) if is_negative_of(&left, &right) && !info.nullable(&right)? => {
+                new_int_by_expr_data_type(-1, &info.get_data_type(&left))
+            }
+
+            // A ^ !A -> -1 (if A not nullable)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseXor,
+                right,
+            }) if is_negative_of(&right, &left) && !info.nullable(&left)? => {
+                new_int_by_expr_data_type(-1, &info.get_data_type(&left))
+            }
+
+            // (..A..) ^ A --> (the expression without A, if number of A is odd, otherwise one A)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseXor,
+                right,
+            }) if expr_contains(&left, &right, BitwiseXor) => {
+                let expr = delete_xor_in_complex_expr(&left, &right, false);
+                if expr == *right {
+                    new_int_by_expr_data_type(0, &info.get_data_type(&right))
+                } else {
+                    expr
+                }
+            }
+
+            // A ^ (..A..) --> (the expression without A, if number of A is odd, otherwise one A)
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseXor,
+                right,
+            }) if expr_contains(&right, &left, BitwiseXor) => {
+                let expr = delete_xor_in_complex_expr(&right, &left, true);
+                if expr == *left {
+                    new_int_by_expr_data_type(0, &info.get_data_type(&left))
+                } else {
+                    expr
+                }
+            }
+
+            //
+            // Rules for BitwiseShiftRight
+            //
+
+            // A >> null -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left: _,
+                op: BitwiseShiftRight,
+                right,
+            }) if is_null(&right) => *right,
+
+            // null >> A -> null
+            Expr::BinaryExpr(BinaryExpr {
+                left,
+                op: BitwiseShiftRight,
+                right: _,
+            }) if is_null(&left) => *left,
+
+            // A >> 0 -> A (if A not nullable)

Review Comment:
   I think this still works even if A is nullable too



##########
datafusion/optimizer/src/simplify_expressions/utils.rs:
##########
@@ -154,11 +210,31 @@ pub fn is_op_with(target_op: Operator, haystack: &Expr, needle: &Expr) -> bool {
     matches!(haystack, Expr::BinaryExpr(BinaryExpr { left, op, right }) if op == &target_op && (needle == left.as_ref() || needle == right.as_ref()))
 }
 
-/// returns true if `not_expr` is !`expr`
+/// returns true if `not_expr` is !`expr` (not)
 pub fn is_not_of(not_expr: &Expr, expr: &Expr) -> bool {
     matches!(not_expr, Expr::Not(inner) if expr == inner.as_ref())
 }
 
+/// returns true if `not_expr` is !`expr` (bitwise not)
+pub fn is_negative_of(not_expr: &Expr, expr: &Expr) -> bool {
+    matches!(not_expr, Expr::Negative(inner) if expr == inner.as_ref())
+}
+
+/// Returns a new int type based on data type of an expression
+pub fn new_int_by_expr_data_type(val: i64, expr_data_type: &DataType) -> Expr {

Review Comment:
   does it need to be pub?



##########
datafusion/optimizer/src/simplify_expressions/utils.rs:
##########
@@ -154,11 +229,16 @@ pub fn is_op_with(target_op: Operator, haystack: &Expr, needle: &Expr) -> bool {
     matches!(haystack, Expr::BinaryExpr(BinaryExpr { left, op, right }) if op == &target_op && (needle == left.as_ref() || needle == right.as_ref()))
 }
 
-/// returns true if `not_expr` is !`expr`
+/// returns true if `not_expr` is !`expr` (not)
 pub fn is_not_of(not_expr: &Expr, expr: &Expr) -> bool {
     matches!(not_expr, Expr::Not(inner) if expr == inner.as_ref())
 }
 
+/// returns true if `not_expr` is !`expr` (bitwise not)
+pub fn is_negative_of(not_expr: &Expr, expr: &Expr) -> bool {
+    matches!(not_expr, Expr::Negative(inner) if expr == inner.as_ref())

Review Comment:
   Any thoughts here @izveigor ?



##########
datafusion/optimizer/src/simplify_expressions/context.rs:
##########
@@ -123,6 +126,18 @@ impl<'a> SimplifyInfo for SimplifyContext<'a> {
             })
     }
 
+    /// Returns data type of this expr needed for determining optimized int type of a value
+    fn get_data_type(&self, expr: &Expr) -> DataType {

Review Comment:
   I think this should simply return the error if the data type can not be determined -- it seems wrong to treat it as a 32 bit number 🤔 . 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] izveigor commented on pull request #5423: feat: add optimization rules for bitwise operations

Posted by "izveigor (via GitHub)" <gi...@apache.org>.
izveigor commented on PR #5423:
URL: https://github.com/apache/arrow-datafusion/pull/5423#issuecomment-1452157461

   Hello, @alamb!
   Thanks to your response, I took into account your comments and fixed all defects. I hope this PR will be merged into main soon.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5423: feat: add optimization rules for bitwise operations

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5423:
URL: https://github.com/apache/arrow-datafusion/pull/5423#discussion_r1125561391


##########
datafusion/optimizer/src/simplify_expressions/utils.rs:
##########
@@ -154,11 +229,16 @@ pub fn is_op_with(target_op: Operator, haystack: &Expr, needle: &Expr) -> bool {
     matches!(haystack, Expr::BinaryExpr(BinaryExpr { left, op, right }) if op == &target_op && (needle == left.as_ref() || needle == right.as_ref()))
 }
 
-/// returns true if `not_expr` is !`expr`
+/// returns true if `not_expr` is !`expr` (not)
 pub fn is_not_of(not_expr: &Expr, expr: &Expr) -> bool {
     matches!(not_expr, Expr::Not(inner) if expr == inner.as_ref())
 }
 
+/// returns true if `not_expr` is !`expr` (bitwise not)
+pub fn is_negative_of(not_expr: &Expr, expr: &Expr) -> bool {
+    matches!(not_expr, Expr::Negative(inner) if expr == inner.as_ref())

Review Comment:
   Sorry -- somehow I misread this function name and was confused. I agree this looks good



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb merged pull request #5423: feat: add optimization rules for bitwise operations

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb merged PR #5423:
URL: https://github.com/apache/arrow-datafusion/pull/5423


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org