Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/21 06:51:29 UTC

[GitHub] [arrow-datafusion] src255 opened a new pull request, #3915: Simplify redundant predicates

src255 opened a new pull request, #3915:
URL: https://github.com/apache/arrow-datafusion/pull/3915

   Write rules to simplify both `a OR a` and `a AND a` to `a`.
   
   Hello! I wanted to help with issue #3895. This is my attempt, but I'm not sure whether comparing the `left` and `right` fields of the `BinaryExpr` struct this way is the desired approach. I hope this helps, and I would appreciate any feedback!
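   
   For illustration, here is a minimal, self-contained sketch of the idea behind both rules. It is not DataFusion's implementation: the `Expr` enum and the `simplify` function below are hypothetical stand-ins invented for this example, and the sketch assumes that two operands count as redundant exactly when they are structurally equal (`==` via a derived `PartialEq`).
   
   ```rust
   // Toy expression tree (hypothetical; NOT DataFusion's `Expr`).
   #[derive(Debug, Clone, PartialEq)]
   enum Expr {
       Column(String),
       Literal(i64),
       Gt(Box<Expr>, Box<Expr>),
       And(Box<Expr>, Box<Expr>),
       Or(Box<Expr>, Box<Expr>),
   }
   
   // Rewrites `a AND a` and `a OR a` to `a`, bottom-up.
   // Operands are compared structurally with `==`.
   fn simplify(expr: Expr) -> Expr {
       match expr {
           Expr::And(left, right) => {
               // Simplify the children first so nested duplicates collapse too.
               let (left, right) = (simplify(*left), simplify(*right));
               if left == right {
                   left
               } else {
                   Expr::And(Box::new(left), Box::new(right))
               }
           }
           Expr::Or(left, right) => {
               let (left, right) = (simplify(*left), simplify(*right));
               if left == right {
                   left
               } else {
                   Expr::Or(Box::new(left), Box::new(right))
               }
           }
           // Recurse through comparisons; leaves are returned unchanged.
           Expr::Gt(left, right) => {
               Expr::Gt(Box::new(simplify(*left)), Box::new(simplify(*right)))
           }
           other => other,
       }
   }
   ```
   
   With these definitions, `simplify` applied to `(b > 1) OR (b > 1)` returns `b > 1`, while an expression such as `(b > 1) OR c` is returned unchanged because its operands differ.
   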
   # Which issue does this PR close?
   
   Closes #3895.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] isidentical commented on pull request #3915: Simplify redundant predicates

Posted by GitBox <gi...@apache.org>.
isidentical commented on PR #3915:
URL: https://github.com/apache/arrow-datafusion/pull/3915#issuecomment-1287307982

   Thank you for the PR, @src255. I think it looks pretty good as is (an example of a similar comparison is available inside `expr_contains`). I think all you have to do next is add some tests (a very similar example is `test_simplify_optimized_plan`). One thing to note: I think we might already handle `a AND a`, so it might make sense to check that the test fails first and is then fixed by your revision. But `a OR a` should still be good.




[GitHub] [arrow-datafusion] isidentical commented on pull request #3915: Simplify redundant predicates

Posted by GitBox <gi...@apache.org>.
isidentical commented on PR #3915:
URL: https://github.com/apache/arrow-datafusion/pull/3915#issuecomment-1287381751

   Thanks for checking it out. That seems to be true (see the simplification linked below) 🤔, which probably means we are missing something else somewhere in #3859 (cc: @alamb @Ted-Jiang).
   
   https://github.com/apache/arrow-datafusion/blob/e534c2536858cf18aac219c33b0bddef54c7f214/datafusion/optimizer/src/simplify_expressions.rs#L768-L779
   




[GitHub] [arrow-datafusion] alamb merged pull request #3915: Simplify redundant predicates

Posted by GitBox <gi...@apache.org>.
alamb merged PR #3915:
URL: https://github.com/apache/arrow-datafusion/pull/3915




[GitHub] [arrow-datafusion] src255 commented on pull request #3915: Simplify redundant predicates

Posted by GitBox <gi...@apache.org>.
src255 commented on PR #3915:
URL: https://github.com/apache/arrow-datafusion/pull/3915#issuecomment-1287373139

   Thanks for the feedback. Here is the new test I wrote with `test_simplify_optimized_plan` as a guide:
   ```rust
   #[test]
   fn test_simplify_optimized_plan_with_or() {
       let table_scan = test_table_scan();
       let plan = LogicalPlanBuilder::from(table_scan)
           .project(vec![col("a")])
           .unwrap()
           .filter(or(col("b").gt(lit(1)), col("b").gt(lit(1))))  // use `or` instead of `and`
           .unwrap()
           .build()
           .unwrap();
   
       assert_optimized_plan_eq(
           &plan,
           "\
           Filter: test.b > Int32(1)\
           \n  Projection: test.a\
           \n    TableScan: test",
       );
   }
   ```
   After running this test, I noticed that both `test_simplify_optimized_plan_with_or` and `test_simplify_optimized_plan` pass without my changes (that is, with the `OR` and `AND` changes commented out)! Perhaps the optimizer already makes *both* simplifications:
   - `a OR a` --> `a`
   - `a AND a` --> `a`.




[GitHub] [arrow-datafusion] alamb commented on pull request #3915: Simplify redundant predicates

Posted by GitBox <gi...@apache.org>.
alamb commented on PR #3915:
URL: https://github.com/apache/arrow-datafusion/pull/3915#issuecomment-1287758050

   Thanks for the test and investigation, @src255 and @isidentical -- clearly I got something wrong.

