Posted to github@arrow.apache.org by "doki23 (via GitHub)" <gi...@apache.org> on 2023/04/02 07:11:46 UTC

[GitHub] [arrow-datafusion] doki23 opened a new issue, #5830: Simplify predicate expressions like 'a > 1 and a < 1' to constant false

doki23 opened a new issue, #5830:
URL: https://github.com/apache/arrow-datafusion/issues/5830

   ### Is your feature request related to a problem or challenge?
   
   ```sql
   ❯ explain select * from t where a > 1 and a < 1;
   +---------------+-------------------------------------------------------------------------------+
   | plan_type     | plan                                                                          |
   +---------------+-------------------------------------------------------------------------------+
   | logical_plan  | Filter: t.a > Int32(1) AND t.a < Int32(1)                                     |
   |               |   TableScan: t projection=[a, b]                                              |
   | physical_plan | CoalesceBatchesExec: target_batch_size=8192                                   |
   |               |   FilterExec: a@0 > 1 AND a@0 < 1                                             |
   |               |     MemoryExec: partitions=10, partition_sizes=[1, 0, 0, 0, 0, 0, 0, 0, 0, 0] |
   |               |                                                                               |
   +---------------+-------------------------------------------------------------------------------+
   ```
   DataFusion still scans table `t` and executes the filter, even though no row can satisfy the predicate.
   
   
   ### Describe the solution you'd like
   
   Add an optimizer rule that simplifies this predicate to a constant boolean value, so that the logical plan becomes the same as for `select * from t where false`:
   ```sql
   ❯ explain select * from t where false;
   +---------------+----------------------------------+
   | plan_type     | plan                             |
   +---------------+----------------------------------+
   | logical_plan  | EmptyRelation                    |
   | physical_plan | EmptyExec: produce_one_row=false |
   |               |                                  |
   +---------------+----------------------------------+
   ```
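
   A minimal sketch of how such a rule could detect the contradiction, assuming every conjunct compares the same `Int32` column with a constant. The `Op` enum and `conjunction_is_unsatisfiable` are hypothetical names for illustration only, not DataFusion APIs. The idea is to intersect the ranges allowed by each conjunct and report a contradiction when the intersection is empty.

   ```rust
   /// Comparison operators of the form `column OP constant`.
   /// Hypothetical type for illustration, not part of DataFusion.
   #[derive(Debug, Clone, Copy, PartialEq)]
   enum Op {
       Lt,
       LtEq,
       Gt,
       GtEq,
       Eq,
   }

   /// Returns true when no i32 value can satisfy every `a OP constant`
   /// conjunct, i.e. the whole AND can never evaluate to true.
   fn conjunction_is_unsatisfiable(conjuncts: &[(Op, i32)]) -> bool {
       // Track the inclusive range [lo, hi] in i64 so that `c + 1` and
       // `c - 1` cannot overflow at the i32 limits.
       let mut lo = i64::from(i32::MIN);
       let mut hi = i64::from(i32::MAX);
       for &(op, c) in conjuncts {
           let c = i64::from(c);
           match op {
               Op::Gt => lo = lo.max(c + 1),
               Op::GtEq => lo = lo.max(c),
               Op::Lt => hi = hi.min(c - 1),
               Op::LtEq => hi = hi.min(c),
               Op::Eq => {
                   lo = lo.max(c);
                   hi = hi.min(c);
               }
           }
       }
       // An empty range means no value satisfies all conjuncts.
       lo > hi
   }
   ```

   For `a > 1 and a < 1` the range collapses to `[2, 0]`, so `conjunction_is_unsatisfiable(&[(Op::Gt, 1), (Op::Lt, 1)])` returns `true`. This only shows that the predicate is never true; for a NULL input it evaluates to NULL rather than false, which is enough for a filter but not for an arbitrary expression.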
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] doki23 commented on issue #5830: Simplify predicate expressions like 'a > 1 and a < 1' to constant false

Posted by "doki23 (via GitHub)" <gi...@apache.org>.
doki23 commented on issue #5830:
URL: https://github.com/apache/arrow-datafusion/issues/5830#issuecomment-1495320987

   Alright, let's change the example to `a is not null and b is not null and a > 1 and a < 100 and b > 1 and b < 100 and a + b < 2`.




[GitHub] [arrow-datafusion] alamb commented on issue #5830: Simplify predicate expressions like 'a > 1 and a < 1' to constant false

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #5830:
URL: https://github.com/apache/arrow-datafusion/issues/5830#issuecomment-1497466044

   > A related idea: might we need an easy-to-use expression rewriting framework for adding rules that rewrite expressions?
   
   
   This is a neat idea, @jackwener, and people have suggested this or similar things before (for example, I think https://github.com/apache/arrow-datafusion/pull/1485 was one such suggestion).
   
   Having general-purpose rewrite rules certainly sounds like a good idea, though I don't have much knowledge of how well they work in practice in query engines.




[GitHub] [arrow-datafusion] jackwener commented on issue #5830: Simplify predicate expressions like 'a > 1 and a < 1' to constant false

Posted by "jackwener (via GitHub)" <gi...@apache.org>.
jackwener commented on issue #5830:
URL: https://github.com/apache/arrow-datafusion/issues/5830#issuecomment-1495430901

   Oh, I notice this issue already specifies `Predicate`, so `a > 1 and a < 1` can be folded to `false`,
   because a predicate contains an implicit `not null`.
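
   The reasoning can be sketched with SQL three-valued logic, modelling a nullable `Int32` as `Option<i32>` and a SQL boolean as `Option<bool>`. All names below are hypothetical and only illustrate the semantics; they are not DataFusion code. With a NULL input the predicate evaluates to NULL rather than false, but a filter keeps a row only when the predicate is exactly true, so inside a filter the two cases are indistinguishable.

   ```rust
   /// SQL boolean: Some(true), Some(false), or None for NULL (unknown).
   type SqlBool = Option<bool>;

   /// `value > constant`; comparing NULL with anything yields NULL.
   fn gt(value: Option<i32>, constant: i32) -> SqlBool {
       value.map(|v| v > constant)
   }

   /// `value < constant`; comparing NULL with anything yields NULL.
   fn lt(value: Option<i32>, constant: i32) -> SqlBool {
       value.map(|v| v < constant)
   }

   /// Three-valued AND: false wins, otherwise NULL propagates.
   fn sql_and(left: SqlBool, right: SqlBool) -> SqlBool {
       match (left, right) {
           (Some(false), _) | (_, Some(false)) => Some(false),
           (Some(true), Some(true)) => Some(true),
           _ => None,
       }
   }

   /// The predicate from the issue: `a > 1 AND a < 1`.
   fn contradictory_predicate(a: Option<i32>) -> SqlBool {
       sql_and(gt(a, 1), lt(a, 1))
   }

   /// A WHERE clause keeps a row only when the predicate is exactly TRUE.
   fn filter_keeps_row(predicate: SqlBool) -> bool {
       predicate == Some(true)
   }
   ```

   Here `contradictory_predicate(None)` is `None` (NULL) while `contradictory_predicate(Some(5))` is `Some(false)`, yet `filter_keeps_row` returns `false` for both. Outside a filter, for example in a projection, the NULL and false results differ, so the fold would not be safe there for a nullable column.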




[GitHub] [arrow-datafusion] jackwener commented on issue #5830: Simplify predicate expressions like 'a > 1 and a < 1' to constant false

Posted by "jackwener (via GitHub)" <gi...@apache.org>.
jackwener commented on issue #5830:
URL: https://github.com/apache/arrow-datafusion/issues/5830#issuecomment-1494340314

   A related idea: we need an easy-to-use expression rewriting framework for adding rules that rewrite expressions.
   
   cc @mingmwang @alamb 
   




[GitHub] [arrow-datafusion] jackwener commented on issue #5830: Simplify predicate expressions like 'a > 1 and a < 1' to constant false

Posted by "jackwener (via GitHub)" <gi...@apache.org>.
jackwener commented on issue #5830:
URL: https://github.com/apache/arrow-datafusion/issues/5830#issuecomment-1494512805

   > And I'm not sure whether a > 1 and a < 1 can always fold to false, considering a might be NULL
   
   Agreed.
   BTW, this is the same as #1716.




[GitHub] [arrow-datafusion] mingmwang commented on issue #5830: Simplify predicate expressions like 'a > 1 and a < 1' to constant false

Posted by "mingmwang (via GitHub)" <gi...@apache.org>.
mingmwang commented on issue #5830:
URL: https://github.com/apache/arrow-datafusion/issues/5830#issuecomment-1494441196

   And I'm not sure whether `a > 1 and a < 1` can always be folded to `false`, considering that `a` might be `NULL`.




[GitHub] [arrow-datafusion] mingmwang commented on issue #5830: Simplify predicate expressions like 'a > 1 and a < 1' to constant false

Posted by "mingmwang (via GitHub)" <gi...@apache.org>.
mingmwang commented on issue #5830:
URL: https://github.com/apache/arrow-datafusion/issues/5830#issuecomment-1494436387

   > A related idea: might we need an easy-to-use expression rewriting framework for adding rules that rewrite expressions?
   > 
   > like Apache Doris : https://github.com/apache/doris/tree/master/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/expression/rewrite
   > 
   > this issue is resolved by a rule: https://github.com/apache/doris/blob/master/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/expression/rewrite/rules/SimplifyRange.java
   > 
   > cc @mingmwang @alamb
   
   @jackwener 
   I think we already have enough rewriting utils for exprs to implement such rules.




[GitHub] [arrow-datafusion] mingmwang commented on issue #5830: Simplify predicate expressions like 'a > 1 and a < 1' to constant false

Posted by "mingmwang (via GitHub)" <gi...@apache.org>.
mingmwang commented on issue #5830:
URL: https://github.com/apache/arrow-datafusion/issues/5830#issuecomment-1494447713

   > Consider a little bit more complex predicate like 'a > 1 and b > 1 and (a + b) < 2'
   
   We cannot simply fold `a > 1 and b > 1 and (a + b) < 2` to `false`, because `a + b` might overflow the `Int` data type.
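
   A small sketch of the overflow concern, using Rust's `i32` as a stand-in for `Int32`. Whether the engine wraps, errors, or returns NULL on overflow is not stated in this thread, so both behaviours shown here are assumptions, and the function names are hypothetical.

   ```rust
   /// Evaluates `a > 1 AND b > 1 AND (a + b) < 2`, assuming the
   /// addition wraps around on overflow (two's complement).
   fn predicate_with_wrapping_add(a: i32, b: i32) -> bool {
       a > 1 && b > 1 && a.wrapping_add(b) < 2
   }

   /// Same predicate, but returns `None` when the addition overflows,
   /// standing in for an engine that raises an error instead.
   fn predicate_with_checked_add(a: i32, b: i32) -> Option<bool> {
       let sum = a.checked_add(b)?;
       Some(a > 1 && b > 1 && sum < 2)
   }
   ```

   With wrapping arithmetic, `predicate_with_wrapping_add(i32::MAX, 2)` is `true` because the sum wraps to a negative number, so the predicate is not always false. With checked arithmetic the same inputs yield `None`, meaning the original query would fail while a folded `false` would silently succeed. Either way, the fold changes behaviour unless the operands are known to be bounded.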




[GitHub] [arrow-datafusion] doki23 commented on issue #5830: Simplify predicate expressions like 'a > 1 and a < 1' to constant false

Posted by "doki23 (via GitHub)" <gi...@apache.org>.
doki23 commented on issue #5830:
URL: https://github.com/apache/arrow-datafusion/issues/5830#issuecomment-1493254059

   Consider a slightly more complex predicate, such as `a > 1 and b > 1 and (a + b) < 2`.




[GitHub] [arrow-datafusion] alamb commented on issue #5830: Simplify predicate expressions like 'a > 1 and a < 1' to constant false

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #5830:
URL: https://github.com/apache/arrow-datafusion/issues/5830#issuecomment-1497463640

   I think we could probably add this specific type of simplification to the expression simplifier and it would be fine.
   
   I think a more general approach for handling arbitrary predicates with constraints, such as the one @doki23 describes
   
   ```
   a is not null and b is not null and a > 1 and a < 100 and b > 1 and b < 100 and a + b < 2
   ```
   
   is not rewrite rules but rather some form of range analysis (e.g. as described in https://github.com/apache/arrow-datafusion/issues/5535).
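
   A minimal sketch of what range analysis could look like for this example, using plain interval arithmetic on integers. The `Interval` type and its methods are hypothetical and are not taken from DataFusion or from the linked issue. Bounds are held in `i64` so the sum of two `Int32` ranges can be checked for overflow before any conclusion is drawn.

   ```rust
   /// Inclusive integer interval [lo, hi], held in i64 so that sums of
   /// i32 bounds cannot overflow during the analysis itself.
   #[derive(Debug, Clone, Copy, PartialEq)]
   struct Interval {
       lo: i64,
       hi: i64,
   }

   impl Interval {
       /// Values allowed by `x > lower AND x < upper` for an integer `x`.
       fn open(lower: i32, upper: i32) -> Interval {
           Interval { lo: i64::from(lower) + 1, hi: i64::from(upper) - 1 }
       }

       /// Interval containing every possible value of `self + other`.
       fn plus(self, other: Interval) -> Interval {
           Interval { lo: self.lo + other.lo, hi: self.hi + other.hi }
       }

       /// True when the whole interval is representable as i32,
       /// i.e. the addition cannot overflow at run time.
       fn fits_in_i32(self) -> bool {
           self.lo >= i64::from(i32::MIN) && self.hi <= i64::from(i32::MAX)
       }

       /// True when every value in the interval fails `value < bound`.
       fn never_less_than(self, bound: i64) -> bool {
           self.lo >= bound
       }
   }

   /// Analyses `a > 1 AND a < 100 AND b > 1 AND b < 100 AND a + b < 2`.
   fn example_is_always_false() -> bool {
       let a = Interval::open(1, 100); // [2, 99]
       let b = Interval::open(1, 100); // [2, 99]
       let sum = a.plus(b); // [4, 198]
       sum.fits_in_i32() && sum.never_less_than(2)
   }
   ```

   Because both columns are bounded, the sum stays within `[4, 198]`, cannot overflow, and is never below 2, so `example_is_always_false()` returns `true`. The explicit `is not null` conjuncts in the example also remove the NULL case, which is why this version of the predicate is a cleaner candidate for folding than the unbounded one.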

