Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/03/19 15:18:59 UTC

[GitHub] [arrow-datafusion] jackwener opened a new issue #2038: Filter push down rule cause the wrong plan

jackwener opened a new issue #2038:
URL: https://github.com/apache/arrow-datafusion/issues/2038


   **Describe the bug**
   
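   While adding a new optimizer rule, `combine_adjacent_filter`, I found that the filter push down rule produces the wrong plan (see #2026).
   
   The `filter_push_down` rule splits a combined filter expression back into separate filters.
   
   This is a bug in the `filter_push_down` optimizer rule: in the plan below, the second `filter_push_down` pass turns the single `Filter` node into two stacked `Filter` nodes.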
   
   ```sql
   explain verbose select c1, c2 from test where c3 = true and c2 = 0.000001;
   +-------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
   | plan_type                                             | plan                                                                                                                                |
   +-------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
   | initial_logical_plan                                  | Projection: #test.c1, #test.c2                                                                                                      |
   |                                                       |   Filter: #test.c3 = Boolean(true) AND #test.c2 = Float64(0.000001)                                                                 |
   |                                                       |     TableScan: test projection=None                                                                                                 |
   | logical_plan after simplify_expressions               | Projection: #test.c1, #test.c2                                                                                                      |
   |                                                       |   Filter: #test.c3 AND #test.c2 = Float64(0.000001) AS test.c3 = Boolean(true) AND test.c2 = Float64(0.000001)                      |
   |                                                       |     TableScan: test projection=None                                                                                                 |
   | logical_plan after eliminate_filter                   | SAME TEXT AS ABOVE                                                                                                                  |
   | logical_plan after common_sub_expression_eliminate    | SAME TEXT AS ABOVE                                                                                                                  |
   | logical_plan after eliminate_limit                    | SAME TEXT AS ABOVE                                                                                                                  |
   | logical_plan after projection_push_down               | Projection: #test.c1, #test.c2                                                                                                      |
   |                                                       |   Filter: #test.c3 AND #test.c2 = Float64(0.000001) AS test.c3 = Boolean(true) AND test.c2 = Float64(0.000001)                      |
   |                                                       |     TableScan: test projection=Some([0, 1, 2])                                                                                      |
   | logical_plan after filter_push_down                   | Projection: #test.c1, #test.c2                                                                                                      |
   |                                                       |   Filter: #test.c3 AND #test.c2 = Float64(0.000001)                                                                                 |
   |                                                       |     TableScan: test projection=Some([0, 1, 2]), filters=[#test.c3, #test.c2 = Float64(0.000001)]                                    |
   | logical_plan after limit_push_down                    | SAME TEXT AS ABOVE                                                                                                                  |
   | logical_plan after SingleDistinctAggregationToGroupBy | SAME TEXT AS ABOVE                                                                                                                  |
   | logical_plan after ToApproxPerc                       | SAME TEXT AS ABOVE                                                                                                                  |
   | logical_plan after simplify_expressions               | SAME TEXT AS ABOVE                                                                                                                  |
   | logical_plan after eliminate_filter                   | SAME TEXT AS ABOVE                                                                                                                  |
   | logical_plan after common_sub_expression_eliminate    | SAME TEXT AS ABOVE                                                                                                                  |
   | logical_plan after eliminate_limit                    | SAME TEXT AS ABOVE                                                                                                                  |
   | logical_plan after projection_push_down               | SAME TEXT AS ABOVE                                                                                                                  |
   | logical_plan after filter_push_down                   | Projection: #test.c1, #test.c2                                                                                                      |
   |                                                       |   Filter: #test.c3                                                                                                                  |
   |                                                       |     Filter: #test.c2 = Float64(0.000001)                                                                                            |
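   |                                                       |       TableScan: test projection=Some([0, 1, 2]), filters=[#test.c3, #test.c2 = Float64(0.000001)]                                  |
   +-------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+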
   ```
   
   **To Reproduce**
   ```sql
   create external table test (
   c1 float,
   c2 double,
   c3 boolean
   )
   stored as csv
   with header row
   location '<!!! YOUR PATH !!!>/datafusion/tests/aggregate_simple.csv';
   ```
   ```sql
   explain select c1, c2 from test where c3 = true and c2 = 0.000001;
   +---------------+-------------------------------------------------------------------------------------------------------------------------------------+
   | plan_type     | plan                                                                                                                                |
   +---------------+-------------------------------------------------------------------------------------------------------------------------------------+
   | logical_plan  | Projection: #test.c1, #test.c2                                                                                                      |
   |               |   Filter: #test.c3                                                                                                                  |
   |               |     Filter: #test.c2 = Float64(0.000001)                                                                                            |
   |               |       TableScan: test projection=Some([0, 1, 2]), filters=[#test.c3, #test.c2 = Float64(0.000001)]                                  |
   | physical_plan | ProjectionExec: expr=[c1@0 as c1, c2@1 as c2]                                                                                       |
   |               |   CoalesceBatchesExec: target_batch_size=4096                                                                                       |
   |               |     FilterExec: c3@2                                                                                                                |
   |               |       CoalesceBatchesExec: target_batch_size=4096                                                                                   |
   |               |         FilterExec: c2@1 = 0.000001                                                                                                 |
   |               |           RepartitionExec: partitioning=RoundRobinBatch(8)                                                                          |
   |               |             CsvExec: files=[/home/jakevin/code/arrow-datafusion/datafusion/tests/aggregate_simple.csv], has_header=true, limit=None |
   |               |                                                                                                                                     |
   +---------------+-------------------------------------------------------------------------------------------------------------------------------------+
   ```
   
   
   **Expected behavior**
   The optimizer should not split the combined filter expression into separate `Filter` nodes.
   
   **Additional context**
   None
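
To make the expected behavior concrete, here is a toy model of the two plan shapes. It is only a sketch and is not DataFusion code: `Pred`, `And`, `Scan`, `Filter`, `split_conjunction`, `one_filter_per_predicate`, and `keep_combined` are all hypothetical names invented for illustration, and the sketch assumes the rule first breaks the predicate into its `AND` operands and then rebuilds the plan from them.

```python
from dataclasses import dataclass
from functools import reduce
from typing import List, Union


@dataclass(frozen=True)
class Pred:
    text: str  # opaque leaf predicate, e.g. "#test.c3"


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


Expr = Union[Pred, And]


@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Filter:
    predicate: Expr
    input: "Plan"


Plan = Union[Scan, Filter]


def split_conjunction(expr: Expr) -> List[Expr]:
    """Flatten nested ANDs into a left-to-right list of operands."""
    if isinstance(expr, And):
        return split_conjunction(expr.left) + split_conjunction(expr.right)
    return [expr]


def one_filter_per_predicate(preds: List[Expr], child: Plan) -> Plan:
    """The reported shape: every operand gets its own Filter node."""
    plan = child
    for pred in reversed(preds):
        plan = Filter(pred, plan)
    return plan


def keep_combined(preds: List[Expr], child: Plan) -> Plan:
    """The expected shape: one Filter whose predicate is the re-joined AND.

    Assumes preds is non-empty.
    """
    return Filter(reduce(And, preds), child)


def show_expr(expr: Expr) -> str:
    if isinstance(expr, And):
        return f"{show_expr(expr.left)} AND {show_expr(expr.right)}"
    return expr.text


def show(plan: Plan, indent: int = 0) -> str:
    pad = "  " * indent
    if isinstance(plan, Filter):
        return f"{pad}Filter: {show_expr(plan.predicate)}\n" + show(plan.input, indent + 1)
    return f"{pad}TableScan: {plan.table}"


predicate = And(Pred("#test.c3"), Pred("#test.c2 = Float64(0.000001)"))
preds = split_conjunction(predicate)

print(show(one_filter_per_predicate(preds, Scan("test"))))  # two stacked Filter nodes
print(show(keep_combined(preds, Scan("test"))))             # a single Filter node
```

In this model the first call reproduces the stacked `Filter: #test.c3` over `Filter: #test.c2 = Float64(0.000001)` shape shown in the output above, while the second keeps a single `Filter` with the full `AND` predicate, which is what this issue asks for.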


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb closed issue #2038: Filter push down rule cause the wrong plan

Posted by GitBox <gi...@apache.org>.
alamb closed issue #2038:
URL: https://github.com/apache/arrow-datafusion/issues/2038


   

