Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/02/01 17:06:03 UTC

[GitHub] [arrow-datafusion] tustvold opened a new issue #1724: Optimize Out Redundant Filter Expressions

tustvold opened a new issue #1724:
URL: https://github.com/apache/arrow-datafusion/issues/1724


   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   
   DataFusion does not appear to eliminate `FilterExec` nodes whose predicate is a constant.
   
   ```
   > explain select * from system.chunks where false;
   +---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | plan_type     | plan                                                                                                                                                                                                                                                                                                                                                                                                                            |
   +---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | logical_plan  | Projection: #system.chunks.id, #system.chunks.partition_key, #system.chunks.table_name, #system.chunks.storage, #system.chunks.lifecycle_action, #system.chunks.memory_bytes, #system.chunks.object_store_bytes, #system.chunks.row_count, #system.chunks.time_of_last_access, #system.chunks.time_of_first_write, #system.chunks.time_of_last_write, #system.chunks.order                                                      |
   |               |   Filter: Boolean(false)                                                                                                                                                                                                                                                                                                                                                                                                        |
   |               |     TableScan: system.chunks projection=Some([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])                                                                                                                                                                                                                                                                                                                                            |
   | physical_plan | ProjectionExec: expr=[id@0 as id, partition_key@1 as partition_key, table_name@2 as table_name, storage@3 as storage, lifecycle_action@4 as lifecycle_action, memory_bytes@5 as memory_bytes, object_store_bytes@6 as object_store_bytes, row_count@7 as row_count, time_of_last_access@8 as time_of_last_access, time_of_first_write@9 as time_of_first_write, time_of_last_write@10 as time_of_last_write, order@11 as order] |
   |               |   CoalesceBatchesExec: target_batch_size=500                                                                                                                                                                                                                                                                                                                                                                                    |
   |               |     FilterExec: false                                                                                                                                                                                                                                                                                                                                                                                                           |
   |               |       RepartitionExec: partitioning=RoundRobinBatch(28)                                                                                                                                                                                                                                                                                                                                                                         |
   |               |         MemoryExec: partitions=1, partition_sizes=[1]                                                                                                                                                                                                                                                                                                                                                                           |
   |               |                                                                                                                                                                                                                                                                                                                                                                                                                                 |
   +---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   ```
   
   Whilst a user is unlikely to write such a query directly, the same plan can arise when constant propagation collapses a boolean expression to a constant.
   
   **Describe the solution you'd like**
   
   The logical plan should be optimized to remove `Filter: Boolean(true)` whilst preserving its children, and to remove `Filter: Boolean(false)` and all its children.
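
The rewrite requested above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Plan`, `Predicate`, `Plan::Empty`, and `eliminate_constant_filters` are hypothetical names invented for this example and are not DataFusion's actual types or API.

```rust
// Hypothetical, simplified plan representation for illustration only;
// this is NOT DataFusion's `LogicalPlan`.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    TableScan { table: String },
    Filter { predicate: Predicate, input: Box<Plan> },
    Projection { columns: Vec<String>, input: Box<Plan> },
    /// Stand-in for a node that produces no rows (assumed for this sketch).
    Empty,
}

// Hypothetical predicate type: either a boolean constant or something
// that depends on the data.
#[derive(Debug, Clone, PartialEq)]
enum Predicate {
    Literal(bool),
    Column(String),
}

/// Recursively removes filters whose predicate is a constant:
/// - `Filter(true)` is dropped and its input is kept;
/// - `Filter(false)` is replaced, together with its input, by `Plan::Empty`.
fn eliminate_constant_filters(plan: Plan) -> Plan {
    match plan {
        // The filter rejects every row, so the subtree below it is never needed.
        Plan::Filter { predicate: Predicate::Literal(false), .. } => Plan::Empty,
        // The filter accepts every row, so only its (optimized) input remains.
        Plan::Filter { predicate: Predicate::Literal(true), input } => {
            eliminate_constant_filters(*input)
        }
        // Non-constant filter: keep it and optimize beneath it.
        Plan::Filter { predicate, input } => Plan::Filter {
            predicate,
            input: Box::new(eliminate_constant_filters(*input)),
        },
        Plan::Projection { columns, input } => Plan::Projection {
            columns,
            input: Box::new(eliminate_constant_filters(*input)),
        },
        // Leaves are returned unchanged.
        leaf => leaf,
    }
}
```

In this sketch, a projection over `Filter(false)` over a table scan becomes a projection over `Plan::Empty`. A real optimizer would presumably also need the empty node to carry the schema of the subtree it replaces, which the sketch omits.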
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on issue #1724: Optimize Out Redundant Filter Expressions

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1724:
URL: https://github.com/apache/arrow-datafusion/issues/1724#issuecomment-1027263133


   I suggest checking for a constant `false` or `null` in chains of `AND`s.
   
   I think this could be added to the `expression_simplfy.rs` pass fairly simply.
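
One way to read the suggestion above is sketched below. It is a rough illustration, not the actual simplification pass: `Expr`, `can_never_be_true`, and `simplify_filter_predicate` are hypothetical names. It also assumes the expression is used as a filter predicate, where a `NULL` result rejects the row just as `false` does; outside a filter, `x AND NULL` is not equivalent to `false`.

```rust
// Hypothetical expression type for illustration only;
// this is NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    /// Boolean constant; `None` stands for SQL NULL.
    Boolean(Option<bool>),
    And(Box<Expr>, Box<Expr>),
}

/// Returns true if `expr`, viewed as a chain of ANDs, contains a conjunct
/// that is the constant `false` or NULL. Such a chain can never evaluate
/// to `true`, so a filter using it passes no rows.
fn can_never_be_true(expr: &Expr) -> bool {
    match expr {
        Expr::Boolean(Some(false)) | Expr::Boolean(None) => true,
        Expr::And(left, right) => can_never_be_true(left) || can_never_be_true(right),
        _ => false,
    }
}

/// Collapses a filter predicate to constant `false` when it can never be
/// true. Valid only in a filter context (assumption stated above).
fn simplify_filter_predicate(expr: Expr) -> Expr {
    if can_never_be_true(&expr) {
        Expr::Boolean(Some(false))
    } else {
        expr
    }
}
```

Once the predicate has collapsed to a constant `false`, removing the filter and its input becomes a separate plan-level step.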





[GitHub] [arrow-datafusion] alamb edited a comment on issue #1724: Optimize Out Redundant Filter Expressions

Posted by GitBox <gi...@apache.org>.
alamb edited a comment on issue #1724:
URL: https://github.com/apache/arrow-datafusion/issues/1724#issuecomment-1027263133









[GitHub] [arrow-datafusion] yjshen closed issue #1724: Optimize Out Redundant Filter Expressions

Posted by GitBox <gi...@apache.org>.
yjshen closed issue #1724:
URL: https://github.com/apache/arrow-datafusion/issues/1724


   

