Posted to jira@arrow.apache.org by "Andy Grove (Jira)" <ji...@apache.org> on 2020/11/22 17:23:00 UTC
[jira] [Closed] (ARROW-10686) [Rust] [DataFusion] Combine conjunctive filters
[ https://issues.apache.org/jira/browse/ARROW-10686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andy Grove closed ARROW-10686.
------------------------------
Resolution: Invalid
This is already implemented.
> [Rust] [DataFusion] Combine conjunctive filters
> -----------------------------------------------
>
> Key: ARROW-10686
> URL: https://issues.apache.org/jira/browse/ARROW-10686
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Rust - DataFusion
> Reporter: Andy Grove
> Priority: Minor
>
> When using the DataFrame API, it is natural to chain together filter operations like this:
> {code:java}
> .filter(col("l_commitdate").lt(col("l_receiptdate")))?
> .filter(col("l_shipdate").lt(col("l_commitdate")))?
> .filter(col("l_receiptdate").gt_eq(lit("1994-01-01")))?
> .filter(col("l_receiptdate").lt(lit("1995-01-01")))?{code}
> This results in the following plan:
> {code:java}
> Filter: #l_receiptdate Lt Utf8("1995-01-01")
> Filter: #l_receiptdate GtEq Utf8("1994-01-01")
> Filter: #l_shipdate Lt #l_commitdate
> Filter: #l_commitdate Lt #l_receiptdate{code}
> We could implement an optimizer rule that combines these into a single filter:
> {code:java}
> Filter: #l_receiptdate Lt Utf8("1995-01-01") AND #l_receiptdate GtEq Utf8("1994-01-01") AND #l_shipdate Lt #l_commitdate AND #l_commitdate Lt #l_receiptdate {code}
> This would make the plan more concise and may reduce the overhead of running several separate filter operators.
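
The rewrite proposed in the description can be sketched on a toy plan model. This is only an illustration under stated assumptions: the `Expr` and `Plan` types and the `pred`, `filter`, and `combine_filters` helpers below are hypothetical and are not DataFusion's types or its optimizer API, and the sketch says nothing about how the existing DataFusion implementation works.

```rust
// Toy model of a logical plan. These types are hypothetical stand-ins
// for illustration only; they are NOT DataFusion's own types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// A leaf predicate, kept as text for brevity.
    Pred(String),
    /// Conjunction of two expressions.
    And(Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Leaf node that reads a named table.
    Scan(String),
    /// Keeps only the rows of `input` that satisfy `predicate`.
    Filter { predicate: Expr, input: Box<Plan> },
}

/// Hypothetical helper: build a leaf predicate from text.
fn pred(text: &str) -> Expr {
    Expr::Pred(text.to_string())
}

/// Hypothetical helper: wrap a plan in a filter node.
fn filter(predicate: Expr, input: Plan) -> Plan {
    Plan::Filter { predicate, input: Box::new(input) }
}

/// Collapse every chain of directly nested filters into a single filter
/// whose predicate is the conjunction of the chained predicates.
fn combine_filters(plan: Plan) -> Plan {
    match plan {
        Plan::Filter { predicate, input } => match combine_filters(*input) {
            // The rewritten child is itself a filter: merge it into this one,
            // keeping the outer predicate first.
            Plan::Filter { predicate: inner, input: child } => Plan::Filter {
                predicate: Expr::And(Box::new(predicate), Box::new(inner)),
                input: child,
            },
            // The child is not a filter: keep this filter as it is.
            other => Plan::Filter { predicate, input: Box::new(other) },
        },
        // Scans have no children, so there is nothing to rewrite.
        scan => scan,
    }
}
```

For example, `combine_filters(filter(pred("a"), filter(pred("b"), Plan::Scan("t".to_string()))))` yields one `Filter` over the scan with predicate `a AND b`. The outer predicate is placed first so that the combined expression follows the same order as the single-filter plan shown in the description.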