Posted to jira@arrow.apache.org by "Andy Grove (Jira)" <ji...@apache.org> on 2020/11/22 17:23:00 UTC

[jira] [Closed] (ARROW-10686) [Rust] [DataFusion] Combine conjunctive filters

     [ https://issues.apache.org/jira/browse/ARROW-10686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Grove closed ARROW-10686.
------------------------------
    Resolution: Invalid

This is already implemented.

> [Rust] [DataFusion] Combine conjunctive filters
> -----------------------------------------------
>
>                 Key: ARROW-10686
>                 URL: https://issues.apache.org/jira/browse/ARROW-10686
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Rust - DataFusion
>            Reporter: Andy Grove
>            Priority: Minor
>
> When using the DataFrame API, it is natural to chain together filter operations like this:
> {code:java}
> .filter(col("l_commitdate").lt(col("l_receiptdate")))?
> .filter(col("l_shipdate").lt(col("l_commitdate")))?
> .filter(col("l_receiptdate").gt_eq(lit("1994-01-01")))?
> .filter(col("l_receiptdate").lt(lit("1995-01-01")))?{code}
> This results in the following plan:
> {code:java}
>     Filter: #l_receiptdate Lt Utf8("1995-01-01")
>       Filter: #l_receiptdate GtEq Utf8("1994-01-01")
>         Filter: #l_shipdate Lt #l_commitdate
>           Filter: #l_commitdate Lt #l_receiptdate{code}
> We could implement an optimizer rule that combines these into a single filter:
> {code:java}
> Filter: #l_receiptdate Lt Utf8("1995-01-01") AND #l_receiptdate GtEq Utf8("1994-01-01") AND #l_shipdate Lt #l_commitdate AND #l_commitdate Lt #l_receiptdate{code}
> This would produce a more concise plan and might reduce overhead, since all four predicates would be evaluated by a single filter operator instead of four nested ones.
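
For illustration only, the rewrite described above can be sketched on a toy plan representation. This is not DataFusion's implementation: the `Expr` and `Plan` enums, the helper constructors, `combine_filters`, and the `lineitem` table name are all hypothetical stand-ins, chosen so the example compiles with the standard library alone.

```rust
use std::fmt;

// Hypothetical, simplified stand-ins for a query engine's expression and
// plan types. They are NOT DataFusion's actual definitions.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(String),
    Lt(Box<Expr>, Box<Expr>),
    GtEq(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Filter { predicate: Expr, input: Box<Plan> },
}

// Print expressions in the same style as the plans quoted in the issue.
impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Expr::Column(name) => write!(f, "#{}", name),
            Expr::Literal(value) => write!(f, "Utf8(\"{}\")", value),
            Expr::Lt(l, r) => write!(f, "{} Lt {}", l, r),
            Expr::GtEq(l, r) => write!(f, "{} GtEq {}", l, r),
            Expr::And(l, r) => write!(f, "{} AND {}", l, r),
        }
    }
}

// Small constructors to keep the example readable.
fn col(name: &str) -> Expr { Expr::Column(name.to_string()) }
fn lit(value: &str) -> Expr { Expr::Literal(value.to_string()) }
fn lt(l: Expr, r: Expr) -> Expr { Expr::Lt(Box::new(l), Box::new(r)) }
fn gt_eq(l: Expr, r: Expr) -> Expr { Expr::GtEq(Box::new(l), Box::new(r)) }
fn filter(input: Plan, predicate: Expr) -> Plan {
    Plan::Filter { predicate, input: Box::new(input) }
}

/// Collapse a chain of directly nested filters into one conjunctive filter.
fn combine_filters(plan: Plan) -> Plan {
    match plan {
        Plan::Filter { predicate, input } => match combine_filters(*input) {
            // The (already combined) child is a filter: AND both predicates
            // and attach the result to the child's input.
            Plan::Filter { predicate: inner, input } => Plan::Filter {
                predicate: Expr::And(Box::new(predicate), Box::new(inner)),
                input,
            },
            // The child is not a filter: keep this filter as it is.
            other => Plan::Filter { predicate, input: Box::new(other) },
        },
        other => other,
    }
}

/// Build the four chained filters from the issue. The first `.filter(...)`
/// call in the DataFrame chain becomes the innermost plan node.
fn example_plan() -> Plan {
    let scan = Plan::Scan("lineitem".to_string()); // table name is assumed
    let p = filter(scan, lt(col("l_commitdate"), col("l_receiptdate")));
    let p = filter(p, lt(col("l_shipdate"), col("l_commitdate")));
    let p = filter(p, gt_eq(col("l_receiptdate"), lit("1994-01-01")));
    filter(p, lt(col("l_receiptdate"), lit("1995-01-01")))
}
```

Calling `combine_filters(example_plan())` in this sketch yields a single `Filter` node directly above the scan, with the predicates joined by `AND` in outermost-to-innermost order, matching the combined filter shown above. A real optimizer rule would also have to recurse through other plan node types, which this toy `Plan` does not model.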



--
This message was sent by Atlassian Jira
(v8.3.4#803005)