Posted to jira@arrow.apache.org by "Ruan Pearce-Authers (Jira)" <ji...@apache.org> on 2020/12/13 16:46:00 UTC

[jira] [Commented] (ARROW-9828) [Rust] [DataFusion] TableProvider trait should support predicate push-down

    [ https://issues.apache.org/jira/browse/ARROW-9828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17248615#comment-17248615 ] 

Ruan Pearce-Authers commented on ARROW-9828:
--------------------------------------------

I'm currently taking a look at a workable implementation of this. So far, my design is:
 * TableProvider gets a new method to test whether a given filter expr can be pushed down to the provider implementation. Possible results are:
 ** No - the filter is not pushed down.
 ** Yes, exact - the filter is added to the provider scan, and the provider guarantees that every returned row satisfies the filter predicate, so the original Filter node is removed.
 ** Yes, inexact - the filter is added to the provider scan, but the provider only uses it as a hint to minimise retrieved data (i.e. some tuples in the result set may still fail the filter criteria), so the original Filter node is preserved.
 * LogicalPlan::TableScan gets an optional vec of filter expressions
 * The filter pushdown pass is updated to call this new TableProvider method when processing TableScan nodes. Filters with exact support are removed from the plan entirely, and each TableScan node is rewritten to carry both its exact and inexact filters so they are available when the execution plan is created.
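
To make the three outcomes concrete, here is a rough, self-contained sketch of how the classification and the pushdown pass could fit together. Everything in it is an assumption for illustration only: Expr is a string stand-in for the real expression type, and FilterPushDown, supports_filter_pushdown, PushDownResult, push_down_filters and ToyProvider are hypothetical names, not DataFusion's actual API.

```rust
/// Hypothetical stand-in for a filter expression; a real implementation
/// would use the engine's expression tree instead of a string.
#[derive(Debug, Clone, PartialEq)]
struct Expr(String);

/// The three possible answers a provider can give for a filter (names assumed).
#[derive(Debug, Clone, Copy, PartialEq)]
enum FilterPushDown {
    /// "No": the filter is not pushed down.
    Unsupported,
    /// "Yes, inexact": pushed down as a hint; the Filter node must stay.
    Inexact,
    /// "Yes, exact": pushed down and fully enforced; the Filter node can go.
    Exact,
}

/// Simplified trait showing only the new (hypothetical) method.
trait TableProvider {
    fn supports_filter_pushdown(&self, filter: &Expr) -> FilterPushDown;
}

/// Outcome of the pass for one TableScan node.
struct PushDownResult {
    /// Filters attached to the scan (exact and inexact).
    scan_filters: Vec<Expr>,
    /// Filters that must remain in a Filter node above the scan.
    remaining_filters: Vec<Expr>,
}

/// Splits the filters above a scan according to the provider's answers.
fn push_down_filters(provider: &dyn TableProvider, filters: Vec<Expr>) -> PushDownResult {
    let mut scan_filters = Vec::new();
    let mut remaining_filters = Vec::new();
    for filter in filters {
        match provider.supports_filter_pushdown(&filter) {
            FilterPushDown::Unsupported => remaining_filters.push(filter),
            FilterPushDown::Inexact => {
                // Give the provider the hint, but keep re-checking afterwards.
                scan_filters.push(filter.clone());
                remaining_filters.push(filter);
            }
            FilterPushDown::Exact => scan_filters.push(filter),
        }
    }
    PushDownResult { scan_filters, remaining_filters }
}

/// Toy provider: exact on `id`, inexact on `ts`, nothing else supported.
struct ToyProvider;

impl TableProvider for ToyProvider {
    fn supports_filter_pushdown(&self, filter: &Expr) -> FilterPushDown {
        if filter.0.starts_with("id ") {
            FilterPushDown::Exact
        } else if filter.0.starts_with("ts ") {
            FilterPushDown::Inexact
        } else {
            FilterPushDown::Unsupported
        }
    }
}
```

With the toy provider, calling push_down_filters on the filters "id = 1", "ts > 5" and "name LIKE 'a%'" attaches the first two to the scan and keeps the last two in the Filter node, so only the exact filter disappears from the plan.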

If anyone stumbles across this and has any massive concerns with this implementation, or suggestions for improvements, let me know! Otherwise, I'll put up a PR once I have something functional.

> [Rust] [DataFusion] TableProvider trait should support predicate push-down
> --------------------------------------------------------------------------
>
>                 Key: ARROW-9828
>                 URL: https://issues.apache.org/jira/browse/ARROW-9828
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Rust, Rust - DataFusion
>            Reporter: Andy Grove
>            Priority: Major
>
> TableProvider trait should support predicate push-down, so that predicates can be applied to custom storage implementations.
> I suggest looking at Apache Spark's org.apache.spark.sql.connector.read.SupportsPushDownFilters trait for inspiration.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)