Posted to jira@arrow.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/12/10 23:03:00 UTC

[jira] [Updated] (ARROW-10322) [C++][Dataset] Minimize Expression to a wrapper around compute::Function

     [ https://issues.apache.org/jira/browse/ARROW-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ARROW-10322:
-----------------------------------
    Labels: pull-request-available  (was: )

> [C++][Dataset] Minimize Expression to a wrapper around compute::Function
> ------------------------------------------------------------------------
>
>                 Key: ARROW-10322
>                 URL: https://issues.apache.org/jira/browse/ARROW-10322
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>    Affects Versions: 1.0.1
>            Reporter: Ben Kietzman
>            Assignee: Ben Kietzman
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.0.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The Expression class hierarchy was originally intended to provide generic, structured representations of compute functionality. In that role it has been superseded by compute::{Function, Kernel, ...}, which encapsulate validation and execution. In light of this, Expression can be drastically simplified and improved by composition with these classes. Each responsibility which can be deferred to them means less boilerplate when exposing a new compute function for use in datasets. Ideally, any compute function will be immediately available for use in a filter or projection.
> {code}
> struct Expression {
>   using Literal = std::shared_ptr<Scalar>;
>   struct Projection {
>     std::vector<std::string> names;
>     std::vector<Expression> values;
>   };
>   struct Call {
>     std::shared_ptr<ScalarFunction> function;
>     std::shared_ptr<FunctionOptions> options;
>     std::vector<Expression> arguments;
>   };
>   util::variant<Literal, FieldRef, Projection, Call> value;
> };
> {code}
> A simple discriminated union as above should be sufficient to represent arbitrary filters and projections: any expression which results in type {{bool}} is a valid filter, and any expression which is a {{Projection}} may be used to map one record batch to another.
> Expression simplification (currently implemented in {{Expression::Assume}}) is an optimization used, for example, in predicate pushdown, and therefore need not exhaustively cover the full space of available compute functions.
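
To make the proposed shape concrete, here is a minimal standalone sketch of such a discriminated union in C++17. It is not Arrow code: Scalar, Row, Function, Evaluate, Project and Greater are hypothetical stand-ins invented for illustration, std::variant is used in place of util::variant, a literal is held by value rather than through a shared_ptr, and FunctionOptions is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for arrow::Scalar and a single row of a record batch.
using Scalar = std::variant<std::int64_t, bool>;
using Row = std::map<std::string, Scalar>;

// Hypothetical stand-in for compute::ScalarFunction: maps argument scalars to a scalar.
using Function = std::function<Scalar(const std::vector<Scalar>&)>;

struct Expression;  // forward declaration so Projection and Call can hold sub-expressions

using Literal = Scalar;
struct FieldRef { std::string name; };
struct Projection {
  std::vector<std::string> names;
  std::vector<Expression> values;
};
struct Call {
  std::shared_ptr<Function> function;
  std::vector<Expression> arguments;
};
struct Expression {
  std::variant<Literal, FieldRef, Projection, Call> value;
};

// Evaluate a literal, field reference, or call against one row.
Scalar Evaluate(const Expression& expr, const Row& row) {
  if (auto* lit = std::get_if<Literal>(&expr.value)) return *lit;
  if (auto* ref = std::get_if<FieldRef>(&expr.value)) return row.at(ref->name);
  if (auto* call = std::get_if<Call>(&expr.value)) {
    std::vector<Scalar> args;
    for (const Expression& arg : call->arguments) args.push_back(Evaluate(arg, row));
    return (*call->function)(args);
  }
  throw std::invalid_argument("a Projection does not evaluate to a single scalar");
}

// Map one row to another using an expression that holds a Projection.
Row Project(const Expression& expr, const Row& row) {
  const auto& proj = std::get<Projection>(expr.value);  // throws if not a Projection
  assert(proj.names.size() == proj.values.size());
  Row out;
  for (std::size_t i = 0; i < proj.names.size(); ++i) {
    out[proj.names[i]] = Evaluate(proj.values[i], row);
  }
  return out;
}

// Build a call to a hypothetical "greater" function over two int64 arguments.
Expression Greater(Expression lhs, Expression rhs) {
  static const auto fn = std::make_shared<Function>(
      [](const std::vector<Scalar>& a) -> Scalar {
        return std::get<std::int64_t>(a[0]) > std::get<std::int64_t>(a[1]);
      });
  return Expression{Call{fn, {std::move(lhs), std::move(rhs)}}};
}
```

With a row in which x is 5, Evaluate(Greater(Expression{FieldRef{"x"}}, Expression{Literal{std::int64_t{3}}}), row) yields a bool scalar, so that expression qualifies as a filter in the sense described above. Wrapping it in a Projection and passing it to Project maps the row to a new row. Because Call only stores a function pointer and its arguments, adding a new function in this sketch requires no new Expression subclass.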



--
This message was sent by Atlassian Jira
(v8.3.4#803005)