Posted to jira@arrow.apache.org by "Andrew Lamb (Jira)" <ji...@apache.org> on 2021/02/18 15:06:00 UTC

[jira] [Assigned] (ARROW-11689) [Rust][DataFusion] Reduce copies in DataFusion LogicalPlan and Expr creation

     [ https://issues.apache.org/jira/browse/ARROW-11689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Lamb reassigned ARROW-11689:
-----------------------------------

    Assignee: Andrew Lamb

> [Rust][DataFusion] Reduce copies in DataFusion LogicalPlan and Expr creation
> ----------------------------------------------------------------------------
>
>                 Key: ARROW-11689
>                 URL: https://issues.apache.org/jira/browse/ARROW-11689
>             Project: Apache Arrow
>          Issue Type: New Feature
>            Reporter: Andrew Lamb
>            Assignee: Andrew Lamb
>            Priority: Major
>
> The theme of this overall epic is to make the plan and expression rewriting phases of DataFusion more efficient by avoiding copies, leveraging the Rust type system.
> Benefits:
> * More standard / idiomatic Rust usage
> * Faster / more efficient (I don't have numbers to back this up)
> Downsides:
> * These will be backwards-incompatible changes
> h1. Background
> Many things in DataFusion look like
> Input --transformation--> output
> And the input is not used again. In Rust, you can model this by giving ownership to the transformation.
> At a high level, the idea is to avoid so much cloning in DataFusion.
> The basic principle is if the function needs to `clone` one of its arguments, the caller should be given the choice of when to do that. Often, the caller can give up ownership without issue
> I envision at least the following items:
> 1. Optimizer passes that take `&LogicalPlan` and produce a new `LogicalPlan` even though most callsites do not need the original
> 2. Expr builder calls that take `&Expr` and return a new `Expr`
> 3. An expression rewriter (TODO) while running down optimizer passes
> I think this style takes advantage of Rust's ownership model and will let us avoid a lot of copying and allocations and avoid the need for something like slab allocators
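
The principle described above (let the caller decide when to `clone`) can be illustrated with a minimal sketch. The `Expr` and `LogicalPlan` enums here are hypothetical, heavily simplified stand-ins, not DataFusion's actual types, and `not_borrowed`, `not_owned`, `simplify_expr`, and `optimize` are names invented for this illustration only.

```rust
// Hypothetical, simplified stand-ins for DataFusion's `Expr` and `LogicalPlan`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Not(Box<Expr>),
}

#[derive(Debug, Clone, PartialEq)]
enum LogicalPlan {
    Scan { table: String },
    Filter { predicate: Expr, input: Box<LogicalPlan> },
}

// Borrowing style: the builder only has a reference, so it must clone
// the expression it wants to keep, even if the caller no longer needs it.
fn not_borrowed(e: &Expr) -> Expr {
    Expr::Not(Box::new(e.clone()))
}

// Owning style: the argument is moved into the result with no clone.
// A caller that still needs the original can call `not_owned(e.clone())`.
fn not_owned(e: Expr) -> Expr {
    Expr::Not(Box::new(e))
}

// Toy rewrite that consumes its input: NOT(NOT(x)) => x.
// Sub-expressions are moved out of the old tree, never copied.
fn simplify_expr(e: Expr) -> Expr {
    match e {
        Expr::Not(inner) => match *inner {
            Expr::Not(x) => simplify_expr(*x),
            other => Expr::Not(Box::new(simplify_expr(other))),
        },
        other => other,
    }
}

// Toy "optimizer pass" that takes the plan by value and returns the
// rewritten plan, reusing the pieces of the input.
fn optimize(plan: LogicalPlan) -> LogicalPlan {
    match plan {
        LogicalPlan::Filter { predicate, input } => LogicalPlan::Filter {
            predicate: simplify_expr(predicate),
            input: Box::new(optimize(*input)),
        },
        scan => scan,
    }
}
```

With the owning signatures, a call such as `optimize(plan)` moves the plan into the pass; a caller that still needs the original afterwards writes `optimize(plan.clone())`, so the cost of the copy is visible at the call site rather than hidden inside the function.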



--
This message was sent by Atlassian Jira
(v8.3.4#803005)