Posted to jira@arrow.apache.org by "Apache Arrow JIRA Bot (Jira)" <ji...@apache.org> on 2022/10/07 17:52:00 UTC

[jira] [Assigned] (ARROW-17023) [C++] Add initial Acero design documents

     [ https://issues.apache.org/jira/browse/ARROW-17023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Arrow JIRA Bot reassigned ARROW-17023:
---------------------------------------------

    Assignee:     (was: Weston Pace)

> [C++] Add initial Acero design documents
> ----------------------------------------
>
>                 Key: ARROW-17023
>                 URL: https://issues.apache.org/jira/browse/ARROW-17023
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Weston Pace
>            Priority: Major
>
> As Acero grows in complexity, it will become difficult for new developers to contribute meaningfully.  In addition, Acero should be open to extension by third-party developers who wish to add new exec nodes.  These third-party developers will need to know how Acero schedules work and how it operates, and will appreciate advice on efficient development.  At a minimum, this first pass should explain:
>  * Threading / Scheduling model for Acero (note, there are proposals to enhance the model we currently have)
>  * Discussion of batch sizes and cache sizes and the morsel / batch model
>  * General discussion / advice for writing operators in a column-major way
>  * Design of the current nodes, in particular more detail on how expression evaluation happens and how the hash-join node operates



--
This message was sent by Atlassian Jira
(v8.20.10#820010)