Posted to jira@arrow.apache.org by "Jorge (Jira)" <ji...@apache.org> on 2020/09/25 19:42:00 UTC

[jira] [Commented] (ARROW-9753) [Rust] [DataFusion] Remove the use of Mutex in ExecutionPlan trait

    [ https://issues.apache.org/jira/browse/ARROW-9753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17202396#comment-17202396 ] 

Jorge commented on ARROW-9753:
------------------------------

Isn't it possible to replace {{Arc<Mutex<dyn RecordBatchReader>>}} with {{Box<dyn RecordBatchReader>}}? Maybe this is not a good idea for other reasons (e.g. we can't share on batches), but reading the current code, the RecordBatchIterator is always created inside the thread, and what needs to be Arc + Send + Sync is the ExecutionPlan itself, since it is what crosses thread spawn boundaries (see e.g. MergeExec::execute).

I did a quick POC locally, and I was able to compile and have the tests run with the change above.
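
A minimal sketch of the idea, not the actual POC: the traits and types below ({{RecordBatch}} as a plain Vec, {{RangeReader}}, {{RangeExec}}, {{collect_merged}}) are hypothetical stand-ins rather than the real Arrow/DataFusion definitions. It only illustrates that the plan is the thing shared across threads via Arc, while each reader is an owned Box created and consumed inside its own thread, so no Mutex is needed.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for an Arrow record batch.
type RecordBatch = Vec<i32>;

// Hypothetical, simplified reader trait: advancing it requires &mut self.
trait RecordBatchReader {
    fn next_batch(&mut self) -> Option<RecordBatch>;
}

// The plan itself is Send + Sync because it crosses thread spawn boundaries.
trait ExecutionPlan: Send + Sync {
    fn partitions(&self) -> usize;
    // Returns an owned reader: no Arc, no Mutex.
    fn execute(&self, partition: usize) -> Box<dyn RecordBatchReader>;
}

// Toy reader that yields one single-row batch per integer in [next, end).
struct RangeReader {
    next: i32,
    end: i32,
}

impl RecordBatchReader for RangeReader {
    fn next_batch(&mut self) -> Option<RecordBatch> {
        if self.next >= self.end {
            return None;
        }
        let batch = vec![self.next];
        self.next += 1;
        Some(batch)
    }
}

// Toy plan with a fixed number of partitions.
struct RangeExec {
    partitions: usize,
    rows_per_partition: i32,
}

impl ExecutionPlan for RangeExec {
    fn partitions(&self) -> usize {
        self.partitions
    }

    fn execute(&self, partition: usize) -> Box<dyn RecordBatchReader> {
        let start = partition as i32 * self.rows_per_partition;
        Box::new(RangeReader { next: start, end: start + self.rows_per_partition })
    }
}

// Merge-style driver: one thread per partition. Only the Arc'd plan moves
// into each thread; the boxed reader is created and drained inside it.
fn collect_merged(plan: Arc<dyn ExecutionPlan>) -> Vec<RecordBatch> {
    let handles: Vec<_> = (0..plan.partitions())
        .map(|p| {
            let plan = Arc::clone(&plan);
            thread::spawn(move || {
                let mut reader = plan.execute(p);
                let mut out = Vec::new();
                while let Some(batch) = reader.next_batch() {
                    out.push(batch);
                }
                out
            })
        })
        .collect();

    // Join in partition order so the merged output is deterministic.
    handles
        .into_iter()
        .flat_map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

Note that the boxed reader does not need to be Send here, because it never leaves the thread that created it.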

To run threads on an iterator, I think that we need scoped threads (à la crossbeam) or some mechanism to create the threads inside the iteration (which IMO needs a scheduler).

This SO question is quite good in this respect: [https://stackoverflow.com/a/45327907/931303]
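
For illustration, a rough sketch of what scoped threads over an iterator can look like. It uses {{std::thread::scope}} from the standard library (available in Rust 1.63 and later) as a dependency-free substitute for crossbeam's scoped threads; the helper name {{sum_batches_scoped}} and the use of plain Vec<i64> batches are assumptions made for the example, not DataFusion code.

```rust
use std::thread;

// Hypothetical helper: sums each borrowed batch on its own scoped thread.
// The scope guarantees every thread is joined before `batches` goes out of
// scope, so the threads may borrow the data instead of needing Arc or 'static.
fn sum_batches_scoped(batches: &[Vec<i64>]) -> Vec<i64> {
    thread::scope(|s| {
        // Collect the handles first: iterators are lazy, so this is what
        // actually spawns all the threads before any of them is joined.
        let handles: Vec<_> = batches
            .iter()
            .map(|batch| s.spawn(move || batch.iter().sum::<i64>()))
            .collect();

        // Join in input order so results line up with the input batches.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

This still spawns one thread per item, which is why creating threads lazily inside the iteration would probably call for a scheduler or thread pool instead.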

> [Rust] [DataFusion] Remove the use of Mutex in ExecutionPlan trait
> ------------------------------------------------------------------
>
>                 Key: ARROW-9753
>                 URL: https://issues.apache.org/jira/browse/ARROW-9753
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Rust, Rust - DataFusion
>            Reporter: Andy Grove
>            Assignee: Andy Grove
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.0.0
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The ExecutionPlan trait should not return Arc<Mutex<RecordBatchIterator>> but just Arc<RecordBatchIterator> since most operators do not need to be mutable. Those that do can use interior mutability.


