Posted to jira@arrow.apache.org by "Jorge Leitão (Jira)" <ji...@apache.org> on 2020/11/21 22:37:00 UTC

[jira] [Commented] (ARROW-10683) [Rust] Remove Array.data method in favor of .data_ref to make performance impact of clone more obvious

    [ https://issues.apache.org/jira/browse/ARROW-10683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17236805#comment-17236805 ] 

Jorge Leitão commented on ARROW-10683:
--------------------------------------

Instead of removing `data()`, what do you think of changing its signature to return `&ArrayDataRef` and removing `data_ref`?
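
For illustration, a minimal sketch of the difference between the two signatures. It uses simplified stand-in types: `SketchArray`, the trimmed-down `ArrayData`, and the `data_cloned` name are hypothetical and are not the actual arrow crate definitions.

```rust
use std::sync::Arc;

// Simplified stand-in for the arrow type (assumption: the real struct has more fields).
#[derive(Debug)]
pub struct ArrayData {
    pub len: usize,
}

// Shared, reference-counted handle to the array data.
pub type ArrayDataRef = Arc<ArrayData>;

// Hypothetical array type that only holds its data handle.
pub struct SketchArray {
    data: ArrayDataRef,
}

impl SketchArray {
    pub fn new(len: usize) -> Self {
        Self {
            data: Arc::new(ArrayData { len }),
        }
    }

    // Models the behaviour described in the issue: every call clones the Arc,
    // i.e. an atomic reference-count increment now and a decrement on drop.
    pub fn data_cloned(&self) -> ArrayDataRef {
        self.data.clone()
    }

    // Models the suggested signature: hand out a borrow, no reference-count traffic.
    pub fn data(&self) -> &ArrayDataRef {
        &self.data
    }
}

// Sums the lengths through the borrowing accessor; no Arc is cloned in the loop.
pub fn total_len(arrays: &[SketchArray]) -> usize {
    arrays.iter().map(|a| a.data().len).sum()
}
```

With the borrowing signature, a caller who really needs an owned handle writes `array.data().clone()`, so the clone is visible at the call site rather than hidden inside the accessor.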

> [Rust] Remove Array.data method in favor of .data_ref to make performance impact of clone more obvious
> ------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-10683
>                 URL: https://issues.apache.org/jira/browse/ARROW-10683
>             Project: Apache Arrow
>          Issue Type: Improvement
>            Reporter: Jörn Horstmann
>            Priority: Major
>
> The `Array.data()` method is a real performance foot-gun since it involves cloning an `Arc`, which is not obvious to users. When used in inner loops, that can cause a big performance impact. The cloning itself might not be a problem, but I think it sometimes prevents other compiler optimizations.
> It would be better to remove this method and let users call `data_ref()` and only clone when really needed.
> Most of the current usages seem to be in test assertions, which should be easy to refactor.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)