Posted to jira@arrow.apache.org by "Jorge Leitão (Jira)" <ji...@apache.org> on 2020/11/21 22:37:00 UTC
[jira] [Commented] (ARROW-10683) [Rust] Remove Array.data method in
favor of .data_ref to make performance impact of clone more obvious
[ https://issues.apache.org/jira/browse/ARROW-10683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17236805#comment-17236805 ]
Jorge Leitão commented on ARROW-10683:
--------------------------------------
What do you think of keeping `data()` but changing its return type to `&ArrayDataRef`, and removing `data_ref` instead?
> [Rust] Remove Array.data method in favor of .data_ref to make performance impact of clone more obvious
> ------------------------------------------------------------------------------------------------------
>
> Key: ARROW-10683
> URL: https://issues.apache.org/jira/browse/ARROW-10683
> Project: Apache Arrow
> Issue Type: Improvement
> Reporter: Jörn Horstmann
> Priority: Major
>
> The `Array.data()` method is a real performance foot-gun because it clones an `Arc`, which is not obvious to users. When called in inner loops, this can have a big performance impact. The cloning itself might not be a problem, but I think it sometimes prevents other compiler optimizations.
> It would be better to remove this method and let users call `data_ref()`, cloning only when really needed.
> Most of the current usages seem to be in test assertions which should be easy to refactor.
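
To make the difference between the two accessors concrete, here is a minimal sketch. It does not use the arrow crate: `ArrayData`, `ArrayDataRef`, and `ExampleArray` are hypothetical stand-ins that only assume `ArrayDataRef` is an `Arc` around the array data, as the description implies. It shows that a `data()`-style accessor bumps the reference count on every call, while a `data_ref()`-style accessor only borrows.

```rust
use std::sync::Arc;

// Hypothetical stand-ins for illustration; not the arrow crate's definitions.
#[derive(Debug)]
pub struct ArrayData {
    pub len: usize,
}

// Assumption: the shared handle is an Arc around the array data.
pub type ArrayDataRef = Arc<ArrayData>;

pub struct ExampleArray {
    data: ArrayDataRef,
}

impl ExampleArray {
    pub fn new(len: usize) -> Self {
        Self { data: Arc::new(ArrayData { len }) }
    }

    /// Returns an owned handle: every call clones the Arc,
    /// which is an atomic reference-count increment (and a decrement on drop).
    pub fn data(&self) -> ArrayDataRef {
        self.data.clone()
    }

    /// Borrows the existing handle: no reference-count traffic.
    pub fn data_ref(&self) -> &ArrayDataRef {
        &self.data
    }
}

/// Sums lengths through the cloning accessor: one clone and one drop per array.
pub fn total_len_cloning(arrays: &[ExampleArray]) -> usize {
    arrays.iter().map(|a| a.data().len).sum()
}

/// Sums lengths through the borrowing accessor: no clones at all.
pub fn total_len_borrowing(arrays: &[ExampleArray]) -> usize {
    arrays.iter().map(|a| a.data_ref().len).sum()
}

fn main() {
    let array = ExampleArray::new(3);

    // Borrowing leaves the reference count untouched.
    println!("after data_ref(): {}", Arc::strong_count(array.data_ref()));

    // The owned handle keeps a second reference alive until it is dropped.
    let owned = array.data();
    println!("after data():     {}", Arc::strong_count(&owned));
    drop(owned);

    // A caller that really needs ownership can still clone explicitly.
    let explicit: ArrayDataRef = array.data_ref().clone();
    println!("explicit clone:   {}", Arc::strong_count(&explicit));
}
```

With only the borrowing accessor available, the clone becomes visible at the call site (`array.data_ref().clone()`), which is the effect both the issue and the comment above are aiming for, whichever method name ends up being kept.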
--
This message was sent by Atlassian Jira
(v8.3.4#803005)