Posted to dev@crunch.apache.org by "Micah Whitacre (JIRA)" <ji...@apache.org> on 2017/03/17 01:57:41 UTC

[jira] [Commented] (CRUNCH-639) Writable Bytes does an unnecessary copy

    [ https://issues.apache.org/jira/browse/CRUNCH-639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929315#comment-15929315 ] 

Micah Whitacre commented on CRUNCH-639:
---------------------------------------

So the proposal is not functionally correct.  Specifically, ByteBuffer.array() returns the backing storage byte[], and the relevant bytes may occupy only a portion of that array.  Here's a good write-up[1] that explains the difference, why copying is necessary, and specifically why using input.array() is incorrect.

[1] - https://worldmodscode.wordpress.com/2012/12/14/the-java-bytebuffer-a-crash-course/
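
To make the point concrete, here is a minimal standalone sketch using only java.nio (it is not Crunch code; the class name ByteBufferArrayDemo and the helper readableBytes are made up for illustration). It builds a buffer that views only part of a larger array and shows that array() still exposes the whole backing array.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical demo class, not part of Crunch.
final class ByteBufferArrayDemo {

    // Hypothetical helper: copies only the bytes between position and limit.
    // duplicate() is used so the caller's position is left untouched.
    static byte[] readableBytes(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.duplicate().get(out);
        return out;
    }

    public static void main(String[] args) {
        byte[] backing = {10, 20, 30, 40, 50, 60};

        // A buffer whose readable content is only the two bytes 30 and 40.
        ByteBuffer view = ByteBuffer.wrap(backing, 2, 2);

        // array() hands back the entire backing array, not just the view.
        System.out.println(view.array().length);                  // 6
        System.out.println(Arrays.toString(readableBytes(view))); // [30, 40]

        // A slice shares the same backing array but starts at an offset into it.
        ByteBuffer slice = view.slice();
        System.out.println(slice.arrayOffset());                  // 2
        System.out.println(slice.array() == backing);             // true
    }
}
```

In this sketch, wrapping view.array() directly would carry all six bytes instead of the two the buffer actually represents, which is why a copy bounded by offset and length is needed.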

> Writable Bytes does an unnecessary copy
> ---------------------------------------
>
>                 Key: CRUNCH-639
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-639
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stephen Patel
>            Assignee: Josh Wills
>            Priority: Minor
>
> In the Writable.bytes() Output MapFn, an unnecessary (I believe) copy of the incoming ByteBuffer occurs[0].
> Current:
> {code}
> BytesWritable bw = new BytesWritable();
> bw.set(input.array(), input.arrayOffset(), input.limit()); <- copies the array
> {code}
> Proposed:
> {code}
> BytesWritable bw = new BytesWritable(input.array()); 
> {code}
> [0]: https://github.com/apache/crunch/blob/apache-crunch-0.15.0/crunch-core/src/main/java/org/apache/crunch/types/writable/Writables.java#L271



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)