Posted to dev@giraph.apache.org by Maja Kabiljo <ma...@fb.com> on 2013/08/30 05:19:48 UTC

Review Request 13909: GIRAPH-752: Better support for supernodes

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13909/
-----------------------------------------------------------

Review request for giraph.


Bugs: GIRAPH-752
    https://issues.apache.org/jira/browse/GIRAPH-752


Repository: giraph-git


Description
-------

We have seen jobs crash when a vertex receives a lot of messages and no combiner is used. The cause is that the total size of the serialized messages for that vertex exceeds the maximum size of a byte array.
We should implement an OutputStream that can handle data of arbitrary size, and add an option to use that kind of stream for messages.
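
As a rough illustration of the idea described above, here is a minimal sketch of an output stream that spreads its data over several fixed-size byte arrays, so that no single array has to hold everything. The class name ChunkedByteOutput and all of its methods are hypothetical and are not the BigDataOutput class from this patch; the chunk size is a constructor parameter chosen only for the example.

```java
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a chunked stream; not the patch's BigDataOutput. */
class ChunkedByteOutput extends OutputStream {
  private final int chunkSize;
  private final List<byte[]> chunks = new ArrayList<>();
  private byte[] current;
  private int posInCurrent;
  /** Total bytes written; a long, since it may exceed the int range. */
  private long totalBytes;

  ChunkedByteOutput(int chunkSize) {
    if (chunkSize <= 0) {
      throw new IllegalArgumentException("chunkSize must be positive");
    }
    this.chunkSize = chunkSize;
  }

  @Override
  public void write(int b) {
    // Start a new chunk when there is none yet or the current one is full.
    if (current == null || posInCurrent == chunkSize) {
      current = new byte[chunkSize];
      chunks.add(current);
      posInCurrent = 0;
    }
    current[posInCurrent++] = (byte) b;
    totalBytes++;
  }

  long size() {
    return totalBytes;
  }

  int getNumberOfChunks() {
    return chunks.size();
  }

  /** Reads back one byte by its absolute position across all chunks. */
  byte byteAt(long index) {
    if (index < 0 || index >= totalBytes) {
      throw new IndexOutOfBoundsException("index " + index);
    }
    return chunks.get((int) (index / chunkSize))[(int) (index % chunkSize)];
  }

  public static void main(String[] args) {
    ChunkedByteOutput out = new ChunkedByteOutput(4);
    for (int i = 0; i < 10; i++) {
      out.write(i);
    }
    // prints "10 bytes in 3 chunks"
    System.out.println(out.size() + " bytes in " + out.getNumberOfChunks() + " chunks");
  }
}
```

The check on every write for whether the current chunk is full is the kind of per-call bookkeeping that a single flat array does not need.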


Diffs
-----

  giraph-core/src/main/java/org/apache/giraph/comm/messages/ByteArrayMessagesPerVertexStore.java 6518da6 
  giraph-core/src/main/java/org/apache/giraph/comm/messages/MessagesIterable.java a466a8d 
  giraph-core/src/main/java/org/apache/giraph/comm/messages/out_of_core/PartitionDiskBackedMessageStore.java 7b3e548 
  giraph-core/src/main/java/org/apache/giraph/comm/messages/out_of_core/SequentialFileMessageStore.java 64031c3 
  giraph-core/src/main/java/org/apache/giraph/comm/messages/primitives/IntByteArrayMessageStore.java 597e7af 
  giraph-core/src/main/java/org/apache/giraph/comm/messages/primitives/LongByteArrayMessageStore.java 3fe6356 
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 604729a 
  giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 2506c21 
  giraph-core/src/main/java/org/apache/giraph/utils/ByteArrayIterable.java cf2c187 
  giraph-core/src/main/java/org/apache/giraph/utils/ByteArrayIterator.java 76ed789 
  giraph-core/src/main/java/org/apache/giraph/utils/ByteArrayVertexIdMessages.java 56cc01c 
  giraph-core/src/main/java/org/apache/giraph/utils/Factory.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/utils/RepresentativeByteArrayIterable.java e3992ed 
  giraph-core/src/main/java/org/apache/giraph/utils/RepresentativeByteArrayIterator.java b6151c5 
  giraph-core/src/main/java/org/apache/giraph/utils/io/BigDataInput.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/utils/io/BigDataInputOutput.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/utils/io/BigDataOutput.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/utils/io/DataInputOutput.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/utils/io/package-info.java PRE-CREATION 

Diff: https://reviews.apache.org/r/13909/diff/


Testing
-------

Ran a job which fails with the original code and when the new option is not used, and verified it works properly when the option is used.
Also compared the performance with and without the change: with the option off it's the same, and with the option turned on it seems to add about 5% overhead.
mvn clean verify


Thanks,

Maja Kabiljo


Re: Review Request 13909: GIRAPH-752: Better support for supernodes

Posted by Maja Kabiljo <ma...@fb.com>.

> On Aug. 30, 2013, 11:33 p.m., Avery Ching wrote:
> > +1, awesome work that will help our super nodes.

Thanks for a quick review, Avery!


> On Aug. 30, 2013, 11:33 p.m., Avery Ching wrote:
> > giraph-core/src/main/java/org/apache/giraph/utils/io/BigDataOutput.java, line 45
> > <https://reviews.apache.org/r/13909/diff/1/?file=346572#file346572line45>
> >
> >     Should this be bigger than 32 MB?  If we are hitting the 2 GB barrier, then we will have 64 buffers just to get to 2 GB.  Maybe 64 MB?  Would this help reduce the overhead?

I don't believe that having that few buffers compared to their size can add any visible overhead. I think the overhead comes from having to do the checks all the time. With one application which uses a lot of memory I tried 256MB chunks and it was crashing, while 32MB ran fine.
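
To make the buffer counts in this exchange concrete, a small sketch of the arithmetic follows. The class and method names are hypothetical and not part of the patch; the sketch only computes how many fixed-size buffers are needed to hold 2 GB (2^31 bytes) for the three buffer sizes mentioned here.

```java
/** Hypothetical helper for the buffer-count arithmetic; not from the patch. */
class BufferCountSketch {
  /** Ceiling division: number of buffers needed to hold totalBytes. */
  static long buffersNeeded(long totalBytes, long bufferBytes) {
    return (totalBytes + bufferBytes - 1) / bufferBytes;
  }

  public static void main(String[] args) {
    long twoGb = 1L << 31;
    long mb = 1L << 20;
    for (long sizeMb : new long[] {32, 64, 256}) {
      // prints 64, 32 and 8 buffers respectively
      System.out.println(sizeMb + " MB buffers: " + buffersNeeded(twoGb, sizeMb * mb));
    }
  }
}
```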


- Maja


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13909/#review25811
-----------------------------------------------------------




Re: Review Request 13909: GIRAPH-752: Better support for supernodes

Posted by Avery Ching <av...@gmail.com>.

> On Aug. 30, 2013, 11:33 p.m., Avery Ching wrote:
> > giraph-core/src/main/java/org/apache/giraph/utils/io/BigDataOutput.java, line 45
> > <https://reviews.apache.org/r/13909/diff/1/?file=346572#file346572line45>
> >
> >     Should this be bigger than 32 MB?  If we are hitting the 2 GB barrier, then we will have 64 buffers just to get to 2 GB.  Maybe 64 MB?  Would this help reduce the overhead?
> 
> Maja Kabiljo wrote:
>     I don't believe that having that few buffers compared to their size can add any visible overhead. I think the overhead comes from having to do the checks all the time. With one application which uses a lot of memory I tried 256MB chunks and it was crashing, while 32MB ran fine.

Sounds good.


- Avery


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13909/#review25811
-----------------------------------------------------------




Re: Review Request 13909: GIRAPH-752: Better support for supernodes

Posted by Avery Ching <av...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13909/#review25811
-----------------------------------------------------------

Ship it!


+1, awesome work that will help our super nodes.


giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java
<https://reviews.apache.org/r/13909/#comment50345>

    "goes over allowed size of an array" ->
    "goes beyond the maximum size of a byte array"
    setting this option will remove that limit.  The maximum memory available for a single vertex will be limited to the maximum heap size available.



giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java
<https://reviews.apache.org/r/13909/#comment50343>

    "of an array" -> "of a byte array"



giraph-core/src/main/java/org/apache/giraph/utils/io/BigDataInput.java
<https://reviews.apache.org/r/13909/#comment50346>

    If you provide a method to get the number of buffers, you can allocate the array to the exact size.



giraph-core/src/main/java/org/apache/giraph/utils/io/BigDataInput.java
<https://reviews.apache.org/r/13909/#comment50347>

    This is an assumption right?  If skipBytes returns something other than bytesLeftToSkip it would be wrong.
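
For reference, java.io.DataInput.skipBytes(int) returns the number of bytes actually skipped, which may be fewer than requested. A minimal sketch of a loop that does not rely on a single call skipping everything is shown below; the class and method names are hypothetical and this is not the code under review.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;

/** Hypothetical sketch of skipping without trusting one skipBytes call. */
class SkipFullySketch {
  /** Skips up to n bytes and returns how many were actually skipped. */
  static int skipFully(DataInput in, int n) throws IOException {
    int skipped = 0;
    while (skipped < n) {
      int step = in.skipBytes(n - skipped);
      if (step <= 0) {
        break; // no progress, so the input is exhausted
      }
      skipped += step;
    }
    return skipped;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = new byte[10];
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
    // Only 10 bytes exist, so asking for 100 reports 10 skipped.
    System.out.println(skipFully(in, 100));
  }
}
```

A caller can compare the returned count with the requested count and treat a mismatch as an error instead of assuming the skip succeeded.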



giraph-core/src/main/java/org/apache/giraph/utils/io/BigDataOutput.java
<https://reviews.apache.org/r/13909/#comment50348>

    2^31 bytes right?



giraph-core/src/main/java/org/apache/giraph/utils/io/BigDataOutput.java
<https://reviews.apache.org/r/13909/#comment50344>

    Should this be bigger than 32 MB?  If we are hitting the 2 GB barrier, then we will have 64 buffers just to get to 2 GB.  Maybe 64 MB?  Would this help reduce the overhead?


- Avery Ching




Re: Review Request 13909: GIRAPH-752: Better support for supernodes

Posted by Avery Ching <av...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13909/#review25854
-----------------------------------------------------------



giraph-core/src/main/java/org/apache/giraph/utils/io/BigDataInput.java
<https://reviews.apache.org/r/13909/#comment50418>

    This should be 2GB right?


- Avery Ching




Re: Review Request 13909: GIRAPH-752: Better support for supernodes

Posted by Maja Kabiljo <ma...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13909/
-----------------------------------------------------------

(Updated Sept. 2, 2013, 6:03 p.m.)


Review request for giraph.


Changes
-------

Avery's comments


Bugs: GIRAPH-752
    https://issues.apache.org/jira/browse/GIRAPH-752


Repository: giraph-git


Description
-------

We have seen jobs crash when a vertex receives a lot of messages and no combiner is used. The cause is that the total size of the serialized messages for that vertex exceeds the maximum size of a byte array.
We should implement an OutputStream that can handle data of arbitrary size, and add an option to use that kind of stream for messages.


Diffs (updated)
-----

  giraph-core/src/main/java/org/apache/giraph/comm/messages/ByteArrayMessagesPerVertexStore.java 6518da6 
  giraph-core/src/main/java/org/apache/giraph/comm/messages/MessagesIterable.java a466a8d 
  giraph-core/src/main/java/org/apache/giraph/comm/messages/out_of_core/PartitionDiskBackedMessageStore.java 7b3e548 
  giraph-core/src/main/java/org/apache/giraph/comm/messages/out_of_core/SequentialFileMessageStore.java 64031c3 
  giraph-core/src/main/java/org/apache/giraph/comm/messages/primitives/IntByteArrayMessageStore.java 597e7af 
  giraph-core/src/main/java/org/apache/giraph/comm/messages/primitives/LongByteArrayMessageStore.java 3fe6356 
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 604729a 
  giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 2506c21 
  giraph-core/src/main/java/org/apache/giraph/utils/ByteArrayIterable.java cf2c187 
  giraph-core/src/main/java/org/apache/giraph/utils/ByteArrayIterator.java 76ed789 
  giraph-core/src/main/java/org/apache/giraph/utils/ByteArrayVertexIdMessages.java 56cc01c 
  giraph-core/src/main/java/org/apache/giraph/utils/Factory.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/utils/RepresentativeByteArrayIterable.java e3992ed 
  giraph-core/src/main/java/org/apache/giraph/utils/RepresentativeByteArrayIterator.java b6151c5 
  giraph-core/src/main/java/org/apache/giraph/utils/io/BigDataInput.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/utils/io/BigDataInputOutput.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/utils/io/BigDataOutput.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/utils/io/DataInputOutput.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/utils/io/package-info.java PRE-CREATION 

Diff: https://reviews.apache.org/r/13909/diff/


Testing
-------

Ran a job which fails with the original code and when the new option is not used, and verified it works properly when the option is used.
Also compared the performance with and without the change: with the option off it's the same, and with the option turned on it seems to add about 5% overhead.
mvn clean verify


Thanks,

Maja Kabiljo