You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@giraph.apache.org by "Avery Ching (JIRA)" <ji...@apache.org> on 2012/11/20 11:20:58 UTC

[jira] [Commented] (GIRAPH-435) Serialize server messages for memory and less GC

    [ https://issues.apache.org/jira/browse/GIRAPH-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13500965#comment-13500965 ] 

Avery Ching commented on GIRAPH-435:
------------------------------------

https://reviews.apache.org/r/8140/

This patch serializes messages on the server into byte[] and also reduces the amount of objects being created by copying a ByteArrayVertexIdMessages object to a MessageStore and using a representative message when doing computation of the vertices.

- Added new helper classes  ByteArrayIterable, ByteArrayIterator, RepresentativeByteArrayIterable, and RepresentativeByteArrayIterator to help create objects only when necessary
- Removed all collection methods from SimpleMessageStore and its children
- ByteArrayMessagesPerVertexStore stores messages on the server in serialized byte[] through ExtendedDataOutput (replaces CollectionOfMessagesPerVertexStore, which used objects)
- Iterators from ByteArrayVertexIdMessages are specialized to reduce the number of objects created (only when necessary)
- Allow the choice of message serialization with encoded message size (better for complex message objects) or just the message serialized (better for simple messages)

Results: 10 workers, 2.5 B edges, 10 m vertices, page rank

When not using the combiner, compute time on superstep 0 dropped from 12.05 -> 10.11, other supersteps improved accordingly as well.  Free memory after superstep 2 improved from 26727.61 M -> 53102.54 M.  GC time went down from 1,213.502 -> 518.597 (more than half).  Overall application time dropped from 353.085 -> 244.434 seconds.

When using the combiner, there are still some modest gains due to the reduction in objects created (35 -> 29 seconds for superstep 0).

Trunk (no combiner):

INFO    2012-11-16 17:57:35,036 [compute-23] org.apache.giraph.graph.ComputeCallable  - call: Computation took 12.054391 secs for 1 partitions on superstep 0.  Flushing started
INFO    2012-11-16 17:57:35,081 [main] org.apache.giraph.graph.BspServiceWorker  - finishSuperstep: Waiting on all requests, superstep 0 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 61302.61231994629M
INFO    2012-11-16 17:59:04,322 [compute-13] org.apache.giraph.graph.ComputeCallable  - call: Computation took 12.514889 secs for 1 partitions on superstep 1.  Flushing started
INFO    2012-11-16 17:59:04,369 [main] org.apache.giraph.graph.BspServiceWorker  - finishSuperstep: Waiting on all requests, superstep 1 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 34318.67655181885M
INFO    2012-11-16 18:00:29,559 [compute-10] org.apache.giraph.graph.ComputeCallable  - call: Computation took 11.869164 secs for 1 partitions on superstep 2.  Flushing started
INFO    2012-11-16 18:00:29,602 [main] org.apache.giraph.graph.BspServiceWorker  - finishSuperstep: Waiting on all requests, superstep 2 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 26727.610847473145M
INFO    2012-11-16 18:01:45,788 [compute-5] org.apache.giraph.graph.ComputeCallable  - call: Computation took 1.5657606 secs for 1 partitions on superstep 3.  Flushing started
INFO    2012-11-16 18:01:45,789 [main] org.apache.giraph.graph.BspServiceWorker  - finishSuperstep: Waiting on all requests, superstep 3 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 38338.96612548828M

Giraph Timers   Total (milliseconds)    353,085         0       353,085
Superstep 3 (milliseconds)              4,694           0       4,694
Setup (milliseconds)                    2,354           0       2,354
Vertex input superstep (milliseconds)   71,891          0       71,891
Shutdown (milliseconds)                 15,104          0       15,104
Superstep 0 (milliseconds)              85,516          0       85,516
Superstep 2 (milliseconds)              87,720          0       87,720
Superstep 1 (milliseconds)              85,803          0       85,803
Max heap size after GC in bytes         405,574,586,784         0       405,574,586,784
Total number of GC  1,823               0                       1,823
Total time of GC in milliseconds        1,213,502               0       1,213,502
Heap size after last GC in bytes        398,867,750,728         0       398,867,750,728

GIRAPH-435 (no combiner)

INFO    2012-11-19 01:31:02,712 [compute-0] org.apache.giraph.graph.ComputeCallable  - call: Computation took 10.109899 secs for 1 partitions on superstep 0.  Flushing started
INFO    2012-11-19 01:31:02,754 [main] org.apache.giraph.graph.BspServiceWorker  - finishSuperstep: Waiting on all requests, superstep 0 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 63432.56399536133M
INFO    2012-11-19 01:31:55,448 [compute-7] org.apache.giraph.graph.ComputeCallable  - call: Computation took 9.036797 secs for 1 partitions on superstep 1.  Flushing started
INFO    2012-11-19 01:31:55,491 [main] org.apache.giraph.graph.BspServiceWorker  - finishSuperstep: Waiting on all requests, superstep 1 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 59648.34732055664M
INFO    2012-11-19 01:32:52,595 [compute-9] org.apache.giraph.graph.ComputeCallable  - call: Computation took 8.786798 secs for 1 partitions on superstep 2.  Flushing started
INFO    2012-11-19 01:32:52,782 [main] org.apache.giraph.graph.BspServiceWorker  - finishSuperstep: Waiting on all requests, superstep 2 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 53102.54949951172M
INFO    2012-11-19 01:33:40,104 [compute-15] org.apache.giraph.graph.ComputeCallable  - call: Computation took 1.481184 secs for 1 partitions on superstep 3.  Flushing started
INFO    2012-11-19 01:33:40,105 [main] org.apache.giraph.graph.BspServiceWorker  - finishSuperstep: Waiting on all requests, superstep 3 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 65556.08256530762M
Total (milliseconds)            244,434                                          0                  244,434
Superstep 3 (milliseconds)      4,865                                            0                  4,865
oSetup (milliseconds)           2,210                                            0                  2,210
Vertex input superstep (milliseconds)                                            73,466             0   73,466
Shutdown (milliseconds)                                                          60                 0   60
Superstep 0 (milliseconds)                                                       51,097             0   51,097
Superstep 2 (milliseconds)                                                       55,029             0   55,029
Superstep 1 (milliseconds)                                                       57,704             0   57,704
Max heap size after GC in bytes                                                  241,391,753,216    0   241,391,753,216
Total number of GC  1,614                                                        0                  1,614
Total time of GC in milliseconds                                                 518,597            0   518,597
Heap size after last GC in bytes                                                 211,137,611,752    0   211,137,611,752

Trunk (with combiner):

Total (milliseconds)    190,247  0       190,247
Superstep 3 (milliseconds)       4,441   0      4,441
Setup (milliseconds)             2,761   0      2,761
Vertex input superstep (milliseconds)    73,147         0       73,147
Shutdown (milliseconds)                  359            0       359
Superstep 0 (milliseconds)               35,376         0       35,376
Superstep 2 (milliseconds)               35,896         0       35,896
Superstep 1 (milliseconds)               38,262         0       38,262
Max heap size after GC in bytes          182,641,373,552        0       182,641,373,552
Total number of GC  1,722                0                      1,722
Total time of GC in milliseconds         418,019                0       418,019
Heap size after last GC in bytes         167,224,437,896        0       167,224,437,896

GIRAPH-435 (with combiner):

Total (milliseconds)    175,022         0       175,022
Superstep 3 (milliseconds)              4,716   0       4,716
Setup (milliseconds)                    3,070   0       3,070
Vertex input superstep (milliseconds)   72,903  0       72,903
Shutdown (milliseconds)                 208     0       208
Superstep 0 (milliseconds)              29,271  0       29,271
Superstep 2 (milliseconds)              31,667  0       31,667
Superstep 1 (milliseconds)              33,183  0       33,183
Max heap size after GC in bytes         176,287,403,976         0       176,287,403,976
Total number of GC  1,493               0                       1,493
Total time of GC in milliseconds        354,427                 0       354,427
Heap size after last GC in bytes        154,542,562,320         0       154,542,562,320

Passed unittests, page rank benchmark on a real cluster, internal applications showed nice improvement as well.
                
> Serialize server messages for memory and less GC
> ------------------------------------------------
>
>                 Key: GIRAPH-435
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-435
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Avery Ching
>            Assignee: Avery Ching
>         Attachments: GIRAPH-435.patch
>
>
> Keeping all messages as live objects on the server has a lot of GC overhead and also takes up a lot of memory.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira