You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@giraph.apache.org by "Eli Reisman (JIRA)" <ji...@apache.org> on 2012/11/11 21:53:12 UTC
[jira] [Commented] (GIRAPH-417) Serialize the graph/message cache
into byte[] for improving memory usage and compute speed
[ https://issues.apache.org/jira/browse/GIRAPH-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494989#comment-13494989 ]
Eli Reisman commented on GIRAPH-417:
------------------------------------
I wanted to do this while at LI. I had a suspicion something like this could be the answer to a lot of blockers for in-memory use cases. I will definitely read this patch over on rb and have more to say. Nice work!
> Serialize the graph/message cache into byte[] for improving memory usage and compute speed
> ------------------------------------------------------------------------------------------
>
> Key: GIRAPH-417
> URL: https://issues.apache.org/jira/browse/GIRAPH-417
> Project: Giraph
> Issue Type: Improvement
> Reporter: Avery Ching
> Assignee: Avery Ching
> Attachments: GIRAPH-417.patch
>
>
> Our entire graph is currently stored as Java objects in memory. I added an option to keep only a representative vertex that serializes/deserializes on the fly and should be used with the new ByteArrayPartition. In conjunction with a serialized client-side message cache, memory usage then loading shrinks to almost 1/10 of trunk and loads the input splits almost 3x faster (see input superstep times below). I added a serializer based on Sun's unsafe methods that enables this memory savings with a very small performance hit (maybe a few 1-5% slower). Compared to trunk, when serializing the messages with our faster serializer, compute time improves significantly as well against trunk (16.7 -> 12.31 for 2.5B edges, 2.97 -> 1.61 for 250M edges). There are still further improvements to be made on the server side where we still store our messages in-memory. I (or someone else) can do that in a later patch. This also significantly reduces GC time, as there are less objects to GC.
> - Improves byte[] serialization signficantly
> -- Added ExtendedDataInput/ExtendedDataOutput interfaces to allow for some additional methods needed for byte[] serialization/deserialization
> -- Add ExtendedByteArrayDataInput/ExtendedByteArrayDataoutput to serialize/deserialize Writables to a byte[]
> -- Added DynamicChannelBufferOutputStream/DynamicChannelBufferInputStream to serialize/deserialize Writables to a DynamicChannelBuffer
> - Gives you the choice of partition implementation (SimplePartition (default) or ByteArrayPartition -> (serialized vertices))
> -- Added a new method to Partition called saveVertex(), which also the serialization back into the ByteArrayPartition or does nothing when using SimplePartition
> - Gives you the choice of unsafe serialization (using Sun's unsafe class - default) or regular serialization
> - Serializes the messages on the client cache into byte[] (saves memory and also serializes faster)
> -- Created new ByteArrayVertexIdMessageCollection to support the serialized messages
> -- SendVertexRequest now sends Partition objects rather than collections
> - Adds 2 more options in PageRankBenchmark to try out RepresentationVertex or RepresentationVertex with unsafe serialization
> - Fixed a bug in LongDoubleFloatDoubleVertex's readFields when edges aren't cleared before deserializing
> - Added new unittests
> -- Replaced TestEdgeListVertex with TestMutableVertex to test all our generic MutableVertex implementations
> --- Added more serialization tests of different serialization
> -- TestPartitionStores has more testing of unsafe serialization/deserialization
> Testing:
> All unittests pass
> Distributed unittests pass - (except two that also fail in trunk)
> Lots of PageRankBenchmark runs on a cluster
> Benchmark results:
> 25 edges / vertex, 10M vertices, 10 workers
> Trunk
> INFO 2012-11-08 14:43:55,855 [load-0] org.apache.giraph.graph.InputSplitsCallable - call: Loaded 1 input splits in 22.475897 secs, (v=1000000, e=25000000) 44492.105 vertices/sec, 1112302.6 edges/sec
> INFO 2012-11-08 14:44:00,411 [main] org.apache.giraph.graph.BspServiceWorker - finishSuperstep: Waiting on all requests, superstep -1 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 76580.54187774658M
> INFO 2012-11-08 14:44:05,254 [compute-7] org.apache.giraph.graph.ComputeCallable - call: Computation took 2.9732208 secs for 1 partitions on superstep 0. Flushing started
> INFO 2012-11-08 14:44:11,180 [main] org.apache.giraph.graph.BspServiceWorker - finishSuperstep: Superstep 0, messages = 25000000 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 74781.9575881958M
> Total (milliseconds) 62,413 0 62,413
> Superstep 3 (milliseconds) 2,417 0 2,417
> Setup (milliseconds) 2,731 0 2,731
> Shutdown (milliseconds) 50 0 50
> Superstep 0 (milliseconds) 10,654 0 10,654
> Input superstep (milliseconds) 27,484 0 27,484
> Superstep 2 (milliseconds) 9,475 0 9,475
> Superstep 1 (milliseconds) 9,599 0 9,599
> Total time of GC in milliseconds 225,052 0 225,052
> 25 edges / vertex, 10M vertices, 10 workers
> SimplePartition + EdgeListVertex (after rebase)
> INFO 2012-11-08 14:33:15,907 [load-0] org.apache.giraph.graph.InputSplitsCallable - call: Loaded 1 input splits in 25.431986 secs, (v=1000000, e=25000000) 39320.562 vertices/sec, 983014.06 edges/sec
> INFO 2012-11-08 14:33:17,501 [main] org.apache.giraph.graph.BspServiceWorker - finishSuperstep: Waiting on all requests, superstep -1 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 76290.28507995605M
> INFO 2012-11-08 14:33:20,175 [compute-2] org.apache.giraph.graph.ComputeCallable - call: Computation took 2.0086238 secs for 1 partitions on superstep 0. Flushing started
> INFO 2012-11-08 14:33:26,667 [main] org.apache.giraph.graph.BspServiceWorker - finishSuperstep: Superstep 0, messages = 25000000 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 73716.20901489258M
> Trunk (after rebase)
> Total (milliseconds) 68,113 0 68,113
> Superstep 3 (milliseconds) 2,057 0 2,057
> Setup (milliseconds) 9,765 0 9,765
> Shutdown (milliseconds) 59 0 59
> Superstep 0 (milliseconds) 9,180 0 9,180
> Input superstep (milliseconds) 27,525 0 27,525
> Superstep 2 (milliseconds) 9,600 0 9,600
> Superstep 1 (milliseconds) 9,924 0 9,924
> Total time of GC in milliseconds 216,345 0 216,345
> 250 edges / vertex, 10M vertices, 10 workers
> ByteArrayPartition + UnsafeRepresentativeVertex + reuse vertexdata buffer + unsafe serialization (after rebase)
> INFO 2012-11-08 14:33:09,822 [load-0] org.apache.giraph.graph.InputSplitsCallable - call: Loaded 1 input splits in 9.3217535 secs, (v=1000000, e=25000000) 107275.95 vertices/sec, 2681898.8 edges/sec
> INFO 2012-11-08 14:33:10,900 [main] org.apache.giraph.graph.BspServiceWorker - finishSuperstep: Waiting on all requests, superstep -1 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 79974.63636779785M
> INFO 2012-11-08 14:33:13,213 [compute-7] org.apache.giraph.graph.ComputeCallable - call: Computation took 1.6110481 secs for 1 partitions on superstep 0. Flushing started
> INFO 2012-11-08 14:33:13,972 [main] org.apache.giraph.graph.BspServiceWorker - finishSuperstep: Waiting on all requests, superstep 0 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 78228.54064941406M
> Total (milliseconds) 47,061 0 47,061
> Superstep 3 (milliseconds) 2,175 0 2,175
> Setup (milliseconds) 3,018 0 3,018
> Shutdown (milliseconds) 1,050 0 1,050
> Superstep 0 (milliseconds) 8,780 0 8,780
> Input superstep (milliseconds) 10,952 0 10,952
> Superstep 2 (milliseconds) 10,450 0 10,450
> Superstep 1 (milliseconds) 10,633 0 10,633
> 250 edges / vertex, 10M vertices, 10 workers
> Trunk
> INFO 2012-11-08 14:46:25,304 [load-0] org.apache.giraph.graph.InputSplitsCallable - call: Loaded 1 input splits in 167.02779 secs, (v=1000000, e=250000000) 5987.028 vertices/sec, 1496757.0 edges/sec
> INFO 2012-11-08 14:46:35,558 [main] org.apache.giraph.graph.BspServiceWorker - finishSuperstep: Waiting on all requests, superstep -1 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 38447.11888885498M
> INFO 2012-11-08 14:46:52,963 [compute-14] org.apache.giraph.graph.ComputeCallable - call: Computation took 16.770031 secs for 1 partitions on superstep 0. Flushing started
> INFO 2012-11-08 14:46:53,074 [main] org.apache.giraph.graph.BspServiceWorker - finishSuperstep: Waiting on all requests, superstep 0 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 24629.869369506836M
> Total (milliseconds) 568,094 0 568,094
> Superstep 3 (milliseconds) 2,344 0 2,344
> Setup (milliseconds) 2,748 0 2,748
> Shutdown (milliseconds) 47 0 47
> Superstep 0 (milliseconds) 67,853 0 67,853
> Input superstep (milliseconds) 177,722 0 177,722
> Superstep 2 (milliseconds) 247,518 0 247,518
> Superstep 1 (milliseconds) 69,856 0 69,856
> Total time of GC in milliseconds 2,741,892 0 2,741,892
> 250 edges / vertex, 10M vertices, 10 workers
> SimplePartition + EdgeListVertex (after rebase)
> INFO 2012-11-08 14:19:57,774 [load-0] org.apache.giraph.graph.InputSplitsCallable - call: Loaded 1 input splits in 172.17258 secs, (v=1000000, e=250000000) 5808.126 vertices/sec, 1452031.5 edges/sec
> INFO 2012-11-08 14:20:04,864 [main] org.apache.giraph.graph.BspServiceWorker - finishSuperstep: Waiting on all requests, superstep -1 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 37025.9013671875M
> INFO 2012-11-08 14:20:17,453 [compute-6] org.apache.giraph.graph.ComputeCallable - call: Computation took 11.959192 secs for 1 partitions on superstep 0. Flushing started
> INFO 2012-11-08 14:20:17,606 [main] org.apache.giraph.graph.BspServiceWorker - finishSuperstep: Waiting on all requests, superstep 0 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 21953.103630065918M
> Total (milliseconds) 470,845 0 470,845
> Superstep 3 (milliseconds) 2,595 0 2,595
> Setup (milliseconds) 1,774 0 1,774
> Shutdown (milliseconds) 54 0 54
> Superstep 0 (milliseconds) 59,609 0 59,609
> Input superstep (milliseconds) 179,665 0 179,665
> Superstep 2 (milliseconds) 165,848 0 165,848
> Superstep 1 (milliseconds) 61,296 0 61,296
> Total time of GC in milliseconds 2,480,260 0 2,480,260
> 250 edges / vertex, 10M vertices, 10 workers
> ByteArrayPartition + UnsafeRepresentativeVertex + reuse vertexdata buffer + unsafe serialization (after rebase)
> INFO 2012-11-08 13:26:50,334 [load-0] org.apache.giraph.graph.InputSplitsCallable - call: Loaded 1 input splits in 69.22095 secs, (v=1000000, e=250000000) 14446.494 vertices/sec, 3611623.5 edges/sec
> INFO 2012-11-08 13:26:52,511 [main] org.apache.giraph.graph.BspServiceWorker - finishSuperstep: Waiting on all requests, superstep -1 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 75393.74648284912M
> INFO 2012-11-08 13:27:06,441 [compute-5] org.apache.giraph.graph.ComputeCallable - call: Computation took 12.318953 secs for 1 partitions on superstep 0. Flushing started
> INFO 2012-11-08 13:27:06,483 [main] org.apache.giraph.graph.BspServiceWorker - finishSuperstep: Waiting on all requests, superstep 0 totalMem = 81728.6875M, maxMem = 81728.6875M, freeMem = 62303.2106552124M
> Total (milliseconds) 301,720 0 301,720
> Superstep 3 (milliseconds) 4,759 0 4,759
> Setup (milliseconds) 2,887 0 2,887
> Shutdown (milliseconds) 50 0 50
> Superstep 0 (milliseconds) 72,625 0 72,625
> Input superstep (milliseconds) 75,797 0 75,797
> Superstep 2 (milliseconds) 72,245 0 72,245
> Superstep 1 (milliseconds) 73,353 0 73,353
> Total time of GC in milliseconds 716,930 0 716,930
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira