Posted to dev@giraph.apache.org by "Claudio Martella (JIRA)" <ji...@apache.org> on 2013/04/10 19:24:16 UTC

[jira] [Commented] (GIRAPH-616) Decouple vertices and edges in DiskBackedPartitionStore and avoid writing back edges when the algorithm does not change topology.

    [ https://issues.apache.org/jira/browse/GIRAPH-616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628006#comment-13628006 ] 

Claudio Martella commented on GIRAPH-616:
-----------------------------------------

We get the expected improvement, but these benchmarks show a lot of variability, so it is difficult to assess exactly how much we gain. We certainly don't introduce a regression.

PageRankBenchmark, 1M vertices, 100 edges each

trunk in-memory:

13/04/10 19:11:36 INFO mapred.JobClient:     Total (milliseconds)=65181
13/04/10 19:11:36 INFO mapred.JobClient:     Superstep 3 (milliseconds)=4383
13/04/10 19:11:36 INFO mapred.JobClient:     Superstep 4 (milliseconds)=4007
13/04/10 19:11:36 INFO mapred.JobClient:     Superstep 10 (milliseconds)=476
13/04/10 19:11:36 INFO mapred.JobClient:     Setup (milliseconds)=9780
13/04/10 19:11:36 INFO mapred.JobClient:     Shutdown (milliseconds)=83
13/04/10 19:11:36 INFO mapred.JobClient:     Superstep 7 (milliseconds)=3828
13/04/10 19:11:36 INFO mapred.JobClient:     Superstep 9 (milliseconds)=3707
13/04/10 19:11:36 INFO mapred.JobClient:     Superstep 0 (milliseconds)=10283
13/04/10 19:11:36 INFO mapred.JobClient:     Superstep 8 (milliseconds)=3374
13/04/10 19:11:36 INFO mapred.JobClient:     Input superstep (milliseconds)=7695
13/04/10 19:11:36 INFO mapred.JobClient:     Superstep 6 (milliseconds)=4275
13/04/10 19:11:36 INFO mapred.JobClient:     Superstep 5 (milliseconds)=4119
13/04/10 19:11:36 INFO mapred.JobClient:     Superstep 2 (milliseconds)=3944
13/04/10 19:11:36 INFO mapred.JobClient:     Superstep 1 (milliseconds)=5224

GIRAPH-616 OOC graph (maxPartitions=2), isStaticGraph = false:

13/04/10 19:06:08 INFO mapred.JobClient:     Total (milliseconds)=131783
13/04/10 19:06:08 INFO mapred.JobClient:     Superstep 3 (milliseconds)=11603
13/04/10 19:06:08 INFO mapred.JobClient:     Superstep 4 (milliseconds)=8596
13/04/10 19:06:08 INFO mapred.JobClient:     Superstep 10 (milliseconds)=10196
13/04/10 19:06:08 INFO mapred.JobClient:     Setup (milliseconds)=7960
13/04/10 19:06:08 INFO mapred.JobClient:     Shutdown (milliseconds)=97
13/04/10 19:06:08 INFO mapred.JobClient:     Superstep 7 (milliseconds)=9283
13/04/10 19:06:08 INFO mapred.JobClient:     Superstep 9 (milliseconds)=6139
13/04/10 19:06:08 INFO mapred.JobClient:     Superstep 0 (milliseconds)=10843
13/04/10 19:06:08 INFO mapred.JobClient:     Superstep 8 (milliseconds)=7047
13/04/10 19:06:08 INFO mapred.JobClient:     Input superstep (milliseconds)=25272
13/04/10 19:06:08 INFO mapred.JobClient:     Superstep 6 (milliseconds)=6272
13/04/10 19:06:08 INFO mapred.JobClient:     Superstep 5 (milliseconds)=13822
13/04/10 19:06:08 INFO mapred.JobClient:     Superstep 2 (milliseconds)=5456
13/04/10 19:06:08 INFO mapred.JobClient:     Superstep 1 (milliseconds)=9193

GIRAPH-616 OOC graph (maxPartitions=2), isStaticGraph = true:

13/04/10 19:14:03 INFO mapred.JobClient:     Total (milliseconds)=82618
13/04/10 19:14:03 INFO mapred.JobClient:     Superstep 3 (milliseconds)=5542
13/04/10 19:14:03 INFO mapred.JobClient:     Superstep 4 (milliseconds)=6629
13/04/10 19:14:03 INFO mapred.JobClient:     Superstep 10 (milliseconds)=3500
13/04/10 19:14:03 INFO mapred.JobClient:     Setup (milliseconds)=8515
13/04/10 19:14:03 INFO mapred.JobClient:     Shutdown (milliseconds)=80
13/04/10 19:14:03 INFO mapred.JobClient:     Superstep 7 (milliseconds)=5353
13/04/10 19:14:03 INFO mapred.JobClient:     Superstep 9 (milliseconds)=5251
13/04/10 19:14:03 INFO mapred.JobClient:     Superstep 0 (milliseconds)=4734
13/04/10 19:14:03 INFO mapred.JobClient:     Superstep 8 (milliseconds)=5496
13/04/10 19:14:03 INFO mapred.JobClient:     Input superstep (milliseconds)=6628
13/04/10 19:14:03 INFO mapred.JobClient:     Superstep 6 (milliseconds)=9124
13/04/10 19:14:03 INFO mapred.JobClient:     Superstep 5 (milliseconds)=5817
13/04/10 19:14:03 INFO mapred.JobClient:     Superstep 2 (milliseconds)=5504
13/04/10 19:14:03 INFO mapred.JobClient:     Superstep 1 (milliseconds)=10438
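
To put the three "Total (milliseconds)" figures above side by side, here is a small sketch that computes their ratios. The class and method names are made up for illustration; only the three totals come from the logs. Each figure is from a single run, so given the variability mentioned above the ratios are indicative at best.

```java
import java.util.Locale;

public class BenchmarkTotals {
    // "Total (milliseconds)" values copied from the three runs above.
    static final long TRUNK_IN_MEMORY_MS = 65181;
    static final long OOC_NON_STATIC_MS = 131783;
    static final long OOC_STATIC_MS = 82618;

    /** Ratio of one run's total time to a baseline run's total time. */
    static double ratio(long runMs, long baselineMs) {
        return (double) runMs / baselineMs;
    }

    public static void main(String[] args) {
        // Locale.ROOT keeps the decimal separator a dot on every machine.
        System.out.println(String.format(Locale.ROOT,
            "OOC isStaticGraph=false vs trunk in-memory: %.2fx",
            ratio(OOC_NON_STATIC_MS, TRUNK_IN_MEMORY_MS)));
        System.out.println(String.format(Locale.ROOT,
            "OOC isStaticGraph=true  vs trunk in-memory: %.2fx",
            ratio(OOC_STATIC_MS, TRUNK_IN_MEMORY_MS)));
        System.out.println(String.format(Locale.ROOT,
            "OOC isStaticGraph=true  vs isStaticGraph=false: %.2fx",
            ratio(OOC_STATIC_MS, OOC_NON_STATIC_MS)));
    }
}
```

On these numbers, the out-of-core run takes roughly twice as long as trunk in-memory with isStaticGraph = false and roughly 1.27 times as long with isStaticGraph = true.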
                
> Decouple vertices and edges in DiskBackedPartitionStore and avoid writing back edges when the algorithm does not change topology.
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GIRAPH-616
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-616
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Claudio Martella
>            Assignee: Claudio Martella
>         Attachments: GIRAPH-616.diff
>
>
> Many algorithms work on a static graph. In these cases, when running with the out-of-core graph, we end up writing back edges that have not changed since we read them. By decoupling vertices and edges, we can write back only the freshly computed vertex values.
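
The idea in the description can be sketched roughly as follows. This is not the code in GIRAPH-616.diff and not the DiskBackedPartitionStore API; the class, the file naming, and the on-disk layout are all hypothetical, and only the isStaticGraph flag name is taken from the benchmark labels. It assumes each partition is spilled to two separate files, so that the edge file can be left alone once it has been written.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

/** Hypothetical sketch of decoupled vertex/edge spilling; not Giraph code. */
public class DecoupledSpillSketch {
    private final Path dir;
    private final boolean isStaticGraph;
    private int edgeWrites = 0; // counts edge-file writes, for illustration only

    public DecoupledSpillSketch(Path dir, boolean isStaticGraph) {
        this.dir = dir;
        this.isStaticGraph = isStaticGraph;
    }

    /** Spills one partition: vertex values always, edges only when needed. */
    public void offload(int partitionId, Map<Long, Double> values,
                        Map<Long, long[]> edges) throws IOException {
        Path vertexFile = dir.resolve("partition-" + partitionId + ".vertices");
        Path edgeFile = dir.resolve("partition-" + partitionId + ".edges");

        // Vertex values change every superstep, so they are always rewritten.
        try (DataOutputStream out =
                 new DataOutputStream(Files.newOutputStream(vertexFile))) {
            out.writeInt(values.size());
            for (Map.Entry<Long, Double> e : values.entrySet()) {
                out.writeLong(e.getKey());
                out.writeDouble(e.getValue());
            }
        }

        // With a static graph, the edges already on disk are still valid.
        if (isStaticGraph && Files.exists(edgeFile)) {
            return;
        }
        try (DataOutputStream out =
                 new DataOutputStream(Files.newOutputStream(edgeFile))) {
            out.writeInt(edges.size());
            for (Map.Entry<Long, long[]> e : edges.entrySet()) {
                out.writeLong(e.getKey());
                out.writeInt(e.getValue().length);
                for (long target : e.getValue()) {
                    out.writeLong(target);
                }
            }
        }
        edgeWrites++;
    }

    public int edgeWrites() {
        return edgeWrites;
    }
}
```

In this sketch, offloading the same partition twice with isStaticGraph = true writes the edge file once, while with isStaticGraph = false it is rewritten on every offload.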
