Posted to dev@giraph.apache.org by "Maja Kabiljo (JIRA)" <ji...@apache.org> on 2012/11/05 23:54:11 UTC

[jira] [Updated] (GIRAPH-404) More SendMessageCache improvements

     [ https://issues.apache.org/jira/browse/GIRAPH-404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maja Kabiljo updated GIRAPH-404:
--------------------------------

    Attachment: GIRAPH-404.patch

Here are the results on PageRankBenchmark:
10m vertices, 100 edges per vertex, 10 workers
1 thread: Total superstep time: 54s -> 35s
20m vertices, 100 edges per vertex, 12 workers
4 threads: Computation time: 26s -> 17s

I also tested it on one of our real applications; the speedup there was a bit smaller, about 20-25%.

This patch assumes that partition ids are consecutive numbers, or at least very close to that; otherwise it is not going to work well. I don't think Giraph requires that right now, but I don't see a reason why it couldn't. What do you think? If there is a reason not to require it, we can keep two implementations of SendMessageCache.
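
To illustrate why consecutive partition ids matter, here is a minimal sketch. It is not the code from GIRAPH-404.patch, and the class and method names (ArrayBackedMessageCache, addMessage, removePartitionMessages) are hypothetical. It assumes partition ids run from 0 to partitionCount - 1, so a per-partition buffer can be found by direct index instead of a map lookup:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical illustration only; this is NOT Giraph's SendMessageCache.
 * Assumes partition ids are the consecutive integers 0..partitionCount-1.
 */
class ArrayBackedMessageCache {
    /** Slot i holds the messages buffered for partition id i. */
    private final List<List<String>> messagesByPartition;

    ArrayBackedMessageCache(int partitionCount) {
        messagesByPartition = new ArrayList<>(partitionCount);
        for (int i = 0; i < partitionCount; i++) {
            messagesByPartition.add(new ArrayList<>());
        }
    }

    /** Buffers a message and returns how many are now buffered for that partition. */
    int addMessage(int partitionId, String message) {
        // Direct index lookup: no hashing and no boxing of the partition id.
        List<String> messages = messagesByPartition.get(partitionId);
        messages.add(message);
        return messages.size();
    }

    /** Returns the buffered messages for a partition and resets its slot. */
    List<String> removePartitionMessages(int partitionId) {
        List<String> messages = messagesByPartition.get(partitionId);
        messagesByPartition.set(partitionId, new ArrayList<>());
        return messages;
    }

    public static void main(String[] args) {
        ArrayBackedMessageCache cache = new ArrayBackedMessageCache(4);
        cache.addMessage(0, "m1");
        cache.addMessage(2, "m2");
        cache.addMessage(2, "m3");
        // Drain everything buffered for partition 2.
        System.out.println(cache.removePartitionMessages(2));
    }
}
```

In a layout like this, the storage has to cover the largest partition id, so sparse ids (say 5 and 1,000,000) would leave almost every slot empty. That is the sense in which an index-based cache only works well when ids are consecutive or nearly so.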
                
> More SendMessageCache improvements
> ----------------------------------
>
>                 Key: GIRAPH-404
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-404
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Maja Kabiljo
>            Assignee: Maja Kabiljo
>         Attachments: GIRAPH-404.patch
>
>
> Having a lot of maps in SendMessageCache still makes it slow, so here is another step towards making it faster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira