Posted to dev@giraph.apache.org by "Maja Kabiljo (JIRA)" <ji...@apache.org> on 2012/11/05 23:52:13 UTC

[jira] [Created] (GIRAPH-404) More SendMessageCache improvements

Maja Kabiljo created GIRAPH-404:
-----------------------------------

             Summary: More SendMessageCache improvements
                 Key: GIRAPH-404
                 URL: https://issues.apache.org/jira/browse/GIRAPH-404
             Project: Giraph
          Issue Type: Improvement
            Reporter: Maja Kabiljo
            Assignee: Maja Kabiljo


Having a lot of maps in SendMessageCache still makes it slow, so here is another step towards making it faster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (GIRAPH-404) More SendMessageCache improvements

Posted by "Maja Kabiljo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maja Kabiljo updated GIRAPH-404:
--------------------------------

    Attachment: GIRAPH-404.patch

Here are the results on PageRankBenchmark:
* 10m vertices, 100 edges per vertex, 10 workers, 1 thread: total superstep time 54s -> 35s
* 20m vertices, 100 edges per vertex, 12 workers, 4 threads: computation time 26s -> 17s

I also tested it on one of our real applications; the speedup was a bit smaller, about 20-25%.

Here I assume that partition ids are consecutive numbers, or at least very close to that; otherwise this is not going to work well. I don't think Giraph requires that right now, but I don't see a reason why it couldn't. What do you think? If there is a reason not to require it, we can keep two implementations of SendMessageCache.
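
To illustrate why consecutive partition ids matter here, the following is a minimal sketch of a cache that uses the partition id directly as an array index instead of a map key. It is not the code from GIRAPH-404.patch: the class name PartitionIndexedCache, its methods, and the use of String as the message type are all hypothetical, and it assumes ids are non-negative and no larger than a known maximum.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not the code from GIRAPH-404.patch. */
final class PartitionIndexedCache {
  /** One message buffer per partition; the partition id is the array index. */
  private final List<String>[] buffers;

  @SuppressWarnings("unchecked")
  PartitionIndexedCache(int maxPartitionId) {
    // Assumption: ids fall in [0, maxPartitionId], so one slot per possible id.
    buffers = new List[maxPartitionId + 1];
  }

  /** Appends a message and returns the buffer size for that partition. */
  int addMessage(int partitionId, String message) {
    List<String> buffer = buffers[partitionId];
    if (buffer == null) {
      // Slots are filled lazily; unused ids keep a null slot.
      buffer = new ArrayList<>();
      buffers[partitionId] = buffer;
    }
    buffer.add(message);
    return buffer.size();
  }

  /** Removes and returns the buffered messages for a partition. */
  List<String> removeMessages(int partitionId) {
    List<String> buffer = buffers[partitionId];
    buffers[partitionId] = null;
    return buffer == null ? new ArrayList<>() : buffer;
  }
}
```

In this sketch a lookup is a plain array access with no hashing or boxing of the id, which is the kind of saving the array-based approach is after; the cost is one slot for every id up to the maximum, whether or not it is used.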
                
> More SendMessageCache improvements
> ----------------------------------
>
>                 Key: GIRAPH-404
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-404
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Maja Kabiljo
>            Assignee: Maja Kabiljo
>         Attachments: GIRAPH-404.patch
>
>
> Having a lot of maps in SendMessageCache still makes it slow, so here is another step towards making it faster.


[jira] [Commented] (GIRAPH-404) More SendMessageCache improvements

Posted by "Maja Kabiljo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491014#comment-13491014 ] 

Maja Kabiljo commented on GIRAPH-404:
-------------------------------------

https://reviews.apache.org/r/7883/
                


[jira] [Resolved] (GIRAPH-404) More SendMessageCache improvements

Posted by "Maja Kabiljo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maja Kabiljo resolved GIRAPH-404.
---------------------------------

    Resolution: Fixed
    


[jira] [Commented] (GIRAPH-404) More SendMessageCache improvements

Posted by "Avery Ching (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491017#comment-13491017 ] 

Avery Ching commented on GIRAPH-404:
------------------------------------

nm, I'm too slow. =)
                


[jira] [Commented] (GIRAPH-404) More SendMessageCache improvements

Posted by "Maja Kabiljo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491010#comment-13491010 ] 

Maja Kabiljo commented on GIRAPH-404:
-------------------------------------

"Not going to work well" means we are always going to have an array of length maxPartitionId. It is still going to work correctly; we will just have an array with lots of empty space.
                
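
The trade-off described above can be made concrete with a small sketch that counts how many slots an id-indexed array needs and how many stay empty for a given set of partition ids. The class and method names are hypothetical, and the sketch assumes unique, zero-based, non-negative ids, so it sizes the array as the maximum id plus one.

```java
/** Hypothetical sketch of the space cost of indexing an array by partition id. */
final class SparseArrayCost {
  /** Slots needed so that every id in [0, max id] is a valid index. */
  static int slotsNeeded(int[] partitionIds) {
    int maxId = -1;
    for (int id : partitionIds) {
      maxId = Math.max(maxId, id);
    }
    return maxId + 1;
  }

  /** Slots that never hold a partition; assumes the ids are unique. */
  static int emptySlots(int[] partitionIds) {
    return slotsNeeded(partitionIds) - partitionIds.length;
  }
}
```

With ids 0, 1, 2, 3 the array has 4 slots and none are empty. With ids 0, 1000, 2000000 it has 2000001 slots, of which 1999998 are empty: still correct, but wasteful.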


[jira] [Commented] (GIRAPH-404) More SendMessageCache improvements

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491645#comment-13491645 ] 

Hudson commented on GIRAPH-404:
-------------------------------

Integrated in Giraph-trunk-Commit #275 (See [https://builds.apache.org/job/Giraph-trunk-Commit/275/])
    GIRAPH-404: More SendMessageCache improvements (Revision 1406239)

     Result = SUCCESS
maja : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1406239
Files : 
* /giraph/trunk/CHANGELOG
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/SendMessageCache.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/VertexIdMessageCollection.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/messages/DiskBackedMessageStoreByPartition.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/messages/SimpleMessageStore.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerClientRequestProcessor.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/requests/SendWorkerMessagesRequest.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceWorker.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/GraphMapper.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/utils/PairList.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/utils/PairListWritable.java
* /giraph/trunk/giraph/src/test/java/org/apache/giraph/comm/RequestFailureTest.java
* /giraph/trunk/giraph/src/test/java/org/apache/giraph/comm/RequestTest.java

                


[jira] [Commented] (GIRAPH-404) More SendMessageCache improvements

Posted by "Avery Ching (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491016#comment-13491016 ] 

Avery Ching commented on GIRAPH-404:
------------------------------------

Partition ids are simply a unique way to identify a partition. I can't think of any reason why one would have non-consecutive partition ids. Given the space savings, this is a big win. One thing we can do is check that the partition ids are consecutive. This is a great performance improvement. Can you please also post a Review Board diff, Maja?
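
A check of the kind suggested here could look roughly like the sketch below. It is only an illustration: the class and method names are hypothetical, it is not taken from the Giraph code base, and it treats "consecutive" as meaning the sorted ids form an unbroken run with no gaps or duplicates, starting at any value.

```java
import java.util.Arrays;

/** Hypothetical sketch of a consecutive-partition-id check. */
final class PartitionIdCheck {
  /**
   * Returns true if the ids, once sorted, form a run
   * start, start + 1, start + 2, ... with no gaps or duplicates.
   */
  static boolean areConsecutive(int[] partitionIds) {
    if (partitionIds.length == 0) {
      return true;
    }
    // Sort a copy so the caller's array is left untouched.
    int[] sorted = partitionIds.clone();
    Arrays.sort(sorted);
    for (int i = 1; i < sorted.length; i++) {
      if (sorted[i] != sorted[i - 1] + 1) {
        return false;
      }
    }
    return true;
  }
}
```

A caller could run such a check once, when the partition ids become known, and fail fast or fall back to a map-based implementation if it returns false.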
                
