Posted to dev@giraph.apache.org by "Avery Ching (JIRA)" <ji...@apache.org> on 2011/09/07 03:03:09 UTC

[jira] [Commented] (GIRAPH-12) Investigate communication improvements

    [ https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13098512#comment-13098512 ] 

Avery Ching commented on GIRAPH-12:
-----------------------------------

Jake from Twitter also recommended thinking about using Finagle.  His description:

"A fault tolerant, protocol-agnostic RPC system" based on Netty [which I see is already under consideration], written in Scala (but with very mature Java bindings too).  We use it internally at Twitter for clusters of mid-tier servers in which many dozens of machines talk to hundreds of other machines, without blowing up on thread stacks or using a gazillion threads.  It's mavenized, so it's easy to try out.

> Investigate communication improvements
> --------------------------------------
>
>                 Key: GIRAPH-12
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-12
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Avery Ching
>            Assignee: Hyunsik Choi
>            Priority: Minor
>
> Currently every worker starts a thread to communicate with every other worker.  Hadoop RPC is used for communication.  For instance, if there are 400 workers, each worker will create 400 threads.  This ends up using a lot of memory, even with the option
> -Dmapred.child.java.opts="-Xss64k".
> It would be good to investigate using a framework like Netty, or rolling our own, to improve this situation.  Moving away from Hadoop RPC would also make compatibility across different Hadoop versions easier.
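
To put rough numbers on the thread-per-peer cost described in the issue, here is a back-of-the-envelope sketch.  The class and method names are made up for illustration and are not part of Giraph.  It assumes each thread reserves exactly the -Xss value for its stack (a JVM may round this up or enforce a larger minimum) and it ignores RPC buffers and other heap usage, so treat the result as a lower-bound estimate rather than a measurement.

```java
// Hypothetical helper, not Giraph code: estimates the stack space reserved
// when every worker dedicates one thread to each peer.
public class ThreadFootprint {

    // Total communication threads across the job: each worker starts one
    // thread per worker (400 workers -> 400 threads each, as in the issue).
    static long totalThreads(int workers) {
        return (long) workers * workers;
    }

    // Stack bytes reserved on a single worker, assuming every thread
    // reserves exactly stackBytes (the -Xss setting).
    static long stackBytesPerWorker(int workers, long stackBytes) {
        return workers * stackBytes;
    }

    // Stack bytes reserved across the whole job under the same assumption.
    static long stackBytesInCluster(int workers, long stackBytes) {
        return totalThreads(workers) * stackBytes;
    }

    public static void main(String[] args) {
        int workers = 400;            // worker count from the issue text
        long stackBytes = 64L * 1024; // -Xss64k
        long mib = 1024L * 1024;

        System.out.println("threads in cluster: " + totalThreads(workers));
        System.out.println("stack per worker:   "
                + stackBytesPerWorker(workers, stackBytes) / mib + " MiB");
        System.out.println("stack in cluster:   "
                + stackBytesInCluster(workers, stackBytes) / mib + " MiB");
    }
}
```

With 400 workers and 64 KiB stacks this comes to 160,000 threads, 25 MiB of stack per worker and 10,000 MiB across the job.  An event-driven design such as Netty would instead multiplex all peer connections over a small, fixed number of I/O threads, so the thread count would no longer grow with the number of workers.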
