Posted to dev@giraph.apache.org by "Hyunsik Choi (JIRA)" <ji...@apache.org> on 2011/09/11 19:06:09 UTC

[jira] [Updated] (GIRAPH-12) Investigate communication improvements

     [ https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyunsik Choi updated GIRAPH-12:
-------------------------------

    Attachment: GIRAPH-12_1.patch

As Avery mentioned, in the current architecture each worker requires N threads to communicate with N remote peers. This can incur severe context-switching overhead (especially when all messages are flushed) and higher memory consumption. At first, I considered replacing the RPC system with another one. However, that is not simple work, and I need more time for it.

Instead, I have considered an alternative: employing a ThreadPoolExecutor to limit the number of active threads. When Giraph deals with large graphs, its performance is usually bounded by network bandwidth, so I think this approach would be effective. In addition, I tried to reduce the synchronized region where BasicRPCCommunicator (lines 374-394) sends large buffered messages to specific peers.
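
To illustrate the thread-pool idea, here is a minimal sketch and not the code in the attached patch. It assumes a hypothetical flushToPeer() that stands in for sending buffered messages to one peer; the class name BoundedFlushSketch and the method flushAll() are also made up for this example. A ThreadPoolExecutor with a fixed size and an unbounded queue keeps the number of threads sending at once at or below maxThreads, no matter how many peers there are.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedFlushSketch {

    /** Hypothetical stand-in for flushing buffered messages to one peer. */
    static void flushToPeer(int peerId) throws InterruptedException {
        Thread.sleep(1); // simulate network I/O
    }

    /**
     * Flushes to numPeers peers using at most maxThreads threads.
     * Returns the peak number of flushes that were in progress at once.
     */
    static int flushAll(int numPeers, int maxThreads) {
        // Core size == max size and an unbounded queue: extra tasks wait
        // in the queue instead of spawning one thread per peer.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            maxThreads, maxThreads, 60L, TimeUnit.SECONDS,
            new LinkedBlockingQueue<Runnable>());
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Future<?>> pending = new ArrayList<>();
        try {
            for (int i = 0; i < numPeers; i++) {
                final int peerId = i;
                pending.add(pool.submit(() -> {
                    // Record how many flushes are running right now.
                    peak.accumulateAndGet(active.incrementAndGet(), Math::max);
                    try {
                        flushToPeer(peerId);
                    } finally {
                        active.decrementAndGet();
                    }
                    return null;
                }));
            }
            // Wait until every peer has been flushed.
            for (Future<?> f : pending) {
                f.get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("flush failed", e);
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // 400 peers, as in the issue description, but only 8 sender threads.
        System.out.println("peak concurrent flushes: " + flushAll(400, 8));
    }
}
```

The pool size is the tuning knob here: if the network is the bottleneck, a small pool should keep the link busy while avoiding most of the per-thread stack memory and context switching.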

I have attached the patch in progress. I will not have access to a real Hadoop cluster for one week, so I have not tested this on a real cluster. However, all unit tests pass.

What do you think of this approach? Could you review it?

> Investigate communication improvements
> --------------------------------------
>
>                 Key: GIRAPH-12
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-12
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Avery Ching
>            Assignee: Hyunsik Choi
>            Priority: Minor
>         Attachments: GIRAPH-12_1.patch
>
>
> Currently every worker will start up a thread to communicate with every other worker.  Hadoop RPC is used for communication.  For instance, if there are 400 workers, each worker will create 400 threads.  This ends up using a lot of memory, even with the option
> -Dmapred.child.java.opts="-Xss64k".
> It would be good to investigate using frameworks like Netty or rolling our own custom solution to improve this situation.  By moving away from Hadoop RPC, we would also make compatibility with different Hadoop versions easier.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira