Posted to dev@hama.apache.org by "Suraj Menon (JIRA)" <ji...@apache.org> on 2012/05/03 14:30:49 UTC

[jira] [Updated] (HAMA-567) BSPPeer should provide means for chaining supersteps to share data among them.

     [ https://issues.apache.org/jira/browse/HAMA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suraj Menon updated HAMA-567:
-----------------------------

    Attachment: Mapper.java

Hi, please check the simple Mapper I have attached. It is a work in progress and has not been tested at all. The WritableKeyValues class is WritableComparable on the key. The idea is that every mapper reads its input and exchanges its key distribution with the other peers while writing everything to a disk queue. I am working on a spilling queue with a combiner. In the first superstep, every mapper learns the global key distribution, and each peer is assigned responsibility for a partition of the keys such that the number of messages exchanged is minimized. The message exchange happens in the next superstep, so I need to pass a reference to the message queue on to the next superstep. I also want to achieve parallelism by having a thread run the combiners during the expensive sync operation. You can also see how ugly it is to get the peer ID today; we need a new API to look up the peer ID from a given task ID. All of this made me feel the need for the API changes.
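
To illustrate the partition assignment described above, here is a minimal sketch. It is not taken from the attached Mapper.java, and the class and method names (KeyOwnerAssignment, assignOwners) are hypothetical. It assumes each peer has already received every peer's per-key record counts, and it gives each key to the peer that holds the most records for it, so the fewest records have to be sent. It ignores load balancing across peers.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hama or of the attached Mapper.java.
final class KeyOwnerAssignment {

    // counts.get(p) maps key -> number of local records with that key on peer p.
    // Returns key -> index of the peer that should own that key.
    static Map<String, Integer> assignOwners(List<Map<String, Long>> counts) {
        Map<String, Integer> owner = new HashMap<>();
        Map<String, Long> best = new HashMap<>();
        for (int peer = 0; peer < counts.size(); peer++) {
            for (Map.Entry<String, Long> e : counts.get(peer).entrySet()) {
                Long seen = best.get(e.getKey());
                // Strict '>' keeps the lowest peer index on ties, so every peer
                // derives the same assignment from the same input.
                if (seen == null || e.getValue() > seen) {
                    best.put(e.getKey(), e.getValue());
                    owner.put(e.getKey(), peer);
                }
            }
        }
        return owner;
    }
}
```

Because the result depends only on the exchanged counts, every peer can compute it locally after the first sync without a further round of messages.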
                
> BSPPeer should provide means for chaining supersteps to share data among them.
> ------------------------------------------------------------------------------
>
>                 Key: HAMA-567
>                 URL: https://issues.apache.org/jira/browse/HAMA-567
>             Project: Hama
>          Issue Type: Improvement
>          Components: bsp core
>    Affects Versions: 0.6.0
>            Reporter: Suraj Menon
>             Fix For: 0.6.0
>
>         Attachments: Mapper.java
>
>
> In most scenarios, a superstep needs values or objects that were computed in the previous superstep. When using the chained Superstep design to implement BSP algorithms, this is ugly and difficult to implement. BSPPeer should provide a means (preferably a map<String,Object>) by which the next Superstep can look up values from the previous superstep, using a String token as the key. This map could also be checkpointed periodically in the background so that the state of a task can be fully recovered after a failure. The BSPPeer object should have dedicated get and set functions for reading and updating values in the peer.
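
A minimal sketch of the kind of per-peer store proposed in the description above. This is not the BSPPeer API; SharedSuperstepState and its methods are hypothetical names used only for illustration, and the snapshot method merely hints at where a background checkpoint could copy the state.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of Hama: a per-peer store that one superstep
// writes and a later superstep reads, keyed by a String token.
final class SharedSuperstepState {
    // ConcurrentHashMap lets a background thread read while supersteps write.
    // Note that it rejects null keys and null values.
    private final Map<String, Object> values = new ConcurrentHashMap<>();

    // Store a value computed in the current superstep.
    void set(String token, Object value) {
        values.put(token, value);
    }

    // Fetch a value stored earlier; returns null if the token is unknown.
    // Throws ClassCastException if the stored value is not of the given type.
    <T> T get(String token, Class<T> type) {
        Object v = values.get(token);
        return v == null ? null : type.cast(v);
    }

    // Shallow copy of the current contents, e.g. for a periodic checkpoint.
    Map<String, Object> snapshot() {
        return new HashMap<>(values);
    }
}
```

A superstep could call set("keyDistribution", distribution) before the sync, and the next superstep could call get("keyDistribution", Map.class) to retrieve it.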

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira