Posted to dev@hama.apache.org by "Thomas Jungblut (Issue Comment Edited) (JIRA)" <ji...@apache.org> on 2011/11/02 16:41:32 UTC

[jira] [Issue Comment Edited] (HAMA-258) Design a input and output system

    [ https://issues.apache.org/jira/browse/HAMA-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142213#comment-13142213 ] 

Thomas Jungblut edited comment on HAMA-258 at 11/2/11 3:40 PM:
---------------------------------------------------------------

This patch adds the missing Apache license headers to the files and moves the new I/O usage into BSPPeerImpl, which changes the API of the BSP class.

The test cases pass without errors.

In my opinion we should move all the I/O-related classes into another package. The BSP package is so bloated.

TODO: we should add custom partitioning.
BTW, what happens when the number of splits is greater than the cluster capacity? :)
Also, a task could use multiple splits.

My proposal would be to use the number of tasks the user requests and then assign the splits to the tasks equally. Partitioning should do either block partitioning or key partitioning over the key's hash code. Afterwards, the created files are assigned to the tasks as file splits.
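
To make the proposal concrete, here is a rough sketch of the three pieces: even split assignment, key partitioning over the hash code, and block partitioning. This is only an illustration under my own assumptions; the class and method names (SplitAssignmentSketch, assignSplits, hashPartition, blockPartition) are hypothetical and are not part of the patch or the Hama API, and splits are represented by plain indices rather than real file splits.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names come from the Hama code base. */
final class SplitAssignmentSketch {

  /** Assigns split indices 0..numSplits-1 to numTasks tasks round-robin,
   *  so task sizes differ by at most one and a task may hold several splits. */
  static List<List<Integer>> assignSplits(int numSplits, int numTasks) {
    if (numTasks <= 0) {
      throw new IllegalArgumentException("numTasks must be positive");
    }
    List<List<Integer>> assignment = new ArrayList<>();
    for (int t = 0; t < numTasks; t++) {
      assignment.add(new ArrayList<>());
    }
    for (int s = 0; s < numSplits; s++) {
      assignment.get(s % numTasks).add(s);
    }
    return assignment;
  }

  /** Key partitioning: maps a key to a task via its hash code.
   *  The sign bit is masked so the result is never negative. */
  static int hashPartition(Object key, int numTasks) {
    return (key.hashCode() & Integer.MAX_VALUE) % numTasks;
  }

  /** Block partitioning: contiguous ranges of record indices go to the same task. */
  static int blockPartition(long recordIndex, long numRecords, int numTasks) {
    if (numRecords <= 0 || numTasks <= 0) {
      throw new IllegalArgumentException("numRecords and numTasks must be positive");
    }
    long blockSize = (numRecords + numTasks - 1) / numTasks; // ceiling division
    return (int) (recordIndex / blockSize);
  }

  public static void main(String[] args) {
    // Five splits on two tasks: more splits than tasks, so tasks take several each.
    System.out.println(assignSplits(5, 2));
    System.out.println(hashPartition("vertex-42", 4));
    System.out.println(blockPartition(7, 10, 3));
  }
}
```

With round-robin assignment, the case where there are more splits than task slots is handled by simply giving each task several splits, which is the behaviour the proposal asks for.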
                
      was (Author: thomas.jungblut):
    This patch adds the missing Apache license headers to the files and moves the new I/O usage into BSPPeerImpl, which changes the API of the BSP class.

Test cases are working without any errors.

In the next patch I'm going to add the partitioners.


In my opinion we should move all the I/O-related classes into another package. The BSP package is so bloated.
                  
> Design a input and output system
> --------------------------------
>
>                 Key: HAMA-258
>                 URL: https://issues.apache.org/jira/browse/HAMA-258
>             Project: Hama
>          Issue Type: New Feature
>          Components: bsp
>    Affects Versions: 0.3.0
>            Reporter: Edward J. Yoon
>            Assignee: Edward J. Yoon
>             Fix For: 0.4.0
>
>         Attachments: HAMA-258_improved.patch, io_v01.patch, io_v02.patch, io_v03.patch, io_v04.patch
>
>
> This issue will handle the input and output system with data splitter.
