Posted to dev@hama.apache.org by "praveen sripati (JIRA)" <ji...@apache.org> on 2012/05/22 06:49:41 UTC

[jira] [Commented] (HAMA-561) Hama should support consuming partitioned files

    [ https://issues.apache.org/jira/browse/HAMA-561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280720#comment-13280720 ] 

praveen sripati commented on HAMA-561:
--------------------------------------

The partitioned input files can be named using the following format:

<filename>-0
<filename>-1
<filename>-2
<filename>-3
....
<filename>-n

<filename>-0 is assigned to the bsp task with id 0 for processing, <filename>-1 to the bsp task with id 1, and so on. BSPJob.set("InputFilesPartitioned", true); can be used to specify that the input files have already been partitioned. The setting defaults to false.
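
To illustrate the naming convention above, here is a minimal sketch of how a task id could be derived from a partition file name. The class and method names (PartitionFileNames, taskIdFor) are hypothetical and not part of Hama. The sketch assumes the task id is the decimal number after the last '-' in the file name.

```java
// Hypothetical helper, not part of Hama: derives the bsp task id
// from a partition file name of the form <filename>-<id>.
final class PartitionFileNames {
    private PartitionFileNames() {}

    /** Returns the task id encoded after the last '-' in the file name. */
    static int taskIdFor(String fileName) {
        int dash = fileName.lastIndexOf('-');
        if (dash < 0 || dash == fileName.length() - 1) {
            throw new IllegalArgumentException("No partition suffix in: " + fileName);
        }
        String suffix = fileName.substring(dash + 1);
        // Accept plain ASCII digits only, so "data-x1" or "data-+1" are rejected.
        for (int i = 0; i < suffix.length(); i++) {
            char c = suffix.charAt(i);
            if (c < '0' || c > '9') {
                throw new IllegalArgumentException("Non-numeric partition suffix in: " + fileName);
            }
        }
        try {
            return Integer.parseInt(suffix);
        } catch (NumberFormatException e) {
            // Reached only when the digits overflow an int.
            throw new IllegalArgumentException("Partition suffix too large in: " + fileName, e);
        }
    }
}
```

With this sketch, a file named edges-3 would go to the bsp task with id 3. A name without a numeric suffix is rejected instead of being assigned to a task by guesswork.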

Also, when the input files have been partitioned, the framework has to make sure that the partitioner class corresponding to the partitioned input files has been specified, so that each bsp task can send messages to the appropriate bsp task. If the specified partitioner class and the logic used to partition the input files don't match, then the results are unpredictable.
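
The mismatch problem can be sketched as a simple check: every key found in file i must map to task i under the specified partitioner. Everything in this sketch is an assumption made for illustration. KeyPartitioner is a hypothetical stand-in and not Hama's partitioner interface, and hash partitioning is only one possible scheme.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Hama code: checks that pre-partitioned files
// agree with the partitioner that will be used to route messages.
final class PartitionConsistency {
    private PartitionConsistency() {}

    /** Stand-in for a partitioner: maps a key to a task index in [0, numTasks). */
    interface KeyPartitioner {
        int partitionFor(String key, int numTasks);
    }

    /** Hash partitioning, used here only as an example scheme. */
    static final KeyPartitioner HASH =
        (key, numTasks) -> (key.hashCode() & Integer.MAX_VALUE) % numTasks;

    /** Splits keys into numTasks lists, one per task, using the given partitioner. */
    static List<List<String>> partition(List<String> keys, int numTasks, KeyPartitioner p) {
        List<List<String>> files = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            files.add(new ArrayList<>());
        }
        for (String key : keys) {
            files.get(p.partitionFor(key, numTasks)).add(key);
        }
        return files;
    }

    /** Returns true only if every key in file i maps to task i under p. */
    static boolean filesMatch(List<List<String>> keysPerFile, KeyPartitioner p) {
        int numTasks = keysPerFile.size();
        for (int i = 0; i < numTasks; i++) {
            for (String key : keysPerFile.get(i)) {
                if (p.partitionFor(key, numTasks) != i) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

If filesMatch returns false, a message addressed through the partitioner could reach a task that does not hold the corresponding key, which is the unpredictable case described above.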

Also, if the InputFilesPartitioned parameter is not specified (so it defaults to false) and a partitioner class is specified, then Hama does the partitioning.
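
The combinations described above could be decided roughly as in the sketch below. The class, enum, and method names are hypothetical. The case where neither the flag nor a partitioner is set is not covered in this comment, so the READ_AS_IS branch is an assumption.

```java
// Hypothetical sketch, not Hama code: decides how the input is handled
// based on the proposed InputFilesPartitioned flag and the partitioner setting.
final class InputPartitioningPlan {
    private InputPartitioningPlan() {}

    enum Plan {
        USE_FILES_AS_PARTITIONED, // flag is true: file i goes to bsp task i
        PARTITION_AT_SUBMIT,      // flag is false, partitioner set: Hama partitions
        READ_AS_IS                // assumption: neither set, no partitioning
    }

    static Plan decide(boolean inputFilesPartitioned, String partitionerClassName) {
        boolean hasPartitioner =
            partitionerClassName != null && !partitionerClassName.isEmpty();
        if (inputFilesPartitioned) {
            // The partitioner is still needed so messages reach the right task.
            if (!hasPartitioner) {
                throw new IllegalStateException(
                    "InputFilesPartitioned is true but no partitioner class is specified");
            }
            return Plan.USE_FILES_AS_PARTITIONED;
        }
        return hasPartitioner ? Plan.PARTITION_AT_SUBMIT : Plan.READ_AS_IS;
    }
}
```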
                
> Hama should support consuming partitioned files
> -----------------------------------------------
>
>                 Key: HAMA-561
>                 URL: https://issues.apache.org/jira/browse/HAMA-561
>             Project: Hama
>          Issue Type: New Feature
>          Components: bsp core
>    Affects Versions: 0.4.0
>            Reporter: praveen sripati
>            Priority: Minor
>
> Currently the input partitioning is done when the job is submitted and a partitioner has been specified. There might be a scenario where the input data has already been partitioned, or where there is a better way of partitioning the input data. So Hama should be made aware that the files are already partitioned, and messages should only be sent to the appropriate bsp task.
