Posted to dev@hama.apache.org by "Edward J. Yoon (Created) (JIRA)" <ji...@apache.org> on 2011/11/16 15:07:51 UTC

[jira] [Created] (HAMA-476) Splitter doesn't work correctly

Splitter doesn't work correctly
-------------------------------

                 Key: HAMA-476
                 URL: https://issues.apache.org/jira/browse/HAMA-476
             Project: Hama
          Issue Type: Bug
          Components: bsp
    Affects Versions: 0.3.0
            Reporter: Edward J. Yoon
            Assignee: Edward J. Yoon
             Fix For: 0.4.0


- To split a SequenceFile into splits of a user-requested size, there's no way to avoid reading and rewriting records. I think we have to use just blockSize.

- Unlike MapReduce, we are unable to queue tasks when they exceed cluster capacity (I have no idea at the moment).
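
To make the first point concrete, here is a minimal sketch of splitting purely by block size, so that no records need to be read or rewritten. The class and method names (BlockSplits, splitByBlockSize) are hypothetical and are not taken from Hama or from the attached patches; the sketch only computes byte ranges and assumes the record reader is responsible for aligning to record boundaries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hama: computes block-aligned byte ranges.
final class BlockSplits {

  /** A byte range [offset, offset + length) of the input file. */
  static final class Split {
    final long offset;
    final long length;

    Split(long offset, long length) {
      this.offset = offset;
      this.length = length;
    }
  }

  /**
   * Cuts a file of fileLength bytes into consecutive ranges of at most
   * blockSize bytes. Only the last range may be shorter than blockSize.
   */
  static List<Split> splitByBlockSize(long fileLength, long blockSize) {
    if (blockSize <= 0) {
      throw new IllegalArgumentException("blockSize must be positive");
    }
    List<Split> splits = new ArrayList<Split>();
    for (long offset = 0; offset < fileLength; offset += blockSize) {
      long length = Math.min(blockSize, fileLength - offset);
      splits.add(new Split(offset, length));
    }
    return splits;
  }
}
```

With this approach the number of splits is determined by the file length and the block size, not by the user, which is exactly the trade-off described above.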

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HAMA-476) Splitter doesn't work correctly

Posted by "Edward J. Yoon (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HAMA-476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward J. Yoon updated HAMA-476:
--------------------------------

    Attachment: patch.txt

Here's more optimized code.
                

[jira] [Commented] (HAMA-476) Splitter doesn't work correctly

Posted by "Thomas Jungblut (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169136#comment-13169136 ] 

Thomas Jungblut commented on HAMA-476:
--------------------------------------

Sure, but I don't see a solution to this without the append release of HDFS.
Or you can schedule a MapReduce job to partition them.

bq. If a number of jobs are submitted concurrently,

In my opinion this log is not needed; let's move it to the BSPMaster server side, where it isn't buggy like this and is stored correctly.
                

[jira] [Commented] (HAMA-476) Splitter doesn't work correctly

Posted by "Edward J. Yoon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169143#comment-13169143 ] 

Edward J. Yoon commented on HAMA-476:
-------------------------------------

Why not redistribute the data among peers in the setup() step?
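
As a rough illustration of what redistribution among peers could look like, here is a minimal sketch that assigns each record key to a peer index by hashing. The names (PeerRedistribution, peerFor, bucketByPeer) are hypothetical and not part of Hama; the actual message exchange between peers and any data-locality handling are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Hama API: decides which peer should own each key.
final class PeerRedistribution {

  /** Maps a key to a peer index in [0, numPeers). */
  static int peerFor(Object key, int numPeers) {
    if (numPeers <= 0) {
      throw new IllegalArgumentException("numPeers must be positive");
    }
    // Mask the sign bit so that negative hash codes still give a valid index.
    return (key.hashCode() & Integer.MAX_VALUE) % numPeers;
  }

  /** Groups keys into one bucket per peer; each bucket would be sent to its peer. */
  static <K> List<List<K>> bucketByPeer(List<K> keys, int numPeers) {
    List<List<K>> buckets = new ArrayList<List<K>>();
    for (int i = 0; i < numPeers; i++) {
      buckets.add(new ArrayList<K>());
    }
    for (K key : keys) {
      buckets.get(peerFor(key, numPeers)).add(key);
    }
    return buckets;
  }
}
```

Every peer computes the same index for the same key, so no coordination is needed to agree on ownership, but records generally end up on a different node than the one holding their block.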
                

[jira] [Resolved] (HAMA-476) Splitter doesn't work correctly

Posted by "Edward J. Yoon (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HAMA-476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward J. Yoon resolved HAMA-476.
---------------------------------

    Resolution: Fixed

Seems to work well. I just closed this.
                

[jira] [Issue Comment Edited] (HAMA-476) Splitter doesn't work correctly

Posted by "ChiaHung Lin (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151228#comment-13151228 ] 

ChiaHung Lin edited comment on HAMA-476 at 11/16/11 2:29 PM:
-------------------------------------------------------------

From what I have discovered so far, the first one can ideally be achieved by applying a tiling strategy. Then we can provide wrapper classes for users to access records according to the requested range.
                
      was (Author: chl501):
    From what I discovered so far, this ideally can be achieved by applying tiling strategy. Then we can provide wrapper classes for user to access according to range requested. 
                  

[jira] [Commented] (HAMA-476) Splitter doesn't work correctly

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180920#comment-13180920 ] 

Hudson commented on HAMA-476:
-----------------------------

Integrated in Hama-Nightly #416 (See [https://builds.apache.org/job/Hama-Nightly/416/])
    HAMA-476 Splitter doesn't work correctly

edwardyoon : 
Files : 
* /incubator/hama/trunk/core/src/main/java/org/apache/hama/bsp/BSPJobClient.java
* /incubator/hama/trunk/examples/src/main/java/org/apache/hama/examples/ShortestPaths.java

                

[jira] [Commented] (HAMA-476) Splitter doesn't work correctly

Posted by "Edward J. Yoon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168127#comment-13168127 ] 

Edward J. Yoon commented on HAMA-476:
-------------------------------------

NOTE: 

If a number of jobs are submitted concurrently, 

{code}
11/12/13 11:49:09 INFO bsp.FileInputFormat: Total input paths to process : 1
11/12/13 11:49:09 INFO bsp.FileInputFormat: Total # of splits: 42
11/12/13 11:52:02 INFO bsp.FileInputFormat: Total input paths to process : 42
11/12/13 11:52:02 INFO bsp.FileInputFormat: Total # of splits: 42
11/12/13 11:52:04 INFO bsp.BSPJobClient: Running job: job_201112131021_0003
11/12/13 11:52:07 INFO bsp.BSPJobClient: Launched tasks: 61/42
11/12/13 11:52:10 INFO bsp.BSPJobClient: Launched tasks: 84/42
11/12/13 12:03:55 INFO bsp.BSPJobClient: Launched tasks: 67/42
11/12/13 12:03:58 INFO bsp.BSPJobClient: Launched tasks: 42/42
{code}
                

[jira] [Commented] (HAMA-476) Splitter doesn't work correctly

Posted by "ChiaHung Lin (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151850#comment-13151850 ] 

ChiaHung Lin commented on HAMA-476:
-----------------------------------

Looks like I misunderstood the original question. The concern is that users may request arbitrary forms of split blocks (not just contiguous ones). So basically we can provide a layer that lets users compose the blocks they want (including contiguous ones), on top of which wrapper classes, e.g. for reading/writing records, can serve contiguous and other read/write record requests from users.
                

[jira] [Commented] (HAMA-476) Splitter doesn't work correctly

Posted by "Thomas Jungblut (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151305#comment-13151305 ] 

Thomas Jungblut commented on HAMA-476:
--------------------------------------

bq. To split sequencefile as user requested size, there's no way to avoid read/write records. I think we have to use just blockSize.

Correct, we have to split via the blocks.

bq. Unlike MapReduce, we are unable to queuing tasks when exceeds cluster capacity (I have no idea at the moment).

There is nothing to work out here; we have to disallow requesting more tasks than the cluster capacity. In YARN this issue is even worse, because you don't know the capacity.
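
A minimal sketch of such a restriction, assuming the client knows the cluster's maximum task capacity at submission time. The names (CapacityCheck, checkCapacity) are hypothetical and this is not the check used in Hama itself.

```java
// Hypothetical sketch, not Hama API: rejects jobs that cannot fit in the cluster.
final class CapacityCheck {

  /** Throws if the job asks for more tasks than the cluster can run at once. */
  static void checkCapacity(int requestedTasks, int maxTasks) {
    if (requestedTasks > maxTasks) {
      throw new IllegalArgumentException(
          "Requested " + requestedTasks + " tasks, but cluster capacity is " + maxTasks);
    }
  }
}
```

Since all tasks of a BSP job have to run at the same time to reach the barrier, failing fast at submission is one simple way to avoid the missing task queue.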


bq. From what I discovered so far, the first one ideally can be achieved by applying tiling strategy. Then we can provide wrapper classes for user to access according to range requested.

How is this tiling gonna work without rewriting sequence files?
                

[jira] [Commented] (HAMA-476) Splitter doesn't work correctly

Posted by "Thomas Jungblut (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169170#comment-13169170 ] 

Thomas Jungblut commented on HAMA-476:
--------------------------------------

How do you deal with data-locality? How should this work?

And remember, setup is for the user.
                

[jira] [Updated] (HAMA-476) Splitter doesn't work correctly

Posted by "Edward J. Yoon (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HAMA-476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward J. Yoon updated HAMA-476:
--------------------------------

    Attachment: patch_01.txt

This patch adds simple logic to determine a proper number of tasks within the max task capacity.

{code}
root@Cnode1:/usr/local/src/hama-trunk# core/bin/hama jar examples/target/hama-examples-0.4.0-incubating-SNAPSHOT.jar sssp 3 result /user/root/sssp/sssp-small.seq 4
12/01/04 16:02:54 INFO bsp.FileInputFormat: Total input paths to process : 1
12/01/04 16:02:54 INFO bsp.FileInputFormat: Total # of splits: 2
12/01/04 16:03:03 INFO bsp.FileInputFormat: Total input paths to process : 4
12/01/04 16:03:03 INFO bsp.FileInputFormat: Total # of splits: 4
12/01/04 16:03:04 INFO bsp.BSPJobClient: Running job: job_201201041546_0005
12/01/04 16:03:07 INFO bsp.BSPJobClient: Launched tasks: 3/4
12/01/04 16:03:10 INFO bsp.BSPJobClient: Launched tasks: 4/4
12/01/04 16:03:19 INFO bsp.BSPJobClient: Current supersteps number: 23
12/01/04 16:03:22 INFO bsp.BSPJobClient: Current supersteps number: 44
12/01/04 16:03:25 INFO bsp.BSPJobClient: Current supersteps number: 84
12/01/04 16:03:28 INFO bsp.BSPJobClient: Current supersteps number: 104
12/01/04 16:03:31 INFO bsp.BSPJobClient: Current supersteps number: 125
12/01/04 16:03:34 INFO bsp.BSPJobClient: Current supersteps number: 147
{code}
                

[jira] [Commented] (HAMA-476) Splitter doesn't work correctly

Posted by "Thomas Jungblut (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179347#comment-13179347 ] 

Thomas Jungblut commented on HAMA-476:
--------------------------------------

Cool feature, but I guess we have to repartition the dataset to the max task count and add log warnings.
                

[jira] [Commented] (HAMA-476) Splitter doesn't work correctly

Posted by "Edward J. Yoon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180149#comment-13180149 ] 

Edward J. Yoon commented on HAMA-476:
-------------------------------------

Tests passed on my cluster. I'm committing this now.

Let's consider more efficient re-partitioning in the next step.
                

[jira] [Commented] (HAMA-476) Splitter doesn't work correctly

Posted by "Edward J. Yoon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168961#comment-13168961 ] 

Edward J. Yoon commented on HAMA-476:
-------------------------------------

NOTE:

Currently the partition() method results in too many open files.
                

[jira] [Commented] (HAMA-476) Splitter doesn't work correctly

Posted by "Edward J. Yoon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180148#comment-13180148 ] 

Edward J. Yoon commented on HAMA-476:
-------------------------------------

If the user does not set the task size, the max size will be used.
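
A minimal sketch of this defaulting rule. The names (TaskCount, UNSET, resolve) are hypothetical, and the clamping of an oversized request to the maximum is an assumption of the sketch; this thread does not state whether the patch clamps or rejects such requests.

```java
// Hypothetical sketch, not Hama API: picks the number of tasks for a job.
final class TaskCount {

  /** Marker meaning the user did not configure a task count. */
  static final int UNSET = -1;

  /**
   * Returns maxTasks when no task count was requested, otherwise the
   * requested count, clamped to maxTasks (the clamping is an assumption).
   */
  static int resolve(int requestedTasks, int maxTasks) {
    if (maxTasks <= 0) {
      throw new IllegalArgumentException("maxTasks must be positive");
    }
    if (requestedTasks <= 0) {
      return maxTasks; // not set by the user: fall back to the max size
    }
    return Math.min(requestedTasks, maxTasks);
  }
}
```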
                

[jira] [Commented] (HAMA-476) Splitter doesn't work correctly

Posted by "ChiaHung Lin (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151228#comment-13151228 ] 

ChiaHung Lin commented on HAMA-476:
-----------------------------------

From what I have discovered so far, this can ideally be achieved by applying a tiling strategy. Then we can provide wrapper classes for users to access records according to the requested range.
                