Posted to dev@hama.apache.org by "Edward J. Yoon (JIRA)" <ji...@apache.org> on 2015/04/14 14:50:12 UTC

[jira] [Created] (HAMA-949) File splits based on number of input files

Edward J. Yoon created HAMA-949:
-----------------------------------

             Summary: File splits based on number of input files
                 Key: HAMA-949
                 URL: https://issues.apache.org/jira/browse/HAMA-949
             Project: Hama
          Issue Type: Improvement
    Affects Versions: 0.6.4
            Reporter: Edward J. Yoon
            Assignee: Edward J. Yoon
             Fix For: 0.7.0


I've created multiple input files to match the maximum task capacity of the cluster, but the job was not able to run, because file splits are currently determined by the number of blocks.

I don't know why the code below was removed. What if we add it again?

{code}
    // take the short circuit path if we have already partitioned
    if (numSplits == files.length) {
      for (FileStatus file : files) {
        if (file != null) {
          splits.add(new FileSplit(file.getPath(), 0, file.getLen(),
              new String[0]));
        }
      }
      return splits.toArray(new FileSplit[splits.size()]);
    }
{code}
https://www.mail-archive.com/commits@hama.apache.org/msg00319.html
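
To illustrate the difference in behavior, here is a minimal, self-contained sketch that does not depend on Hadoop or Hama. The classes InputFile and Split and the method getSplits are hypothetical stand-ins for FileStatus, FileSplit, and the input format's split logic. The block-based fallback is only an assumed simplification of the current behavior, not the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

class SplitSketch {
    /** Hypothetical stand-in for FileStatus: a path and a length in bytes. */
    static final class InputFile {
        final String path;
        final long length;
        InputFile(String path, long length) {
            this.path = path;
            this.length = length;
        }
    }

    /** Hypothetical stand-in for FileSplit: a byte range within one file. */
    static final class Split {
        final String path;
        final long start;
        final long length;
        Split(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }
    }

    static List<Split> getSplits(List<InputFile> files, int numSplits, long blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        List<Split> splits = new ArrayList<>();

        // Short-circuit path: the input is already partitioned into exactly
        // numSplits files, so emit one whole-file split per non-null file.
        if (numSplits == files.size()) {
            for (InputFile file : files) {
                if (file != null) {
                    splits.add(new Split(file.path, 0, file.length));
                }
            }
            return splits;
        }

        // Assumed fallback: cut each file into block-sized ranges, so the
        // number of splits follows the number of blocks, not the number of files.
        for (InputFile file : files) {
            if (file == null) {
                continue;
            }
            for (long offset = 0; offset < file.length; offset += blockSize) {
                long len = Math.min(blockSize, file.length - offset);
                splits.add(new Split(file.path, offset, len));
            }
        }
        return splits;
    }
}
```

With the short-circuit in place, a job that asks for as many tasks as there are input files gets exactly one split per file, regardless of how many blocks each file spans.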



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)