Posted to dev@hama.apache.org by "Ikhtiyor Ahmedov (JIRA)" <ji...@apache.org> on 2013/07/25 07:31:52 UTC

[jira] [Comment Edited] (HAMA-781) Setting partition split fails in local mode when file size is big and has a runtime partition (HashParitioner)

    [ https://issues.apache.org/jira/browse/HAMA-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13719247#comment-13719247 ] 

Ikhtiyor Ahmedov edited comment on HAMA-781 at 7/25/13 5:31 AM:
----------------------------------------------------------------

The same code also fails when multiple input paths are given.
Code (line 560):
{code:title=BSPJobClient.java}
// set partitionID to rawSplit
if (split.getClass().getName().equals(FileSplit.class.getName())
    && job.getConfiguration().get(Constants.RUNTIME_PARTITIONING_CLASS) != null
    && job.get("bsp.partitioning.runner.job") == null) {
  LOG.debug(((FileSplit) split).getPath().getName());
  // Splits the file name on '-' and expects the partition ID in the second piece.
  String[] extractPartitionID = ((FileSplit) split).getPath().getName()
      .split("[-]");
  // Throws ArrayIndexOutOfBoundsException when the file name contains no '-'.
  rawSplit.setPartitionID(Integer.parseInt(extractPartitionID[1]));
}
{code}
Exception:
{code}java.lang.ArrayIndexOutOfBoundsException: 1
	at org.apache.hama.bsp.BSPJobClient.writeSplits(BSPJobClient.java:566)
	at org.apache.hama.bsp.BSPJobClient.submitJobInternal(BSPJobClient.java:342)
	at org.apache.hama.bsp.BSPJobClient.submitJob(BSPJobClient.java:293)
	at org.apache.hama.bsp.BSPJob.submit(BSPJob.java:229)
	at org.apache.hama.bsp.BSPJob.waitForCompletion(BSPJob.java:236)
	at org.apache.hama.examples.OnlineCF.main(OnlineCF.java:427)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
	at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
	at org.apache.hama.examples.ExampleDriver.main(ExampleDriver.java:44)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hama.util.RunJar.main(RunJar.java:146)
{code}
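For illustration, the index error in the trace can be reproduced outside Hama with plain string splitting. This is a minimal sketch: the class name SplitNameDemo and the helper pieceCount are hypothetical, and only the split("[-]") call is taken from the snippet quoted above.

```java
public class SplitNameDemo {
  // Mirrors the name parsing in the quoted snippet: split the file name on '-'
  // and report how many pieces come back. (Hypothetical helper, not Hama code.)
  static int pieceCount(String fileName) {
    return fileName.split("[-]").length;
  }

  public static void main(String[] args) {
    // A part-file style name yields two pieces, so index 1 exists.
    System.out.println("part-00001 -> " + pieceCount("part-00001"));
    // A name without '-' yields a single piece, so reading index 1 would throw
    // ArrayIndexOutOfBoundsException.
    System.out.println("test.seq -> " + pieceCount("test.seq"));
  }
}
```

A file name such as test.seq therefore has no element at index 1, which matches the "ArrayIndexOutOfBoundsException: 1" message.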
Example failure cases:
1) When the input file names do not follow the part-file naming pattern (part-0000, part-0001) and the input is large (usually read from the local filesystem)
2) When multiple input files are given in local mode:
{code}SequenceFileInputFormat.addInputPaths(job, "/tmp/test.seq,/tmp/test2.seq,/tmp/test3.seq");{code}
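One possible way to guard against both cases is to check the file name before reading the second piece. The sketch below is only an assumption about what such a guard could look like, not the content of the attached HAMA-781.patch. The class PartitionIdParser and the method parsePartitionId are hypothetical names, and OptionalInt requires Java 8 or later.

```java
import java.util.OptionalInt;

public class PartitionIdParser {
  // Hypothetical helper: returns the partition ID encoded in names such as
  // "part-00001", or an empty result when the name has no usable ID.
  static OptionalInt parsePartitionId(String fileName) {
    String[] parts = fileName.split("[-]");
    if (parts.length < 2) {
      // No '-' in the name (e.g. "test.seq"): nothing to parse.
      return OptionalInt.empty();
    }
    try {
      return OptionalInt.of(Integer.parseInt(parts[1]));
    } catch (NumberFormatException e) {
      // The second piece is not a number (e.g. "part-abc").
      return OptionalInt.empty();
    }
  }

  public static void main(String[] args) {
    System.out.println(parsePartitionId("part-00003"));
    System.out.println(parsePartitionId("test.seq"));
  }
}
```

A caller would set the partition ID only when a value is present and otherwise leave the split untouched. Whether skipping the assignment is the right behaviour for BSPJobClient is an open question that this sketch does not settle.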
                
> Setting partition split fails in local mode when file size is big and has a runtime partition (HashParitioner)
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HAMA-781
>                 URL: https://issues.apache.org/jira/browse/HAMA-781
>             Project: Hama
>          Issue Type: Bug
>          Components: bsp core
>            Reporter: Ikhtiyor Ahmedov
>            Priority: Minor
>         Attachments: HAMA-781.patch
>
>
> When the input partitioner is set to HashPartitioner and the input file is large in local mode, line 566 of BSPJobClient.java throws an ArrayIndexOutOfBoundsException.
