Posted to common-dev@hadoop.apache.org by "Hairong Kuang (JIRA)" <ji...@apache.org> on 2008/04/03 23:11:24 UTC

[jira] Issue Comment Edited: (HADOOP-3162) Map/reduce stops working with comma separated input paths

    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585287#action_12585287 ] 

hairong edited comment on HADOOP-3162 at 4/3/08 2:11 PM:
---------------------------------------------------------------

The problem was caused by the patch to HADOOP-3064, which allows a path name to contain a comma: the setInputPath and addInputPath methods of JobConf now escape every comma in the path name. Before that change, if a user's map/reduce job had the following code:
{code}
jobConf.setInputPath(new Path("aa,bb"));
{code}
jobConf.getInputPaths returned an array of two paths: "aa" and "bb". 
After this patch, jobConf.getInputPaths returns an array of one path: "aa,bb".
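
As an illustration only, the difference can be modelled with a small self-contained class. This is a sketch, not Hadoop code: the class InputPathsSketch, its escapeCommas flag, and the backslash escape character are all assumptions made for the example. It only mimics storing the paths in one comma-separated string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the behaviour described above; NOT Hadoop's JobConf.
final class InputPathsSketch {
    // All paths live in one comma-separated string, like a configuration value.
    private String inputDirs = "";
    // false models the behaviour before HADOOP-3064, true the behaviour after it.
    private final boolean escapeCommas;

    InputPathsSketch(boolean escapeCommas) {
        this.escapeCommas = escapeCommas;
    }

    void setInputPath(String path) {
        inputDirs = encode(path);
    }

    void addInputPath(String path) {
        inputDirs = inputDirs.isEmpty() ? encode(path) : inputDirs + "," + encode(path);
    }

    // Assumed escaping scheme: a backslash in front of each comma (and backslash).
    private String encode(String path) {
        if (!escapeCommas) {
            return path;
        }
        return path.replace("\\", "\\\\").replace(",", "\\,");
    }

    List<String> getInputPaths() {
        List<String> paths = new ArrayList<>();
        if (inputDirs.isEmpty()) {
            return paths;
        }
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < inputDirs.length(); i++) {
            char c = inputDirs.charAt(i);
            if (escapeCommas && c == '\\' && i + 1 < inputDirs.length()) {
                current.append(inputDirs.charAt(++i)); // escaped character, keep it literally
            } else if (c == ',') {
                paths.add(current.toString());         // unescaped comma separates paths
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        paths.add(current.toString());
        return paths;
    }

    public static void main(String[] args) {
        InputPathsSketch before = new InputPathsSketch(false);
        before.setInputPath("aa,bb");
        System.out.println(before.getInputPaths().size()); // prints 2

        InputPathsSketch after = new InputPathsSketch(true);
        after.setInputPath("aa,bb");
        System.out.println(after.getInputPaths().size());  // prints 1
    }
}
```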

I guess the failed job might have used jobConf.setInputPath(new Path("path1,path2")) to set two input paths. It should use the following statements instead:
{code}
jobConf.setInputPath(new Path("path1"));
jobConf.addInputPath(new Path("path2"));
{code}
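
If a job receives its inputs as a single comma-separated string (for example from the command line), one option is to split that string in the job itself before registering the paths. The helper below is a minimal sketch under that assumption; InputListSplitter and splitPaths are hypothetical names, not part of Hadoop, and the approach cannot be used for a path that itself contains a comma.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop.
final class InputListSplitter {
    // Splits "path1,path2" into ["path1", "path2"], dropping empty pieces.
    static List<String> splitPaths(String commaSeparated) {
        List<String> paths = new ArrayList<>();
        for (String piece : commaSeparated.split(",")) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                paths.add(trimmed);
            }
        }
        return paths;
    }
}
```

The first element of the returned list would then go to setInputPath and each remaining element to addInputPath, as in the statements above.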

> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.17.0
>
>
> When a job is given a comma-separated list of input files, the FileInputFormat class throws an exception complaining that the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/MonsterQueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputFormat.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGenerator.java:189)
