Posted to common-dev@hadoop.apache.org by "Hairong Kuang (JIRA)" <ji...@apache.org> on 2008/04/03 23:11:24 UTC
[jira] Issue Comment Edited: (HADOOP-3162) Map/reduce stops working
with comma separated input paths
[ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585287#action_12585287 ]
hairong edited comment on HADOOP-3162 at 4/3/08 2:11 PM:
---------------------------------------------------------------
The problem was caused by the patch to HADOOP-3064, which allows a path name to contain a comma. The setInputPath and addInputPath methods of JobConf now escape all commas in the path name. So before this change, if a user's map/reduce job had the following code:
{code}
jobConf.setInputPath(new Path("aa,bb"));
{code}
jobConf.getInputPaths returned an array of two paths: "aa" and "bb".
After this patch, jobConf.getInputPaths returns an array of one path: "aa,bb".
I guess the failed job might have used jobConf.setInputPath(new Path("path1,path2")) to set two input paths. It should use the following statements instead:
{code}
jobConf.setInputPath(new Path("path1"));
jobConf.addInputPath(new Path("path2"));
{code}
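The change in behavior described above can be illustrated with a small stand-alone sketch. It does not use Hadoop; the class name CommaPathSketch, its methods, and the backslash escape character are assumptions made for illustration only, and the actual escaping code behind JobConf may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone illustration only; names and escape character are hypothetical,
// not Hadoop's actual implementation.
public class CommaPathSketch {
    private static final char ESCAPE = '\\';   // assumed escape character
    private static final char SEPARATOR = ',';

    /** Pre-patch style: the stored string is split on every comma. */
    public static List<String> splitNaive(String stored) {
        List<String> out = new ArrayList<>();
        for (String s : stored.split(",")) {
            out.add(s);
        }
        return out;
    }

    /** Post-patch style: escape commas (and the escape character) before storing a path. */
    public static String escape(String path) {
        StringBuilder sb = new StringBuilder();
        for (char c : path.toCharArray()) {
            if (c == ESCAPE || c == SEPARATOR) {
                sb.append(ESCAPE);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Split only on unescaped commas, removing the escape characters. */
    public static List<String> splitEscaped(String stored) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean escaped = false;
        for (char c : stored.toCharArray()) {
            if (escaped) {
                cur.append(c);          // take the escaped character literally
                escaped = false;
            } else if (c == ESCAPE) {
                escaped = true;
            } else if (c == SEPARATOR) {
                out.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        out.add(cur.toString());
        return out;
    }

    public static void main(String[] args) {
        // Old behavior: a single path "aa,bb" is read back as two paths.
        System.out.println(splitNaive("aa,bb").size());            // 2
        // New behavior: the comma is escaped, so one path comes back.
        System.out.println(splitEscaped(escape("aa,bb")).size());  // 1
        // Two separately added paths are still read back as two paths.
        String stored = escape("path1") + SEPARATOR + escape("path2");
        System.out.println(splitEscaped(stored).size());           // 2
    }
}
```

Under these assumptions, a job that wants two input paths has to add them one at a time, because a comma inside a single Path is now treated as part of the name.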
> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
> Key: HADOOP-3162
> URL: https://issues.apache.org/jira/browse/HADOOP-3162
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.17.0
> Reporter: Runping Qi
> Assignee: Amar Kamat
> Priority: Blocker
> Fix For: 0.17.0
>
>
> When a job is given a comma-separated input file list, the FileInputFormat class throws an exception complaining that the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/MonsterQueryBlockCompressed/part-00002
> at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputFormat.java:213)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
> at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGenerator.java:189)