Posted to common-dev@hadoop.apache.org by "Runping Qi (JIRA)" <ji...@apache.org> on 2008/04/03 19:18:24 UTC

[jira] Created: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Map/reduce stops working with comma separated input paths
---------------------------------------------------------

                 Key: HADOOP-3162
                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
             Project: Hadoop Core
          Issue Type: Bug
            Reporter: Runping Qi
            Priority: Blocker



When a job is given a comma-separated list of input files, the FileInputFormat class throws an exception complaining that the input is invalid:

org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs://gs205234.inktomisearch.com:55638/gridmix/data/MonsterQueryBlockCompressed/part-00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/MonsterQueryBlockCompressed/part-00002
        at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputFormat.java:213)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
        at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGenerator.java:189)


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12586318#action_12586318 ] 

amareshwari edited comment on HADOOP-3162 at 4/7/08 10:14 PM:
--------------------------------------------------------------------------

This patch addresses the following:
1. Adds the following APIs to FileInputFormat:
    * public static void setInputPaths(JobConf job, Path... paths);
    * public static void setInputPaths(JobConf job, String commaSeparatedPaths);
    * public static void addInputPath(JobConf job, Path path);
    * public static void addInputPaths(JobConf job, String commaSeparatedPaths);
2. Deprecates JobConf.setInputPath(Path) and JobConf.addInputPath(Path) and gives them back their 0.16 semantics.
3. Moves testInputPath() from conf/TestJobConf to mapred/TestInputPath and adds tests for the new APIs.


      was (Author: amareshwari):
    This patch addresses the following:
1. Adds the following apis in FileInputFormat:
    * public static void setInputPaths(JobConf job, Path... paths);
    * public static void setInputPaths(JobConf job, String commaSepatedPaths);
    * public static void addInputPath(JobConf job, Path path);
    * public static void addInputPaths(JobConf job, String commaSepatedPaths);
2. Deprecates JobConf.setInputPath(Path) and JobConf.addInputPath(Path)
3. Moves testInputPath() from conf/TestJobConf to mapred/TestInputPath. Added tests for testing the new apis.

  
> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3162.txt, patch-3162.txt
>
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
> namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-
> 00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
> QueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
> t.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
> rator.java:189)



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585371#action_12585371 ] 

Owen O'Malley commented on HADOOP-3162:
---------------------------------------

I propose that we make this mirror the change in HADOOP-3041 and add to FileInputFormat:

{code}
  public static void addInputPath(JobConf job, Path path);
  public static void setInputPaths(JobConf job, Path[] paths);
  public static Path[] getInputPaths(JobConf job);
{code}

while in JobConf we deprecate the input path methods and restore them to their 0.16 semantics (pre-HADOOP-3064, with no quoting and splitting on ','):

{code}
@Deprecated
public void addInputPath(Path p);
@Deprecated
public void setInputPath(Path p);
@Deprecated
public Path[] getInputPaths();
{code}

> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.17.0
>
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
> namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-
> 00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
> QueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
> t.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
> rator.java:189)



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588891#action_12588891 ] 

Runping Qi commented on HADOOP-3162:
------------------------------------


I assume the two new methods you refer to are meant to be:
{code}
public static void setInputPaths(JobConf job, String commaSeparatedFilePaths);
public static void addInputPaths(JobConf job, String commaSeparatedFilePaths);
{code}
They don't break backward compatibility.
The patch implemented them incorrectly.
The correct implementation should look like:
{code}
public static void addInputPaths(JobConf job, String commaSeparatedFilePaths) {
    // treat the commas in commaSeparatedFilePaths that are not enclosed by '{' and '}' as separators
    // split commaSeparatedFilePaths into a string array using those separators
    // let Path[] paths be the array of paths created from those strings
    setInputPaths(job, paths);
}
{code}


When you replace code that uses the existing API with code that uses the new API, like this:
{code}
-      grepJob.setInputPath(new Path(args[0]));
+      FileInputFormat.setInputPaths(grepJob, new Path(args[0]));
{code}
That is incorrect. The correct one should be:
{code}
-      grepJob.setInputPath(new Path(args[0]));
+      FileInputFormat.setInputPaths(grepJob, args[0]);
{code}
 



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588739#action_12588739 ] 

Runping Qi commented on HADOOP-3162:
------------------------------------


The meaning of the comma in user-facing interfaces such as streaming should not, and need not, change.
path1,path2 should continue to mean two paths separated by a comma.
That should not be changed. The comma should not be interpreted as part of a path.
It is either a separator between paths or a separator within a glob.
If we generalize the glob to accept {a/b/,/d/e/f/,/g}, then we get a unified semantics for the comma.
Until then, the above two API methods are needed.




[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Status: Open  (was: Patch Available)



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585508#action_12585508 ] 

Runping Qi commented on HADOOP-3162:
------------------------------------


With the new API, I assume that add/setInputPath accept a Path with a glob pattern containing multiple paths separated by commas, like:
{code}
new Path("{p1/f1,p2/f2}")
{code}
However, I'd prefer a new API method that still honors the comma as the path separator:
{code}
public static void setInputPaths(JobConf job, String commaSeparatedPaths);
{code}



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588892#action_12588892 ] 

Amareshwari Sriramadasu commented on HADOOP-3162:
-------------------------------------------------

Implementation is as follows:

{code}
public static void setInputPaths(JobConf job, String commaSeparatedFilePaths) {
    // treat the commas in commaSeparatedFilePaths that are not enclosed by '{' and '}' as separators
    // split commaSeparatedFilePaths into a string array using those separators
    // let Path[] paths be the array of paths created from those strings
    setInputPaths(job, paths);
}
public static void addInputPaths(JobConf job, String commaSeparatedFilePaths) {
    // treat the commas in commaSeparatedFilePaths that are not enclosed by '{' and '}' as separators
    // split commaSeparatedFilePaths into a string array using those separators
    // let Path[] paths be the array of paths created from those strings
    for (Path path : paths)
      addInputPath(job, path);
}
{code}

Here, setInputPaths replaces the existing paths, whereas addInputPaths appends the values to the existing input path list.
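
The splitting step described in the comments above can be sketched on its own, without any Hadoop classes. This is only an illustration under stated assumptions: the class name PathListSplitter and the method splitPaths are hypothetical, the sketch works on plain strings rather than Path objects, and it is not the code from the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

public class PathListSplitter {

    /**
     * Splits a comma-separated path list. Only commas outside '{' ... '}'
     * act as separators, so glob alternations such as {a,b} stay intact.
     * (Hypothetical helper, not part of Hadoop.)
     */
    public static List<String> splitPaths(String commaSeparatedPaths) {
        List<String> paths = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int braceDepth = 0;
        for (char c : commaSeparatedPaths.toCharArray()) {
            if (c == '{') {
                braceDepth++;
            } else if (c == '}' && braceDepth > 0) {
                braceDepth--;
            }
            if (c == ',' && braceDepth == 0) {
                // A top-level comma ends the current path.
                paths.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // Whatever remains after the last separator is the final path.
        paths.add(current.toString());
        return paths;
    }

    public static void main(String[] args) {
        // Prints: [/data/part-00000, /data/{a,b}/part-00001]
        System.out.println(splitPaths("/data/part-00000,/data/{a,b}/part-00001"));
    }
}
```

With a helper like this, setInputPaths would replace the stored list with the split result, while addInputPaths would append each element to it. Note that this sketch returns a single empty string for empty input; a real implementation would probably want to reject or skip that case.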




[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3162:
----------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Amareshwari!



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585287#action_12585287 ] 

Hairong Kuang commented on HADOOP-3162:
---------------------------------------

The problem was caused by the patch to HADOOP-3064, which allows a path name to contain a comma. The setInputPath and addInputPath methods of JobConf now escape all the commas in the path name. Before that change, if a user's map/reduce job had the following code:
{code}
jobConf.setInputPath(new Path("aa,bb"));
{code}
jobConf.getInputPaths returned an array of two paths: "aa" and "bb".
After the HADOOP-3064 patch, jobConf.getInputPaths returns an array of one path: "aa,bb".

I guess the failed job might have used jobConf.setInputPath(new Path("path1,path2")) to set two input paths. It should use the following statements instead:
{code}
jobConf.setInputPath(new Path("path1"));
jobConf.addInputPath(new Path("path2"));
{code}
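
The difference between the two behaviours can be modelled with plain strings. This is a rough sketch under stated assumptions, not Hadoop's actual code: the class and method names are hypothetical, and the use of a backslash as the escape character is an assumption made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InputPathSemantics {

    // 0.16-style read: the stored string is simply split on ','.
    static List<String> readSplitOnComma(String stored) {
        return new ArrayList<>(Arrays.asList(stored.split(",")));
    }

    // Escaping variant, write side: commas inside a path are escaped.
    // (The backslash escape is an assumption for this sketch.)
    static String escapeCommas(String path) {
        return path.replace("\\", "\\\\").replace(",", "\\,");
    }

    // Escaping variant, read side: only unescaped commas separate paths.
    static List<String> readEscaped(String stored) {
        List<String> paths = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < stored.length(); i++) {
            char c = stored.charAt(i);
            if (c == '\\' && i + 1 < stored.length()) {
                current.append(stored.charAt(++i)); // keep the escaped character literally
            } else if (c == ',') {
                paths.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        paths.add(current.toString());
        return paths;
    }

    public static void main(String[] args) {
        // Split-on-comma: "aa,bb" reads back as two paths. Prints: [aa, bb]
        System.out.println(readSplitOnComma("aa,bb"));
        // With escaping: "aa,bb" reads back as one path. Prints: [aa,bb]
        System.out.println(readEscaped(escapeCommas("aa,bb")));
    }
}
```

Under the first model, a job that passes "path1,path2" as a single Path ends up with two inputs; under the second, it ends up with one input whose name contains a comma, which matches the "Input path doesnt exist" failure reported in this issue.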



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Status: Open  (was: Patch Available)



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589992#action_12589992 ] 

Hudson commented on HADOOP-3162:
--------------------------------

Integrated in Hadoop-trunk #463 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/463/])

> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3162.txt, patch-3162.txt, patch-3162.txt, patch-3162.txt, patch-3162.txt, patch-3162.txt
>
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
> namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-
> 00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
> QueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
> t.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
> rator.java:189)



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12586914#action_12586914 ] 

Hadoop QA commented on HADOOP-3162:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12379638/patch-3162.txt
against trunk revision 645773.

    @author +1.  The patch does not contain any @author tags.

    tests included +1.  The patch appears to include 138 new or modified tests.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2184/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2184/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2184/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2184/console

This message is automatically generated.



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Attachment: patch-3162.txt

Here is the patch with Runping's comments incorporated. The changes from the earlier patch are:
1. getEscapedPathString(String) is changed to getPathStrings(String). getPathStrings(commaSeparatedPaths) returns a string array of the paths in commaSeparatedPaths, which avoids escaping and un-escaping to get the splits.
2. All calls to add/setInputPath(new Path(String)) are replaced with add/setInputPaths(conf, String).



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588719#action_12588719 ] 

Hairong Kuang commented on HADOOP-3162:
---------------------------------------

I do not think that we need to add the following two methods to FileInputFormat:
1. setInputPaths(JobConf conf, String commaSeparatedPaths)
2. addInputPaths(JobConf conf, String commaSeparatedPaths).

We have not discussed what a comma means in user-facing interfaces like streaming when a user provides a comma separated path name. In streaming, should we support commas in a path name? What if a user wants to use a glob that contains a comma? These questions need to be well discussed and documented before we make any code change to support it. We could do it in release 18.

In release 17, the patch only needs to restore the old behavior of addInputPath and setInputPath of JobConf. Applications like streaming should continue to use JobConf.setInputPath, while a user who needs a glob or a path containing commas can use the new APIs in FileInputFormat.



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Status: Open  (was: Patch Available)



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589577#action_12589577 ] 

Runping Qi commented on HADOOP-3162:
------------------------------------


With this patch, the streaming CLI should work with globbing too.
The following should work:
{code}
...... -input a/b/c,d{e,f},g/h
{code}
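
As an illustration of the splitting rule this example relies on, here is a minimal, self-contained Java sketch. It is not the code from patch-3162.txt: the class GlobAwareSplit and the method splitPaths are hypothetical names, and the sketch assumes braces are balanced and never escaped. It treats a comma as a separator only when it sits outside '{' and '}'.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not taken from the Hadoop patch.
class GlobAwareSplit {
    // Split on commas that are not nested inside '{' ... '}'.
    static String[] splitPaths(String commaSeparatedPaths) {
        List<String> paths = new ArrayList<>();
        int depth = 0; // current nesting level of curly braces
        int start = 0; // start index of the path being scanned
        for (int i = 0; i < commaSeparatedPaths.length(); i++) {
            char c = commaSeparatedPaths.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}' && depth > 0) {
                depth--;
            } else if (c == ',' && depth == 0) {
                // A top-level comma ends the current path.
                paths.add(commaSeparatedPaths.substring(start, i));
                start = i + 1;
            }
        }
        // The last path has no trailing comma.
        paths.add(commaSeparatedPaths.substring(start));
        return paths.toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String p : splitPaths("a/b/c,d{e,f},g/h")) {
            System.out.println(p);
        }
    }
}
```

For the input above the sketch yields three entries: a/b/c, d{e,f} and g/h. Expanding the glob d{e,f} itself would still be left to the file system.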




[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585291#action_12585291 ] 

Runping Qi commented on HADOOP-3162:
------------------------------------


When people need to pass inputs to a map/reduce job through the command line, they commonly use one argument for the multiple inputs, separated by the comma (,) character. They will then most likely call jobconf.setInputPath(inputs).

The patch for HADOOP-3064 broke what has been a common practice in using map/reduce since day one.

 

> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.17.0
>
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
> namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-
> 00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
> QueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
> t.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
> rator.java:189)



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley updated HADOOP-3162:
--------------------------------

    Affects Version/s: 0.17.0



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588254#action_12588254 ] 

Hadoop QA commented on HADOOP-3162:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12379981/patch-3162.txt
against trunk revision 645773.

    @author +1.  The patch does not contain any @author tags.

    tests included +1.  The patch appears to include 138 new or modified tests.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2220/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2220/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2220/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2220/console

This message is automatically generated.



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Status: Patch Available  (was: Open)

Added the following APIs to FileInputFormat:
public static void addInputPath(JobConf job, Path path);
public static void setInputPaths(JobConf job, Path... paths);
public static void setInputPaths(JobConf job, String... strings);
public static Path[] getInputPaths(JobConf job);



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588866#action_12588866 ] 

Hairong Kuang commented on HADOOP-3162:
---------------------------------------

Runping, if streaming continues to use the deprecated API in JobConf, it keeps the old semantics. The two new methods break backward compatibility, so we should add them in release 18. We need more discussion on the right syntax and semantics for comma separated path names.



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Attachment: patch-3162.txt

Indentation changed. And added a comment.



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3162:
----------------------------------

    Status: Open  (was: Patch Available)

Amareshwari, could you please fix the javac warnings?

Minor nit: The private static method {{FileInputFormat.getEscapedPathString}} worries me a little bit. At the very least we should put it in StringUtils or some other place like that, so that other input formats which do not extend FileInputFormat can use it. What do others think?



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Status: Patch Available  (was: Open)



[jira] Issue Comment Edited: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585287#action_12585287 ] 

hairong edited comment on HADOOP-3162 at 4/3/08 2:11 PM:
---------------------------------------------------------------

The problem was caused by the patch to HADOOP-3064, which allows a path name to contain a comma: methods setInputPath and addInputPath of JobConf now escape all the commas in the path name. Before this change, if a user's map/reduce job had the following code:
{code}
jobConf.setInputPath(new Path("aa,bb"));
{code}
jobConf.getInputPaths returned an array of two paths: "aa" and "bb". 
After this patch, jobConf.getInputPaths returns an array of one path: "aa,bb".

I guess the failed job might have used jobConf.setInputPath(new Path("path1,path2")) to set two input paths. It should use the following statements instead:
{code}
jobConf.setInputPath(new Path("path1"));
jobConf.addInputPath(new Path("path2"));
{code}
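
The before/after difference described in this comment can be modeled with a small, self-contained Java sketch. It does not use the Hadoop classes: InputDirModel and its methods are hypothetical, and backslash escaping is only an assumption about how commas might be protected, not a statement of what the HADOOP-3064 patch does internally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the two behaviours; it does not call any Hadoop API.
class InputDirModel {
    // Old behaviour: the stored value is split on every comma.
    static String[] splitOnEveryComma(String stored) {
        return stored.split(",");
    }

    // New behaviour, step 1: escape backslashes and commas (assumed scheme)
    // so that a comma survives as part of a single path.
    static String escapeCommas(String path) {
        return path.replace("\\", "\\\\").replace(",", "\\,");
    }

    // New behaviour, step 2: split only on unescaped commas and drop the escapes.
    static String[] splitOnUnescapedCommas(String stored) {
        List<String> paths = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < stored.length(); i++) {
            char c = stored.charAt(i);
            if (c == '\\' && i + 1 < stored.length()) {
                current.append(stored.charAt(++i)); // keep the escaped character literally
            } else if (c == ',') {
                paths.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        paths.add(current.toString());
        return paths.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Two paths under the old rule, one path under the new rule.
        System.out.println(splitOnEveryComma("aa,bb").length);
        System.out.println(splitOnUnescapedCommas(escapeCommas("aa,bb")).length);
    }
}
```

Under this model a job that passes "path1,path2" as one Path gets two inputs with the old rule and a single, probably nonexistent, input with the new one, which matches the "Input path doesnt exist" failure in the report.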

> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.17.0
>
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
> namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-
> 00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
> QueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
> t.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
> rator.java:189)



[jira] Issue Comment Edited: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588891#action_12588891 ] 

runping edited comment on HADOOP-3162 at 4/14/08 9:08 PM:
-------------------------------------------------------------

I assume the two new methods you refer to are meant to be:
{code}
public static void setInputPaths(JobConf job, String commaSeparatedFilePaths);
public static void addInputPaths(JobConf job, String commaSeparatedFilePaths);
{code}
They don't break backward compatibility.
The implementation in the patch seems overly complicated,
and I am not sure of its correctness.
The intended behavior should look like:
{code}
public static void addInputPaths(JobConf job, String commaSeparatedFilePaths) {
    // treat the commas in commaSeparatedFilePaths that are not enclosed by '{' and '}' as separators
    // split commaSeparatedFilePaths into a string array using those separators
    // let Path[] paths be the array of paths created from those strings
    setInputPaths(job, paths);
}
{code}
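
The splitting step described in the comments above could be sketched roughly as follows. This is a standalone illustration, not the code from the patch: the class BraceAwareSplitSketch and the method splitPaths are made-up names, and the sketch assumes that only '{' and '}' protect a comma and that no escape characters are involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; names are not taken from the patch.
class BraceAwareSplitSketch {

    // Split on commas that are not enclosed by '{' and '}'.
    static List<String> splitPaths(String commaSeparatedFilePaths) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0; // current nesting level of '{'
        for (char c : commaSeparatedFilePaths.toCharArray()) {
            if (c == '{') {
                depth++;
            } else if (c == '}' && depth > 0) {
                depth--;
            }
            if (c == ',' && depth == 0) {
                parts.add(current.toString()); // a top-level comma separates two paths
                current.setLength(0);
            } else {
                current.append(c);             // everything else belongs to the current path
            }
        }
        parts.add(current.toString());
        return parts;
    }

    public static void main(String[] args) {
        // The comma inside the braces is part of a glob, not a separator.
        System.out.println(splitPaths("/data/part-{00000,00001},/logs/day1"));
    }
}
```

For the input in main, the sketch yields two strings, "/data/part-{00000,00001}" and "/logs/day1", each of which would then be turned into a Path.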

Also, when you replace code that uses the existing API with code that uses the new API, like:
{code}
-      grepJob.setInputPath(new Path(args[0]));
+      FileInputFormat.setInputPaths(grepJob, new Path(args[0]));
{code}
That is incorrect. 

The correct one should be:
{code}
-      grepJob.setInputPath(new Path(args[0]));
+      FileInputFormat.setInputPaths(grepJob, args[0]);
{code}
 

      was (Author: runping):
    
I assume the two new methods you refer to are meant to be:
{code}
public static setInputPaths(JobConf job,   String commaSeparatedFilePaths);
public static addInputPaths(JobConf job,   String commaSeparatedFilePaths);
{code}
They don't break backward compatibility.
The patch implemented them incorrectly.
The correct implementation should look like:
{code}
public static addInputPaths(JobConf job,   String commaSeparatedFilePaths) {
    // treat the comma in commaSeparatedFilePaths that are not enclosed by '{' and '}' as separators
    // split commaSeparatedFilePaths into string arrays using those separators
    // Let Path [] paths be the array of paths created from those strings
    return setInputPaths(job, paths);
}
{code}


When you replace the code using the existing api with the one using the new api like:
{code}
-      grepJob.setInputPath(new Path(args[0]));
+      FileInputFormat.setInputPaths(grepJob, new Path(args[0]));
{code}
That is incorrect. The correct one should be:
{code}
-      grepJob.setInputPath(new Path(args[0]));
+      FileInputFormat.setInputPaths(grepJob, args[0]);
{code}
 
  
> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3162.txt, patch-3162.txt, patch-3162.txt, patch-3162.txt, patch-3162.txt
>
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
> namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-
> 00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
> QueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
> t.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
> rator.java:189)



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Status: Patch Available  (was: Open)

> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3162.txt, patch-3162.txt
>
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
> namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-
> 00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
> QueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
> t.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
> rator.java:189)



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585322#action_12585322 ] 

Runping Qi commented on HADOOP-3162:
------------------------------------


This is not a math problem :)


> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.17.0
>
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
> namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-
> 00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
> QueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
> t.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
> rator.java:189)



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Status: Open  (was: Patch Available)

Trying to run Hudson again.

> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3162.txt, patch-3162.txt
>
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
> namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-
> 00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
> QueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
> t.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
> rator.java:189)



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585319#action_12585319 ] 

Hairong Kuang commented on HADOOP-3162:
---------------------------------------

But this common practice of using a path parameter to set multiple input paths is wrong.

> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.17.0
>
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
> namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-
> 00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
> QueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
> t.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
> rator.java:189)



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585703#action_12585703 ] 

Hadoop QA commented on HADOOP-3162:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12379389/patch-3162.txt
against trunk revision 643282.

    @author +1.  The patch does not contain any @author tags.

    tests included +1.  The patch appears to include 135 new or modified tests.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2165/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2165/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2165/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2165/console

This message is automatically generated.

> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3162.txt
>
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
> namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-
> 00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
> QueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
> t.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
> rator.java:189)



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Mukund Madhugiri (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585206#action_12585206 ] 

Mukund Madhugiri commented on HADOOP-3162:
------------------------------------------

The last successful run I saw was on trunk revision 642046

> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Priority: Blocker
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
> /gs205234.inktomisearch.com:55638/gridmix/data/MonsterQueryBlockCompressed/part-
> 00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
> QueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
> t.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
> rator.java:189)



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Sameer Paranjpye (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sameer Paranjpye updated HADOOP-3162:
-------------------------------------

    Fix Version/s: 0.17.0
         Assignee: Amar Kamat

> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.17.0
>
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
> /gs205234.inktomisearch.com:55638/gridmix/data/MonsterQueryBlockCompressed/part-
> 00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
> QueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
> t.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
> rator.java:189)



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Runping Qi updated HADOOP-3162:
-------------------------------

    Description: 
When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:

org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-
00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
QueryBlockCompressed/part-00002
        at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
t.java:213)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
        at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
rator.java:189)


  was:
When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:

org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
/gs205234.inktomisearch.com:55638/gridmix/data/MonsterQueryBlockCompressed/part-
00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
QueryBlockCompressed/part-00002
        at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
t.java:213)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
        at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
rator.java:189)



> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.17.0
>
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
> namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-
> 00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
> QueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
> t.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
> rator.java:189)



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Status: Open  (was: Patch Available)

There is a javadoc warning in the patch.

> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3162.txt
>
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
> namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-
> 00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
> QueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
> t.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
> rator.java:189)



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589245#action_12589245 ] 

Hadoop QA commented on HADOOP-3162:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12380174/patch-3162.txt
against trunk revision 645773.

    @author +1.  The patch does not contain any @author tags.

    tests included +1.  The patch appears to include 138 new or modified tests.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2241/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2241/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2241/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2241/console

This message is automatically generated.

> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3162.txt, patch-3162.txt, patch-3162.txt, patch-3162.txt, patch-3162.txt, patch-3162.txt
>
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
> namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-
> 00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
> QueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
> t.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
> rator.java:189)



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Attachment: patch-3162.txt

Fixed the javac warning, which was due to MultiFileWordCount.

> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3162.txt, patch-3162.txt, patch-3162.txt, patch-3162.txt, patch-3162.txt
>
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
> namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-
> 00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
> QueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
> t.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
> rator.java:189)



[jira] Assigned: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu reassigned HADOOP-3162:
-----------------------------------------------

    Assignee: Amareshwari Sriramadasu  (was: Amar Kamat)

> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/
> namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-
> 00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/Monster
> QueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputForma
> t.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGene
> rator.java:189)



[jira] Assigned: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu reassigned HADOOP-3162:
-----------------------------------------------

    Assignee: Amareshwari Sriramadasu  (was: Cameron Pope)



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Attachment: patch-3162.txt

Here is a patch adding the APIs Owen suggested.
Deprecated the APIs in JobConf and restored them to their 0.16 semantics.
Added public static void setInputPaths(JobConf conf, String... strings).



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12586693#action_12586693 ] 

Hadoop QA commented on HADOOP-3162:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12379539/patch-3162.txt
against trunk revision 643282.

    @author +1.  The patch does not contain any @author tags.

    tests included +1.  The patch appears to include 138 new or modified tests.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac -1.  The applied patch generated 499 javac compiler warnings (more than the trunk's current 498 warnings).

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2181/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2181/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2181/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2181/console

This message is automatically generated.



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589101#action_12589101 ] 

Runping Qi commented on HADOOP-3162:
------------------------------------

The patch looks good.

+1.





[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Attachment: patch-3162.txt

Fixed javac warning.



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Attachment:     (was: patch-3162.txt)



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585770#action_12585770 ] 

Runping Qi commented on HADOOP-3162:
------------------------------------

Since the inputSpecs of a streaming job come from the command line, they
may contain comma separated files (a very common use).
In order to continue to support that usage, it is better to change
{code}
FileInputFormat.addInputPath(jobConf_, new Path((String) inputSpecs_.get(i)));
{code}
into
{code}
FileInputFormat.addInputPaths(jobConf_, (String) inputSpecs_.get(i));
{code}
Do we have
{code}
public static void addInputPaths(JobConf job, String commaSeparatedFilePaths);
{code}
If not, we need to add it.
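
To make the requested behaviour concrete, here is a minimal, self-contained sketch: a comma separated string is split and each piece is registered as its own input, rather than the whole string being treated as a single path. The class name CommaSeparatedInputs and the plain List<String> standing in for the job's input list are hypothetical; this is not the Hadoop implementation, only an illustration of the idea.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a job's input list; not Hadoop's JobConf.
final class CommaSeparatedInputs {

    // Adds every non-empty, comma separated entry as a separate input path.
    static void addInputPaths(List<String> inputs, String commaSeparatedPaths) {
        for (String path : commaSeparatedPaths.split(",")) {
            if (!path.isEmpty()) {
                inputs.add(path);
            }
        }
    }

    public static void main(String[] args) {
        List<String> inputs = new ArrayList<>();
        // One command-line argument carrying two files.
        addInputPaths(inputs, "/gridmix/data/part-00000,/gridmix/data/part-00001");
        System.out.println(inputs.size() + " inputs: " + inputs);
    }
}
```

Wrapping the same argument in a single path object would instead produce one non-existent path, which matches the InvalidInputException in the report.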





[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Status: Open  (was: Patch Available)

Looks like it skipped the Hudson queue. Trying to run Hudson again.



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Release Note: 
The public methods org.apache.hadoop.mapred.JobConf.setInputPath(Path) and org.apache.hadoop.mapred.JobConf.addInputPath(Path) are deprecated and have the semantics of branch 0.16.
The following public APIs are added in org.apache.hadoop.mapred.FileInputFormat:
public static void setInputPaths(JobConf job, Path... paths);
public static void setInputPaths(JobConf job, String commaSeparatedPaths);
public static void addInputPath(JobConf job, Path path);
public static void addInputPaths(JobConf job, String commaSeparatedPaths);
Earlier code calling JobConf.setInputPath(Path) and JobConf.addInputPath(Path) should now call FileInputFormat.setInputPaths(JobConf, Path...) and FileInputFormat.addInputPath(JobConf, Path) respectively.

  was:
1 Adds the following APIs in FileInputFormat
public static void setInputPaths(JobConf job, Path... paths);
public static void setInputPaths(JobConf job, String commaSeparatedPaths);
public static void addInputPath(JobConf job, Path path);
public static void addInputPaths(JobConf job, String commaSeparatedPaths);
2. Deprecates JobConf.setInputPath(Path) and JobConf.addInputPath(Path)
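
The release note lists set-style and add-style methods without spelling out how they differ. As a rough illustration only, the sketch below assumes (this is not stated in the note) that the set variants replace whatever inputs were configured and the add variants append to them. The class InputPathConfigSketch, the property name "input.dir" and the Map standing in for a job configuration are all hypothetical and are not Hadoop code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a job configuration holding a comma-joined input list.
final class InputPathConfigSketch {

    // Hypothetical property name, chosen for this example only.
    static final String KEY = "input.dir";

    // Assumed "set" semantics: discard any previous inputs and store the new ones.
    static void setInputPaths(Map<String, String> conf, String... paths) {
        conf.put(KEY, String.join(",", paths));
    }

    // Assumed "add" semantics: append one path to whatever is already stored.
    static void addInputPath(Map<String, String> conf, String path) {
        String existing = conf.get(KEY);
        if (existing == null || existing.isEmpty()) {
            conf.put(KEY, path);
        } else {
            conf.put(KEY, existing + "," + path);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        setInputPaths(conf, "/data/a", "/data/b");
        addInputPath(conf, "/data/c");
        System.out.println(conf.get(KEY));
    }
}
```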




[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588027#action_12588027 ] 

Hadoop QA commented on HADOOP-3162:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12379833/patch-3162.txt
against trunk revision 645773.

    @author +1.  The patch does not contain any @author tags.

    tests included +1.  The patch appears to include 138 new or modified tests.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac -1.  The applied patch generated 498 javac compiler warnings (more than the trunk's current 497 warnings).

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2209/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2209/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2209/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2209/console

This message is automatically generated.



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589442#action_12589442 ] 

Owen O'Malley commented on HADOOP-3162:
---------------------------------------

Except for the streaming part, I think this is the right call.

On the streaming CLI processing, I'm pretty conflicted. On the one hand, we can't deprecate the CLI; on the other, without changing the semantics, we can't use the choice patterns. I think the best we can do is to use the current patch and make 0.18 streaming use the normal path globbing instead of comma separated path lists.
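
The conflict with choice patterns can be seen in a small, self-contained sketch: a glob such as part-{00000,00001} uses commas itself, so a naive split on every comma cuts the pattern apart. The class GlobAwareSplit below is hypothetical and is not what this comment proposes; it only shows the collision, together with one conceivable way to split while leaving braces intact.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of Hadoop or of the patch under discussion.
final class GlobAwareSplit {

    // Splits on commas that are not nested inside '{' ... '}'.
    static List<String> split(String spec) {
        List<String> parts = new ArrayList<>();
        int depth = 0;
        int start = 0;
        for (int i = 0; i < spec.length(); i++) {
            char c = spec.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}' && depth > 0) {
                depth--;
            } else if (c == ',' && depth == 0) {
                parts.add(spec.substring(start, i));
                start = i + 1;
            }
        }
        parts.add(spec.substring(start));
        return parts;
    }

    public static void main(String[] args) {
        String spec = "/data/part-{00000,00001},/logs/2008";
        // Naive splitting breaks the choice pattern into unusable fragments.
        System.out.println(Arrays.asList(spec.split(",")));
        // Brace-aware splitting keeps the choice pattern in one piece.
        System.out.println(split(spec));
    }
}
```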



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Assignee: Cameron Pope  (was: Amareshwari Sriramadasu)
      Status: Open  (was: Patch Available)

Canceling patch to address Runping's comment.



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Attachment: patch-3162.txt

Fixed javadoc warning



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Release Note: 
1 Adds the following APIs in FileInputFormat
public static void setInputPaths(JobConf job, Path... paths);
public static void setInputPaths(JobConf job, String commaSeparatedPaths);
public static void addInputPath(JobConf job, Path path);
public static void addInputPaths(JobConf job, String commaSeparatedPaths);
2. Deprecates JobConf.setInputPath(Path) and JobConf.addInputPath(Path)

  was:
1 Adds the following APIs in FileInputFormat
public static void setInputPaths(JobConf job, Path... paths);
public static void setInputPaths(JobConf job, String commaSepatedPaths);
public static void addInputPath(JobConf job, Path path);
public static void addInputPaths(JobConf job, String commaSepatedPaths);
2. Deprecates JobConf.setInputPath(Path) and JobConf.addInputPath(Path)




[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585665#action_12585665 ] 

Hairong Kuang commented on HADOOP-3162:
---------------------------------------

To be clear: DFS currently does not support globs like {/p1/f1,/p2/f2}. It does support globs like /p1/{f1,f2}. In other words, the closure pattern must stay within a single path component.
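To illustrate the restriction, here is a minimal sketch of a check that a brace group does not span a path separator. The class and method names are hypothetical and this is not DFS code; it only assumes that '/' separates path components and that '{' and '}' delimit the alternatives.

```java
/** Hypothetical helper for illustration only; not part of Hadoop. */
final class GlobComponentCheck {
    private GlobComponentCheck() {}

    /** Returns true if no '/' occurs inside any '{' ... '}' group of the glob. */
    static boolean bracesStayWithinOneComponent(String glob) {
        int depth = 0; // current nesting depth of '{' ... '}'
        for (char c : glob.toCharArray()) {
            if (c == '{') {
                depth++;
            } else if (c == '}' && depth > 0) {
                depth--;
            } else if (c == '/' && depth > 0) {
                return false; // a path separator inside a brace group
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(bracesStayWithinOneComponent("/p1/{f1,f2}"));     // prints true
        System.out.println(bracesStayWithinOneComponent("{/p1/f1,/p2/f2}")); // prints false
    }
}
```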



[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Attachment: patch-3162.txt

This patch addresses the following:
1. Adds the following APIs in FileInputFormat:
    * public static void setInputPaths(JobConf job, Path... paths);
    * public static void setInputPaths(JobConf job, String commaSeparatedPaths);
    * public static void addInputPath(JobConf job, Path path);
    * public static void addInputPaths(JobConf job, String commaSeparatedPaths);
2. Deprecates JobConf.setInputPath(Path) and JobConf.addInputPath(Path).
3. Moves testInputPath() from conf/TestJobConf to mapred/TestInputPath, and adds tests for the new APIs.




[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588888#action_12588888 ] 

Amareshwari Sriramadasu commented on HADOOP-3162:
-------------------------------------------------

The newly added APIs now handle both comma-separated paths and glob paths. For example, if the user gives the comma-separated string a{b,c}d,ef,xyz, it is split into 3 paths (1. a{b,c}d, 2. ef, 3. xyz), which are then added to the input paths. So this does not stop the user from giving glob paths.
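The splitting rule described here can be sketched as follows. This is a minimal illustration and not the code from patch-3162.txt; the class name InputPathSplitter and its split method are hypothetical, and the sketch assumes that only commas outside '{' ... '}' act as separators.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not the code from the patch. */
final class InputPathSplitter {
    private InputPathSplitter() {}

    /** Splits the string on commas that are not enclosed by '{' and '}'. */
    static List<String> split(String commaSeparatedPaths) {
        List<String> paths = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int braceDepth = 0; // greater than 0 while inside a glob alternative group
        for (int i = 0; i < commaSeparatedPaths.length(); i++) {
            char c = commaSeparatedPaths.charAt(i);
            if (c == '{') {
                braceDepth++;
            } else if (c == '}' && braceDepth > 0) {
                braceDepth--;
            }
            if (c == ',' && braceDepth == 0) {
                // A top-level comma ends the current path.
                paths.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        paths.add(current.toString()); // the last path has no trailing comma
        return paths;
    }

    public static void main(String[] args) {
        System.out.println(split("a{b,c}d,ef,xyz")); // prints [a{b,c}d, ef, xyz]
    }
}
```

Each resulting string could then be turned into a Path and added to the job's input paths, leaving the glob a{b,c}d intact for later expansion.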



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588072#action_12588072 ] 

Devaraj Das commented on HADOOP-3162:
-------------------------------------

bq. The private static method FileInputFormat.getEscapedPathString worries me a little bit; at the very least we should put it in StringUtils or some other place like that

Note that this path handling applies only to FileInputFormat, so IMO the definition should stay here rather than in some other place.
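For readers who do not have the patch open, here is a rough sketch of the kind of escaping such a method might perform. It is an assumption that the purpose is to let a path containing a comma survive being stored in a comma-joined configuration value. The class EscapedPathList and its methods are hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of escaping paths before joining them with commas. */
final class EscapedPathList {
    private static final char ESCAPE = '\\';
    private static final char SEPARATOR = ',';

    private EscapedPathList() {}

    /** Prefixes every comma and backslash in the path with a backslash. */
    static String escape(String path) {
        StringBuilder sb = new StringBuilder();
        for (char c : path.toCharArray()) {
            if (c == ESCAPE || c == SEPARATOR) {
                sb.append(ESCAPE);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Joins the escaped paths with commas into one string. */
    static String join(List<String> paths) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < paths.size(); i++) {
            if (i > 0) {
                sb.append(SEPARATOR);
            }
            sb.append(escape(paths.get(i)));
        }
        return sb.toString();
    }

    /** Splits on unescaped commas and removes the escape characters. */
    static List<String> split(String joined) {
        List<String> paths = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean escaped = false;
        for (char c : joined.toCharArray()) {
            if (escaped) {
                current.append(c); // take the escaped character literally
                escaped = false;
            } else if (c == ESCAPE) {
                escaped = true;
            } else if (c == SEPARATOR) {
                paths.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        paths.add(current.toString());
        return paths;
    }

    public static void main(String[] args) {
        List<String> paths = List.of("/p1/{f1,f2}", "/p2/f3");
        String joined = join(paths);
        System.out.println(joined);        // prints /p1/{f1\,f2},/p2/f3
        System.out.println(split(joined)); // prints [/p1/{f1,f2}, /p2/f3]
    }
}
```

With escaping like this, a glob such as /p1/{f1,f2} round-trips through the joined string as one path, which is one reason to keep the helper close to the code that reads the value back.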



[jira] Issue Comment Edited: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588891#action_12588891 ] 

runping edited comment on HADOOP-3162 at 4/14/08 9:09 PM:
-------------------------------------------------------------

I assume the two new methods you refer to are meant to be:
{code}
public static void setInputPaths(JobConf job, String commaSeparatedFilePaths);
public static void addInputPaths(JobConf job, String commaSeparatedFilePaths);
{code}
They don't break backward compatibility.
The implementation in the patch seems overly complicated, and I am not sure of its correctness.
The intended behavior should look like:
{code}
public static void addInputPaths(JobConf job, String commaSeparatedFilePaths) {
    // Treat the commas in commaSeparatedFilePaths that are not enclosed by '{' and '}' as separators.
    // Split commaSeparatedFilePaths into an array of strings using those separators.
    // Let Path[] paths be the array of paths created from those strings.
    setInputPaths(job, paths);
}
{code}

Also, when you replace code that uses the existing API with code that uses the new API, like this:
{code}
-      grepJob.setInputPath(new Path(args[0]));
+      FileInputFormat.setInputPaths(grepJob, new Path(args[0]));
{code}
That is incorrect. 

The correct one should be:
{code}
-      grepJob.setInputPath(new Path(args[0]));
+      FileInputFormat.setInputPaths(grepJob, args[0]);
{code}
 

      was (Author: runping):
    I assume the two new methods you refer to are meant to be:
{code}
public static setInputPaths(JobConf job,   String commaSeparatedFilePaths);
public static addInputPaths(JobConf job,   String commaSeparatedFilePaths);
{code}
They don't break backward compatibility.
The implementation in the patch seems overly complicated.
I am not sure its correctness. .
The intended behavior should look like:
{code}
public static addInputPaths(JobConf job,   String commaSeparatedFilePaths) {
    // treat the comma in commaSeparatedFilePaths that are not enclosed by '{' and '}' as separators
    // split commaSeparatedFilePaths into string arrays using those separators
    // Let Path [] paths be the array of paths created from those strings
    return setInputPaths(job, paths);
}
{code}

Also,when you replace the code using the existing api with the one using the new api like:
{code}
-      grepJob.setInputPath(new Path(args[0]));
+      FileInputFormat.setInputPaths(grepJob, new Path(args[0]));
[code}
That is incorrect. 

The correct one should be:
{code}
-      grepJob.setInputPath(new Path(args[0]));
+      FileInputFormat.setInputPaths(grepJob, args[0]);
{code}
 
  


[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-3162:
--------------------------------------------

    Release Note: 
1 Adds the following APIs in FileInputFormat
public static void setInputPaths(JobConf job, Path... paths);
public static void setInputPaths(JobConf job, String commaSepatedPaths);
public static void addInputPath(JobConf job, Path path);
public static void addInputPaths(JobConf job, String commaSepatedPaths);
2. Deprecates JobConf.setInputPath(Path) and JobConf.addInputPath(Path)
          Status: Patch Available  (was: Open)



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12587985#action_12587985 ] 

Devaraj Das commented on HADOOP-3162:
-------------------------------------

+1



[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585519#action_12585519 ] 

Amareshwari Sriramadasu commented on HADOOP-3162:
-------------------------------------------------

bq. However, I'd prefer a new API method still honor comma as the path separator
This conflicts with the glob pattern requirement, right? Now, comma-separated paths can be added with a glob pattern, as you suggested.

We can add the following API along with the ones Owen suggested:
public static void setInputPaths(JobConf conf, String... strings)




[jira] Updated: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Runping Qi updated HADOOP-3162:
-------------------------------

    Component/s: mapred
    Description: 
When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:

org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs://gs205234.inktomisearch.com:55638/gridmix/data/MonsterQueryBlockCompressed/part-00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/MonsterQueryBlockCompressed/part-00002
        at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputFormat.java:213)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
        at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGenerator.java:189)


  was:

When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:

org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs://gs205234.inktomisearch.com:55638/gridmix/data/MonsterQueryBlockCompressed/part-00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/MonsterQueryBlockCompressed/part-00002
        at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputFormat.java:213)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
        at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGenerator.java:189)





The problem started to happen on the hadoop 0.17 trunk after 3/27.
