Posted to mapreduce-issues@hadoop.apache.org by "Devaraj K (JIRA)" <ji...@apache.org> on 2011/02/03 16:58:29 UTC

[jira] Created: (MAPREDUCE-2297) All map reduce tasks are failing if we give invalid path jar file for Job

All map reduce tasks are failing if we give invalid path jar file for Job
-------------------------------------------------------------------------

                 Key: MAPREDUCE-2297
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2297
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: tasktracker
    Affects Versions: 0.20.2
            Reporter: Devaraj K
            Priority: Minor


This can be reproduced by specifying an invalid jar file path for the job, or it can be reproduced from Hive.


In hive-default.xml:

<property>
<name>hive.aux.jars.path</name>
<value></value>
<description>Provided for adding auxillaryjarsPath</description>
</property>

If we configure an invalid path for the jar file, all map reduce tasks fail, even for jobs that do not depend on this jar file, and the exception below is thrown.
{code:xml} 
hive> select * from a join b on(a.b=b.c);
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
java.io.FileNotFoundException: File does not exist: /user/root/grade.jar
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:495)
at org.apache.hadoop.filecache.DistributedCache.getTimestamp(DistributedCache.java:509)
at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:651)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:783)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:752)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:698)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:107)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:64)

{code} 

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-2297) All map reduce tasks are failing if we give invalid path jar file for Job

Posted by "Vinod Kumar Vavilapalli (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038433#comment-13038433 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-2297:
----------------------------------------------------

I too contest this. We don't have a concept of optional artifacts today.

Also, I am a little confused about your report. The exception trace is showing that job-submission itself is failing. But you mention 'all tasks are failing'.

It looks like hive.aux.jars.path should really be a job-specific parameter. Otherwise, if a new query(?) needs a new jar, it would mean a change in a system config? That seems weird; can you shed some more light on how this configuration is intended to be used?

If the suggestion is to just accept the job after removing all invalid entries, while still keeping the list of valid jars consistent through the rest of the system, that should be okay I guess.
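
The "accept the job after removing invalid entries" option could look roughly like the sketch below. It is not the attached patch: the class and method names are hypothetical, and a local-filesystem check with java.nio.file.Files.exists stands in for the HDFS lookup that real job-submission code would perform.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Hadoop or Hive.
class AuxJarFilter {

    // Splits a comma-separated jar list and keeps only entries that exist.
    // Assumption: a local-filesystem check stands in for an HDFS lookup.
    static List<String> keepExisting(String commaSeparatedJars) {
        List<String> valid = new ArrayList<>();
        for (String entry : commaSeparatedJars.split(",")) {
            String path = entry.trim();
            if (path.isEmpty()) {
                continue; // ignore empty entries, e.g. an empty <value>
            }
            if (Files.exists(Paths.get(path))) {
                valid.add(path);
            } else {
                System.err.println("Skipping missing jar: " + path);
            }
        }
        return valid;
    }

    public static void main(String[] args) {
        // The filtered list is the one every later stage should see,
        // so that the set of jars stays consistent after submission.
        List<String> jars = keepExisting("/user/root/grade.jar");
        System.out.println(String.join(",", jars));
    }
}
```

The point of returning the filtered list, rather than only logging, is that the same list can then be written back into the job configuration so that later stages never see the invalid entries.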


[jira] [Updated] (MAPREDUCE-2297) All map reduce tasks are failing if we give invalid path jar file for Job

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated MAPREDUCE-2297:
-----------------------------------

    Status: Open  (was: Patch Available)

cancelling patch for discussion


[jira] [Commented] (MAPREDUCE-2297) All map reduce tasks are failing if we give invalid path jar file for Job

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039228#comment-13039228 ] 

Todd Lipcon commented on MAPREDUCE-2297:
----------------------------------------

hive.aux.jars.path can be set per-query with hive using the "set" command from the hive shell. I would not consider it system-level... or, if it's system level, it should be pointing to jars that are guaranteed to exist.


[jira] [Commented] (MAPREDUCE-2297) All map reduce tasks are failing if we give invalid path jar file for Job

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037013#comment-13037013 ] 

Todd Lipcon commented on MAPREDUCE-2297:
----------------------------------------

bq. Consider the case of Hive: if we configure an invalid path for the property "hive.aux.jars.path", all jobs will fail, including those that do not use that jar.

I would consider it a broken config if you configure hive.aux.jars.path to point to a jar which doesn't exist.

In your patch, if you accidentally make a typo in your DistributedCache entries, you'll see NoClassDefFound exceptions or other much scarier errors. I think it's better to fail with the "File does not exist" error during localization.
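
The fail-fast behaviour argued for here can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, and a local-filesystem check stands in for the DistributedCache lookup against HDFS.

```java
import java.io.FileNotFoundException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of Hadoop or Hive.
class CacheEntryValidator {

    // Throws on the first missing entry so the error names the bad path
    // up front, instead of surfacing later as a class-loading failure.
    static void requireAllExist(String... paths) throws FileNotFoundException {
        for (String path : paths) {
            if (!Files.exists(Paths.get(path))) {
                throw new FileNotFoundException("File does not exist: " + path);
            }
        }
    }

    public static void main(String[] args) {
        try {
            requireAllExist("/user/root/grade.jar");
            System.out.println("all entries present");
        } catch (FileNotFoundException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this approach a typo in a cache entry is reported with the offending path at validation time, which is the trade-off being defended against silently skipping the entry.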


[jira] [Resolved] (MAPREDUCE-2297) All map reduce tasks are failing if we give invalid path jar file for Job

Posted by "Devaraj K (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K resolved MAPREDUCE-2297.
----------------------------------

    Resolution: Not A Problem

It is no longer a problem in active versions.
                

        

[jira] [Assigned] (MAPREDUCE-2297) All map reduce tasks are failing if we give invalid path jar file for Job

Posted by "Devaraj K (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K reassigned MAPREDUCE-2297:
------------------------------------

    Assignee: Devaraj K


[jira] [Commented] (MAPREDUCE-2297) All map reduce tasks are failing if we give invalid path jar file for Job

Posted by "Devaraj K (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038554#comment-13038554 ] 

Devaraj K commented on MAPREDUCE-2297:
--------------------------------------

As the exception stack trace shows, what I meant in the title is that all map reduce jobs are failing.

"hive.aux.jars.path" is a system-level config for Hive. Hive uses these jars for all the jobs it submits. I think there is no provision for submitting jars for a specific query.



[jira] [Commented] (MAPREDUCE-2297) All map reduce tasks are failing if we give invalid path jar file for Job

Posted by "Venu Gopala Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038414#comment-13038414 ] 

Venu Gopala Rao commented on MAPREDUCE-2297:
--------------------------------------------

Hi Todd, I feel the fix will be useful. Consider the case of Java: even if one of the paths listed in the classpath is invalid, Java will still run. As you mentioned, if it is a typo and a required jar is missing, the program may get ClassNotFoundExceptions.

Since hive.aux.jars.path is a system-level configuration, we should not fail all jobs. But some jobs may get ClassNotFoundExceptions if a required jar is missing.

What do you say?
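
The Java classpath behaviour referred to here can be illustrated with a small sketch using java.net.URLClassLoader. The demo class, the jar path, and the class name com.example.udf.Grade are made up for illustration; only the JDK calls are real.

```java
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical demo class; the jar path and class name are made up.
class MissingJarClasspathDemo {

    // Returns true if className can be loaded through a class loader whose
    // only classpath entry is the given (possibly nonexistent) jar.
    static boolean canLoad(String jarPath, String className) {
        try {
            URL[] classpath = { new File(jarPath).toURI().toURL() };
            // Creating the loader does not fail for a missing jar.
            try (URLClassLoader loader = new URLClassLoader(classpath)) {
                loader.loadClass(className);
                return true;
            }
        } catch (ClassNotFoundException e) {
            return false; // the class could not be found on this classpath
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String missingJar = "/user/root/grade.jar";
        // Resolved by the parent loader, so the missing jar is irrelevant.
        System.out.println(canLoad(missingJar, "java.lang.String"));
        // Only resolvable from the missing jar, so loading fails.
        System.out.println(canLoad(missingJar, "com.example.udf.Grade"));
    }
}
```

On a machine where that jar is absent, code that never touches the jar keeps working, while code that needs a class from it fails only at the point where the class is loaded.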


[jira] [Updated] (MAPREDUCE-2297) All map reduce tasks are failing if we give invalid path jar file for Job

Posted by "Devaraj K (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K updated MAPREDUCE-2297:
---------------------------------

    Attachment: MAPREDUCE-2297.patch


[jira] [Commented] (MAPREDUCE-2297) All map reduce tasks are failing if we give invalid path jar file for Job

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036411#comment-13036411 ] 

Todd Lipcon commented on MAPREDUCE-2297:
----------------------------------------

I disagree with the premise of this issue. If you try to add a file to the DistributedCache, but it can't be localized, it makes complete sense that it should fail the task.


[jira] [Updated] (MAPREDUCE-2297) All map reduce tasks are failing if we give invalid path jar file for Job

Posted by "Devaraj K (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K updated MAPREDUCE-2297:
---------------------------------

    Fix Version/s: 0.20.4
           Status: Patch Available  (was: Open)

With the patch, an invalid archive path given for a job is logged and skipped, and processing proceeds with the remaining archives.
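
The "log and proceed" behaviour can be illustrated with a minimal sketch. This is not the code from MAPREDUCE-2297.patch: the class name ArchiveFilter and the method usableArchives are hypothetical, and it checks the local filesystem through java.nio.file instead of going through the Hadoop FileSystem and DistributedCache classes, purely so that it runs on its own.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical helper, not part of Hadoop: keeps the archives that exist
// and logs (instead of throwing for) the ones that do not.
class ArchiveFilter {
    private static final Logger LOG = Logger.getLogger(ArchiveFilter.class.getName());

    static List<Path> usableArchives(List<String> configured) {
        List<Path> usable = new ArrayList<>();
        for (String entry : configured) {
            Path candidate = Paths.get(entry);
            if (Files.exists(candidate)) {
                usable.add(candidate);
            } else {
                // Lenient handling: report the bad entry and continue with the rest.
                LOG.warning("Skipping archive that does not exist: " + entry);
            }
        }
        return usable;
    }

    public static void main(String[] args) throws IOException {
        // One archive that exists and one that does not, both under a temp directory.
        Path dir = Files.createTempDirectory("archives");
        Path valid = Files.createFile(dir.resolve("valid.jar"));
        Path missing = dir.resolve("grade.jar");

        List<Path> usable = usableArchives(Arrays.asList(valid.toString(), missing.toString()));
        System.out.println("Usable archives: " + usable);
    }
}
```

The trade-off in this approach is that a job which really does need the missing archive no longer fails at this point; it fails later, when it first tries to use a class or file from that archive.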

[jira] [Commented] (MAPREDUCE-2297) All map reduce tasks are failing if we give invalid path jar file for Job

Posted by "Devaraj K (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036831#comment-13036831 ] 

Devaraj K commented on MAPREDUCE-2297:
--------------------------------------

Hi Todd, I agree with you for job-specific configuration. However, if archives that are configured in common for all jobs include an invalid path, then all jobs will fail, even though not every job depends on every archive.

Consider the case of Hive: if we configure an invalid path for the property "hive.aux.jars.path", all jobs will fail, including those that do not use that jar.

