Posted to common-dev@hadoop.apache.org by "Owen O'Malley (JIRA)" <ji...@apache.org> on 2006/02/16 02:01:44 UTC

[jira] Created: (HADOOP-39) Job killed when backup tasks fail

Job killed when backup tasks fail
---------------------------------

         Key: HADOOP-39
         URL: http://issues.apache.org/jira/browse/HADOOP-39
     Project: Hadoop
        Type: Bug
  Components: mapred  
    Reporter: Owen O'Malley


I had a map job with side effects that meant that any speculative tasks would fail.

Currently, the job tracker kills the job when the speculative task fails 4 times.

It would be better to stop scheduling speculative tasks for that fragment, but let the job continue as long as one of the instances of that fragment continues to run.
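
The policy described above can be sketched roughly as follows. This is a minimal Python illustration, not Hadoop code: the Attempt class, fragment_status and may_speculate are hypothetical names, and the sketch assumes that a single failed speculative attempt is enough to stop further speculation for that fragment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Attempt:
    """One attempt at a fragment (hypothetical model, not a Hadoop class)."""
    speculative: bool
    state: str  # "running", "succeeded" or "failed"


def fragment_status(attempts: List[Attempt]) -> str:
    """A fragment is alive as long as any of its attempts is still running."""
    if any(a.state == "succeeded" for a in attempts):
        return "succeeded"
    if any(a.state == "running" for a in attempts):
        return "running"
    return "failed"


def may_speculate(attempts: List[Attempt]) -> bool:
    """Assumed rule: no more backups once a speculative attempt has failed."""
    return not any(a.speculative and a.state == "failed" for a in attempts)


# The original attempt is still running; its backup has failed.
attempts = [Attempt(speculative=False, state="running"),
            Attempt(speculative=True, state="failed")]
status = fragment_status(attempts)
speculate = may_speculate(attempts)
```

Under these assumptions the fragment stays in the "running" state and no further backup is scheduled, so the job is not killed by the backup's failure.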

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Commented: (HADOOP-39) Job killed when backup tasks fail

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-39?page=comments#action_12423760 ] 
            
Owen O'Malley commented on HADOOP-39:
-------------------------------------

My goal with this would be to do the equivalent of "make -k" or a "best effort" job. If the option were set, the job would continue after a given TIP had failed 4 times, but that TIP would be abandoned.
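
To make the difference between the two behaviours concrete, here is a minimal Python sketch. It is a simulation only: run_job, JobResult and the best_effort flag are hypothetical names invented for illustration and do not correspond to any Hadoop API. The limit of 4 failures per TIP is the one stated in this issue.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

MAX_ATTEMPTS = 4  # failure limit per TIP, as stated in the issue


@dataclass
class JobResult:
    succeeded: List[str] = field(default_factory=list)
    abandoned: List[str] = field(default_factory=list)
    killed: bool = False


def run_job(tips: Dict[str, Callable[[int], bool]], best_effort: bool) -> JobResult:
    """Run each TIP up to MAX_ATTEMPTS times.

    tips maps a hypothetical TIP id to a callable that takes the 1-based
    attempt number and returns True on success.
    """
    result = JobResult()
    for tip_id, attempt in tips.items():
        for n in range(1, MAX_ATTEMPTS + 1):
            if attempt(n):
                result.succeeded.append(tip_id)
                break
        else:
            # The TIP failed MAX_ATTEMPTS times in a row.
            if not best_effort:
                result.killed = True  # current behaviour: the whole job is killed
                return result
            result.abandoned.append(tip_id)  # proposed: give up on this TIP only
    return result


tips = {
    "tip_0": lambda n: True,    # succeeds on the first attempt
    "tip_1": lambda n: False,   # always fails
    "tip_2": lambda n: n >= 3,  # succeeds on the third attempt
}

strict = run_job(tips, best_effort=False)
lenient = run_job(tips, best_effort=True)
```

With best_effort off, the job is killed as soon as tip_1 exhausts its attempts and tip_2 never runs. With it on, tip_1 is recorded as abandoned and the job goes on to complete tip_2, much as "make -k" keeps building the targets that do not depend on the failed one.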

> Job killed when backup tasks fail
> ---------------------------------
>
>                 Key: HADOOP-39
>                 URL: http://issues.apache.org/jira/browse/HADOOP-39
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Owen O'Malley
>
> I had a map job with side effects that meant that any speculative tasks would fail.
> Currently, the job tracker kills the job when the speculative task fails 4 times.
> It would be better to stop scheduling speculative tasks for that fragment, but let the job continue as long as one of the instances of that fragment continues to run.


        

[jira] Assigned: (HADOOP-39) Create a job-configurable best effort for job execution

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-39?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley reassigned HADOOP-39:
-----------------------------------

    Assignee: Arun C Murthy  (was: Owen O'Malley)

> Create a job-configurable best effort for job execution
> -------------------------------------------------------
>
>                 Key: HADOOP-39
>                 URL: https://issues.apache.org/jira/browse/HADOOP-39
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Owen O'Malley
>         Assigned To: Arun C Murthy
>
> I propose having a job option that when a tip fails 4 times, stops trying to run that tip, but does not kill the job.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-39) Create a job-configurable best effort for job execution

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-39?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HADOOP-39:
--------------------------------

    Description: I propose having a job option that when a tip fails 4 times, stops trying to run that tip, but does not kill the job.  (was: I had a map job with side effects that meant that any speculative tasks would fail.

Currently, the job tracker kills the job when the speculative task fails 4 times.

It would be better to stop scheduling speculative tasks for that fragment, but let the job continue as long as one of the instances of that fragment continues to run.)
        Summary: Create a job-configurable best effort for job execution  (was: Job killed when backup tasks fail)

> Create a job-configurable best effort for job execution
> -------------------------------------------------------
>
>                 Key: HADOOP-39
>                 URL: https://issues.apache.org/jira/browse/HADOOP-39
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Owen O'Malley
>         Assigned To: Owen O'Malley
>
> I propose having a job option that when a tip fails 4 times, stops trying to run that tip, but does not kill the job.



[jira] Resolved: (HADOOP-39) Create a job-configurable best effort for job execution

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-39?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy resolved HADOOP-39.
---------------------------------

    Resolution: Duplicate

Fixed as a part of HADOOP-1144

> Create a job-configurable best effort for job execution
> -------------------------------------------------------
>
>                 Key: HADOOP-39
>                 URL: https://issues.apache.org/jira/browse/HADOOP-39
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Owen O'Malley
>         Assigned To: Arun C Murthy
>
> I propose having a job option that when a tip fails 4 times, stops trying to run that tip, but does not kill the job.



[jira] Commented: (HADOOP-39) Job killed when backup tasks fail

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-39?page=comments#action_12366674 ] 

Doug Cutting commented on HADOOP-39:
------------------------------------

The point is to try to get map tasks with side effects to sometimes succeed, even with speculative execution?  That sounds like it could be a bad idea.  Wouldn't it be better to have map tasks with side effects fail more frequently with speculative execution, so that you find such problems sooner, with smaller datasets on a smaller cluster, before you try a big run?  Or am I misunderstanding you?

> Job killed when backup tasks fail
> ---------------------------------
>
>          Key: HADOOP-39
>          URL: http://issues.apache.org/jira/browse/HADOOP-39
>      Project: Hadoop
>         Type: Bug
>   Components: mapred
>     Reporter: Owen O'Malley
>
> I had a map job with side effects that meant that any speculative tasks would fail.
> Currently, the job tracker kills the job when the speculative task fails 4 times.
> It would be better to stop scheduling speculative tasks for that fragment, but let the job continue as long as one of the instances of that fragment continues to run.
