Posted to common-dev@hadoop.apache.org by "Christian Kunz (JIRA)" <ji...@apache.org> on 2007/03/22 06:10:32 UTC

[jira] Created: (HADOOP-1144) Hadoop should allow a configurable percentage of failed map tasks before declaring a job failed.

Hadoop should allow a configurable percentage of failed map tasks before declaring a job failed.
------------------------------------------------------------------------------------------------

                 Key: HADOOP-1144
                 URL: https://issues.apache.org/jira/browse/HADOOP-1144
             Project: Hadoop
          Issue Type: Improvement
          Components: mapred
    Affects Versions: 0.12.0
            Reporter: Christian Kunz
             Fix For: 0.13.0


In our environment, some map tasks can fail repeatedly because of corrupt input data, which is sometimes non-critical as long as the amount is limited. In such cases it is frustrating that the whole Hadoop job fails and cannot be restarted until the corrupt data are identified and eliminated from the input. It would be extremely helpful if the job configuration allowed users to indicate how many map tasks are allowed to fail.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-1144) Hadoop should allow a configurable percentage of failed map tasks before declaring a job failed.

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492634 ] 

Arun C Murthy commented on HADOOP-1144:
---------------------------------------

I'd like to propose a 'mapred.task.failures.percent' config knob that defaults to 0, which implies that *any* failed TIP leads to the job being declared a failure, i.e. the current behaviour. It could instead be set to '100', which leads to a 'best-effort' kind of job where TIPs that fail 4 times are abandoned as per HADOOP-39; that should also satisfy the requirements that Christian and Andrzej describe.

I'm not entirely comfortable with HADOOP-39 by itself, since it does not address the situation where a user might not want the job to run to completion when more than, say, 50% of maps fail.

W.r.t. the interface for determining failures, what do others think they need from it? Would it be useful, for example, to get the details of the 'input splits' of the TIPs which failed? Anything else?

Thoughts?
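A minimal sketch of the proposed semantics, purely for illustration (the class and method names below are invented, not the actual JobTracker code): with the knob at its default of 0, any failed TIP fails the job, while 100 gives the 'best-effort' behaviour where even abandoned TIPs never fail the job.

{code}
// Illustration only: how a 'mapred.task.failures.percent' style check could work.
public class FailurePolicySketch {
  /** Returns true if the job should be declared failed. */
  static boolean jobShouldFail(int failedTips, int totalTips, int allowedFailuresPercent) {
    if (totalTips == 0) {
      return false;
    }
    int failedPercent = (failedTips * 100) / totalTips;
    return failedPercent > allowedFailuresPercent;
  }

  public static void main(String[] args) {
    System.out.println(jobShouldFail(1, 100, 0));     // true: default 0 keeps current behaviour
    System.out.println(jobShouldFail(100, 100, 100)); // false: 100 means 'best-effort'
  }
}
{code}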



[jira] Commented: (HADOOP-1144) Hadoop should allow a configurable percentage of failed map tasks before declaring a job failed.

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492727 ] 

Doug Cutting commented on HADOOP-1144:
--------------------------------------

bq. Christian: could be made configurable separately for mappers and reducers

I agree that it makes sense to have separate parameters for map and reduce, something like mapred.max.map.failures.percent and mapred.max.reduce.failures.percent.  These should be settable from JobConf.
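For illustration, a hedged sketch of what setting such parameters from a JobConf might look like; the property keys below simply follow the names suggested above and may not match what the final patch uses (it could instead add dedicated setter methods):

{code}
import org.apache.hadoop.mapred.JobConf;

public class FailureTolerantJobSetup {
  public static void main(String[] args) {
    JobConf conf = new JobConf(FailureTolerantJobSetup.class);
    conf.setJobName("failure-tolerant-example");
    // Assumed keys, per the suggestion above -- verify against the committed patch.
    conf.setInt("mapred.max.map.failures.percent", 10);   // tolerate up to 10% failed maps
    conf.setInt("mapred.max.reduce.failures.percent", 0); // zero tolerance for failed reduces
    // ... configure mapper, reducer, input/output as usual, then submit the job.
  }
}
{code}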

bq. Owen: need [..] an interface to determine how many of the maps and reduces failed.

Could we use counters for this?
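If counters are used, retrieving them on the client side might look roughly like the sketch below. It assumes RunningJob exposes the job's counters via getCounters(), and the exact counter group and names for failed maps/reduces would depend on what the framework ends up publishing:

{code}
import java.io.IOException;

import org.apache.hadoop.mapred.Counters;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

public class InspectFailureCounters {
  public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf(InspectFailureCounters.class);
    // ... configure the job as usual ...
    RunningJob job = JobClient.runJob(conf); // blocks until the job completes
    Counters counters = job.getCounters();
    // Dump everything; pick out the failed-map/failed-reduce counters once
    // their group and names are known.
    System.out.println(counters);
  }
}
{code}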




[jira] Commented: (HADOOP-1144) Hadoop should allow a configurable percentage of failed map tasks before declaring a job failed.

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492636 ] 

Andrzej Bialecki  commented on HADOOP-1144:
-------------------------------------------

+1 for mapred.task.failures.percent. Another scenario where we might want to salvage partially completed map tasks is when they work with corrupted input data. It would be nice to have a similar knob for InputFormat, so that it tolerates data that causes the RecordReader to throw an exception without failing the TIP, i.e. it treats such errors as a regular end of the input data.



[jira] Updated: (HADOOP-1144) Hadoop should allow a configurable percentage of failed map tasks before declaring a job failed.

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-1144:
----------------------------------

    Status: Patch Available  (was: Open)



[jira] Assigned: (HADOOP-1144) Hadoop should allow a configurable percentage of failed map tasks before declaring a job failed.

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley reassigned HADOOP-1144:
-------------------------------------

    Assignee: Arun C Murthy



[jira] Commented: (HADOOP-1144) Hadoop should allow a configurable percentage of failed map tasks before declaring a job failed.

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12493649 ] 

Hadoop QA commented on HADOOP-1144:
-----------------------------------

Integrated in Hadoop-Nightly #78 (See http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/78/)



[jira] Updated: (HADOOP-1144) Hadoop should allow a configurable percentage of failed map tasks before declaring a job failed.

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-1144:
----------------------------------

    Attachment: HADOOP-1144_20070503_1.patch

Here is a patch which allows a configurable number of tasks to fail before the job is declared a 'failure'. The number of failed tasks can also be accessed via the per-job counters.



[jira] Commented: (HADOOP-1144) Hadoop should allow a configurable percentage of failed map tasks before declaring a job failed.

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492656 ] 

Arun C Murthy commented on HADOOP-1144:
---------------------------------------

> It would be nice to have a similar knob for InputFormat, so that it tolerates data that causes RecordReader to throw an exception without failing the TIP - i.e. to treat such errors as a regular end of input data.

I'd rather have the user implement a simple sub-class of the RecordReader in question to ignore the exception and return 'false' from next(key, value) - that should be very easy, no?
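As a rough sketch of that suggestion (written as a wrapper rather than a subclass of any particular reader, against the generified RecordReader interface of later releases; the pre-generics interface of this era is analogous), swallowing the exception in next() and treating it as end-of-input could look like this. An InputFormat would still have to hand out the wrapped reader:

{code}
import java.io.IOException;

import org.apache.hadoop.mapred.RecordReader;

/** Wraps a RecordReader and treats any exception from next() as a normal end of input. */
public class LenientRecordReader<K, V> implements RecordReader<K, V> {
  private final RecordReader<K, V> delegate;
  private boolean done = false;

  public LenientRecordReader(RecordReader<K, V> delegate) {
    this.delegate = delegate;
  }

  public boolean next(K key, V value) throws IOException {
    if (done) {
      return false;
    }
    try {
      return delegate.next(key, value);
    } catch (IOException e) {
      done = true;   // corrupt record: pretend the split ended here
      return false;
    } catch (RuntimeException e) {
      done = true;
      return false;
    }
  }

  public K createKey() { return delegate.createKey(); }
  public V createValue() { return delegate.createValue(); }
  public long getPos() throws IOException { return delegate.getPos(); }
  public float getProgress() throws IOException { return delegate.getProgress(); }
  public void close() throws IOException { delegate.close(); }
}
{code}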



[jira] Updated: (HADOOP-1144) Hadoop should allow a configurable percentage of failed map tasks before declaring a job failed.

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-1144:
---------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this.  Thanks, Arun!



[jira] Commented: (HADOOP-1144) Hadoop should allow a configurable percentage of failed map tasks before declaring a job failed.

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492773 ] 

Andrzej Bialecki  commented on HADOOP-1144:
-------------------------------------------

bq. I'd rather have the user implement a simple sub-class of the RecordReader in question to ignore the exception and return 'false' from next(key, value) - that should be very easy, no?

Yes, it would. However, you are usually dealing with the same application and with changing data, and in most cases the data is valid. So it's easier to accept corrupted input data for a single job by turning a config knob than by re-implementing all of your InputFormats ...



[jira] Commented: (HADOOP-1144) Hadoop should allow a configurable percentage of failed map tasks before declaring a job failed.

Posted by "Christian Kunz (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492697 ] 

Christian Kunz commented on HADOOP-1144:
----------------------------------------

+1 for mapred.task.failures.percent (obviously).
I just added HADOOP-1304 asking for a configurable number of retries for mappers and reducers. In this context, it would be even better if mapred.task.failures.percent could be configured separately for mappers and reducers (in our environment there is usually some tolerance for mapper failures, but zero tolerance for reducer failures).



[jira] Commented: (HADOOP-1144) Hadoop should allow a configurable percentage of failed map tasks before declaring a job failed.

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492625 ] 

Owen O'Malley commented on HADOOP-1144:
---------------------------------------

I would rather take the approach of HADOOP-39 and have a configuration that allows a job's TIPs to fail 4 times without killing the rest of the job. There also needs to be an interface to determine how many of the maps and reduces failed.



[jira] Commented: (HADOOP-1144) Hadoop should allow a configurable percentage of failed map tasks before declaring a job failed.

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12484207 ] 

Andrzej Bialecki  commented on HADOOP-1144:
-------------------------------------------

Nutch could use this feature too: it's quite common that one of the map tasks, which is e.g. parsing difficult content such as PDF or MS Word documents, crashes or gets stuck. This should not be fatal to the whole job.

As for the configuration of the number of failed tasks, I think it would be good to be able to choose between an absolute number and a percentage.
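As a hypothetical illustration of why that choice matters (the helper below is invented, not part of any patch): an integer percentage is coarse for large jobs, so translating an absolute failure budget into the percent knob can round the intent away.

{code}
// Hypothetical helper: translate an absolute budget of failed tasks into the
// integer-percent knob under discussion. Not part of the proposed patch.
public class FailureBudget {
  static int percentFor(int allowedFailedTasks, int totalTasks) {
    if (totalTasks <= 0) {
      return 0;
    }
    return Math.min(100, (allowedFailedTasks * 100) / totalTasks);
  }

  public static void main(String[] args) {
    // A budget of 5 failed maps out of 2000 truncates to 0 percent,
    // i.e. no tolerance at all.
    System.out.println(percentFor(5, 2000));
  }
}
{code}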



[jira] Commented: (HADOOP-1144) Hadoop should allow a configurable percentage of failed map tasks before declaring a job failed.

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12493264 ] 

Hadoop QA commented on HADOOP-1144:
-----------------------------------

+1

http://issues.apache.org/jira/secure/attachment/12356676/HADOOP-1144_20070503_1.patch applied and successfully tested against trunk revision r534624.

Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/108/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/108/console



[jira] Commented: (HADOOP-1144) Hadoop should allow a configurable percentage of failed map tasks before declaring a job failed.

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12493011 ] 

Owen O'Malley commented on HADOOP-1144:
---------------------------------------

I guess I'm OK with mapred.max.{map,reduce}.failures.percent, although if we are trying to make the names somewhat hierarchical, it should be more like mapred.task.{map,reduce}.percent-failures.max or some such.

Using counters to count failed TIPs would make sense, since we already have the infrastructure to get them. Does the JobClient let you get the counters for individual TIPs?

I believe there is already a bug to have the framework skip bad records. That would be a better solution, in my opinion, since it handles both input and processing.
