Posted to common-dev@hadoop.apache.org by "Arun C Murthy (JIRA)" <ji...@apache.org> on 2007/05/03 01:34:16 UTC

[jira] Created: (HADOOP-1322) Tasktracker blacklist leads to hung jobs in single-node cluster

Tasktracker blacklist leads to hung jobs in single-node cluster
---------------------------------------------------------------

                 Key: HADOOP-1322
                 URL: https://issues.apache.org/jira/browse/HADOOP-1322
             Project: Hadoop
          Issue Type: Bug
          Components: mapred
            Reporter: Arun C Murthy
         Assigned To: Arun C Murthy
             Fix For: 0.13.0


Post HADOOP-1278, adding _the_ tasktracker to the blacklist in single-node clusters leads to a situation where the entire cluster is 'flaky' and no trackers are available to execute tasks. 

The straightforward fix is to check before adding the tracker to the blacklist.
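
As a rough illustration of such a check (a minimal sketch, not the attached patch): the class TrackerBlacklist and its methods are hypothetical names invented for this example and do not correspond to Hadoop's actual mapred code. The guard refuses to blacklist a tracker when doing so would leave no tracker able to run tasks.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical bookkeeping class for illustration only; not Hadoop's API.
class TrackerBlacklist {
    private final Set<String> knownTrackers = new HashSet<>();
    private final Set<String> blacklisted = new HashSet<>();

    void register(String tracker) {
        knownTrackers.add(tracker);
    }

    // Blacklists the tracker unless that would leave zero usable trackers.
    // Returns true if the tracker is on the blacklist after the call.
    boolean tryBlacklist(String tracker) {
        if (!knownTrackers.contains(tracker)) {
            return false;
        }
        if (blacklisted.contains(tracker)) {
            return true;
        }
        int usableAfter = knownTrackers.size() - blacklisted.size() - 1;
        if (usableAfter < 1) {
            // Single-node case: keep the only tracker available.
            return false;
        }
        blacklisted.add(tracker);
        return true;
    }

    boolean isUsable(String tracker) {
        return knownTrackers.contains(tracker) && !blacklisted.contains(tracker);
    }
}
```

With a single registered tracker, tryBlacklist returns false and the tracker stays usable, so tasks can still be scheduled.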

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-1322) Tasktracker blacklist leads to hung jobs in single-node cluster

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-1322:
----------------------------------

    Status: Patch Available  (was: Open)




[jira] Commented: (HADOOP-1322) Tasktracker blacklist leads to hung jobs in single-node cluster

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12493258 ] 

Hadoop QA commented on HADOOP-1322:
-----------------------------------

+1

http://issues.apache.org/jira/secure/attachment/12356679/HADOOP-1322_20070503_2.patch applied and successfully tested against trunk revision r534624.

Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/107/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/107/console




[jira] Updated: (HADOOP-1322) Tasktracker blacklist leads to hung jobs in single-node cluster

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-1322:
---------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this.  Thanks, Arun!




[jira] Commented: (HADOOP-1322) Tasktracker blacklist leads to hung jobs in single-node cluster

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12493485 ] 

Doug Cutting commented on HADOOP-1322:
--------------------------------------

I forgot to include the bug ID in the commit message.  So, for the record, this is in revision 534973.

http://svn.apache.org/viewvc?view=rev&revision=534973




[jira] Updated: (HADOOP-1322) Tasktracker blacklist leads to hung jobs in single-node cluster

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-1322:
----------------------------------

    Priority: Critical  (was: Major)




[jira] Updated: (HADOOP-1322) Tasktracker blacklist leads to hung jobs in single-node cluster

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-1322:
----------------------------------

    Attachment: HADOOP-1322_20070503_2.patch

This one should be it: it checks that too many trackers aren't already on the blacklist before rejecting the tracker.
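
To make the idea concrete, here is a minimal sketch of a check made at the moment a tracker would be rejected. It is an illustration under stated assumptions, not the contents of the attached patch: FlakyTrackerPolicy, maxFailuresPerTracker, and maxFlakyFraction are hypothetical names and parameters chosen for this example. A tracker with too many task failures is rejected only while the number of such trackers stays within a fraction of the cluster size.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical policy class for illustration only; not Hadoop's API.
class FlakyTrackerPolicy {
    private final Map<String, Integer> failures = new HashMap<>();
    private final int maxFailuresPerTracker; // assumed per-tracker failure limit
    private final double maxFlakyFraction;   // assumed cap on the blacklisted share

    FlakyTrackerPolicy(int maxFailuresPerTracker, double maxFlakyFraction) {
        this.maxFailuresPerTracker = maxFailuresPerTracker;
        this.maxFlakyFraction = maxFlakyFraction;
    }

    void recordFailure(String tracker) {
        failures.merge(tracker, 1, Integer::sum);
    }

    // Number of trackers that have reached the failure limit.
    private int flakyCount() {
        int count = 0;
        for (int f : failures.values()) {
            if (f >= maxFailuresPerTracker) {
                count++;
            }
        }
        return count;
    }

    // Reject a flaky tracker only while the blacklist is within budget.
    boolean shouldRunOn(String tracker, int clusterSize) {
        boolean trackerIsFlaky =
            failures.getOrDefault(tracker, 0) >= maxFailuresPerTracker;
        boolean withinBudget = flakyCount() <= clusterSize * maxFlakyFraction;
        return !(trackerIsFlaky && withinBudget);
    }
}
```

In a single-node cluster the one flaky tracker already exceeds any fractional budget below 1.0, so it is never rejected and the job can keep making progress.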




[jira] Updated: (HADOOP-1322) Tasktracker blacklist leads to hung jobs in single-node cluster

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-1322:
----------------------------------

    Status: Patch Available  (was: Open)




[jira] Commented: (HADOOP-1322) Tasktracker blacklist leads to hung jobs in single-node cluster

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12493251 ] 

Hadoop QA commented on HADOOP-1322:
-----------------------------------

+1

http://issues.apache.org/jira/secure/attachment/12356678/HADOOP-1322_20070503_1.patch applied and successfully tested against trunk revision r534624.

Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/106/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/106/console




[jira] Updated: (HADOOP-1322) Tasktracker blacklist leads to hung jobs in single-node cluster

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-1322:
----------------------------------

    Status: Open  (was: Patch Available)

Koji pointed out another corner case:
There are two tasktrackers in the cluster; one is put on the blacklist, and then the other one goes down. This patch won't handle that.
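
The scenario can be reproduced with a small model (a sketch only; AddTimeGuardDemo and its methods are hypothetical names for this illustration, not Hadoop code). It assumes a guard that runs only at the moment a tracker is added to the blacklist, which passes while two trackers are alive and is never re-evaluated when the other tracker is later lost.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of a guard applied only when blacklisting; not Hadoop's API.
class AddTimeGuardDemo {
    final Set<String> live = new HashSet<>();
    final Set<String> blacklisted = new HashSet<>();

    // Blacklist the tracker only if another usable live tracker remains right now.
    boolean tryBlacklist(String tracker) {
        int usableAfter = 0;
        for (String t : live) {
            if (!t.equals(tracker) && !blacklisted.contains(t)) {
                usableAfter++;
            }
        }
        if (usableAfter < 1) {
            return false;
        }
        blacklisted.add(tracker);
        return true;
    }

    // A tracker going down does not revisit earlier blacklist decisions.
    void trackerLost(String tracker) {
        live.remove(tracker);
    }

    // Live trackers that are not blacklisted.
    int usableCount() {
        int count = 0;
        for (String t : live) {
            if (!blacklisted.contains(t)) {
                count++;
            }
        }
        return count;
    }
}
```

After registering two trackers, blacklisting the first, and then losing the second, usableCount() drops to zero even though the guard passed when it was evaluated. Re-checking the condition each time a tracker asks for work is one way to avoid this.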




[jira] Updated: (HADOOP-1322) Tasktracker blacklist leads to hung jobs in single-node cluster

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-1322:
----------------------------------

    Attachment: HADOOP-1322_20070503_1.patch

Simple fix.

