Posted to common-dev@hadoop.apache.org by "dhruba borthakur (JIRA)" <ji...@apache.org> on 2009/06/19 02:38:07 UTC

[jira] Created: (HADOOP-6082) Elegant decommission of lightly loaded tasktrackers from a map-reduce cluster

Elegant decommission of lightly loaded tasktrackers from a map-reduce cluster
-----------------------------------------------------------------------------

                 Key: HADOOP-6082
                 URL: https://issues.apache.org/jira/browse/HADOOP-6082
             Project: Hadoop Core
          Issue Type: New Feature
          Components: mapred
            Reporter: dhruba borthakur
            Assignee: Namit Jain


There is a need to elegantly move some machines from one map-reduce cluster to another. This JIRA is to discuss how to find lightly loaded tasktrackers that are candidates for decommissioning and then to elegantly decommission them by waiting for existing tasks to finish.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-6082) Elegant decommission of lightly loaded tasktrackers from a map-reduce cluster

Posted by "Rodrigo Schmidt (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721962#action_12721962 ] 

Rodrigo Schmidt commented on HADOOP-6082:
-----------------------------------------

I think we should make the jobtracker as light as possible. Dhruba's option sounds a bit better in that sense, since the burden of choosing the tasktrackers to be decommissioned is pushed to some external tool.

[jira] Commented: (HADOOP-6082) Elegant decommission of lightly loaded tasktrackers from a map-reduce cluster

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721669#action_12721669 ] 

dhruba borthakur commented on HADOOP-6082:
------------------------------------------

Another option is to expose an API from the JobTracker that retrieves the status of all known slave nodes and the load (#running maps, #running reduces) on each slave node. (This would be equivalent to the bin/hadoop dfsadmin -report command for HDFS.)
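
As a rough illustration of this option, here is a minimal sketch of what such a per-node load report might look like and how an external tool could pick candidates from it. Every name here (TrackerLoadReport, TrackerLoad, lightestLoaded, the host names) is hypothetical and made up for this sketch; none of it is an existing JobTracker API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch only; these are not Hadoop classes. */
public class TrackerLoadReport {

    /** One report row: a slave node and its currently running task counts. */
    public static final class TrackerLoad {
        public final String host;
        public final int runningMaps;
        public final int runningReduces;

        public TrackerLoad(String host, int runningMaps, int runningReduces) {
            this.host = host;
            this.runningMaps = runningMaps;
            this.runningReduces = runningReduces;
        }

        public int totalRunning() {
            return runningMaps + runningReduces;
        }
    }

    /** Returns up to n trackers with the fewest running tasks, lightest first. */
    public static List<TrackerLoad> lightestLoaded(List<TrackerLoad> report, int n) {
        List<TrackerLoad> sorted = new ArrayList<>(report);
        // Sort by total running tasks; break ties by host name for a stable result.
        Comparator<TrackerLoad> byLoad = Comparator.comparingInt(TrackerLoad::totalRunning);
        sorted.sort(byLoad.thenComparing(t -> t.host));
        int k = Math.max(0, Math.min(n, sorted.size()));
        return new ArrayList<>(sorted.subList(0, k));
    }

    /** Renders the report as one line per node. */
    public static String format(List<TrackerLoad> report) {
        StringBuilder sb = new StringBuilder();
        for (TrackerLoad t : report) {
            sb.append(t.host)
              .append(" maps=").append(t.runningMaps)
              .append(" reduces=").append(t.runningReduces)
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for what the JobTracker would return.
        List<TrackerLoad> report = Arrays.asList(
                new TrackerLoad("tt1", 4, 2),
                new TrackerLoad("tt2", 0, 1),
                new TrackerLoad("tt3", 1, 0),
                new TrackerLoad("tt4", 0, 0));
        System.out.print(format(lightestLoaded(report, 2)));
    }
}
```

With a report like this, the choice of which nodes to decommission stays entirely in the external tool; the JobTracker only has to publish the numbers.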



[jira] Commented: (HADOOP-6082) Elegant decommission of lightly loaded tasktrackers from a map-reduce cluster

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721607#action_12721607 ] 

Namit Jain commented on HADOOP-6082:
------------------------------------

The JobTracker needs a new API:

something like: 

decommissionTaskTrackers(int numberOfTT);

The TaskTracker may need to expose a new API to return its current load.

The JobTracker will get the current load of each TaskTracker and then decide to decommission the
most lightly loaded 'n' tasktrackers.

When a TaskTracker is being decommissioned, it will stop accepting new jobs, and will
die when all the current jobs are finished. This may lead to wastage of resources in the cluster.

The jobtracker can optionally pass a timeout after which the tasktracker will definitely die.
At that time, it might be a good idea to increase the number of retries for the tasks being
executed.

The UI may need to be changed to show the new status of the tasktracker as well.
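
To make the drain-then-die behaviour concrete, here is a minimal sketch of the state a decommissioning tasktracker might go through, including the optional timeout. The class, its states and its methods (DrainingTracker, offerTask, startDecommission, tick) are assumptions invented for this illustration, not existing TaskTracker code. The clock is passed in explicitly so that the behaviour is easy to follow.

```java
/** Hypothetical sketch of a draining tasktracker; not Hadoop code. */
public class DrainingTracker {

    public enum State { ACTIVE, DRAINING, DEAD }

    private State state = State.ACTIVE;
    private int runningTasks = 0;
    private long deadlineMillis = Long.MAX_VALUE; // no deadline by default

    /** Tries to assign a new task; only an ACTIVE tracker accepts work. */
    public boolean offerTask() {
        if (state != State.ACTIVE) {
            return false;
        }
        runningTasks++;
        return true;
    }

    /** Records that one running task finished, then re-checks for shutdown. */
    public void taskFinished(long nowMillis) {
        if (runningTasks > 0) {
            runningTasks--;
        }
        tick(nowMillis);
    }

    /**
     * Starts decommissioning. A negative timeout means "wait indefinitely";
     * otherwise the tracker dies once nowMillis + timeoutMillis is reached,
     * even if tasks are still running.
     */
    public void startDecommission(long nowMillis, long timeoutMillis) {
        if (state != State.ACTIVE) {
            return;
        }
        state = State.DRAINING;
        deadlineMillis = timeoutMillis < 0 ? Long.MAX_VALUE : nowMillis + timeoutMillis;
        tick(nowMillis);
    }

    /** Moves a DRAINING tracker to DEAD when it is idle or past its deadline. */
    public void tick(long nowMillis) {
        if (state == State.DRAINING
                && (runningTasks == 0 || nowMillis >= deadlineMillis)) {
            state = State.DEAD;
        }
    }

    public State state() {
        return state;
    }

    public int runningTasks() {
        return runningTasks;
    }
}
```

The DRAINING state is also what the UI would presumably need to display. Any tasks still running when the deadline fires would have to be rescheduled elsewhere, which is where the extra retries mentioned above would come in.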

[jira] Commented: (HADOOP-6082) Elegant decommission of lightly loaded tasktrackers from a map-reduce cluster

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721672#action_12721672 ] 

dhruba borthakur commented on HADOOP-6082:
------------------------------------------

Also, we might not need any changes to the decommissioning feature because when a node is decommissioned, currently running tasks are "killed" and not "failed", so they do not increase the probability of a job failure.
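
The killed-versus-failed distinction can be sketched roughly as follows, assuming that only failed attempts count toward a task's attempt limit. The class and method names are hypothetical and chosen for this illustration; this is not the JobTracker's actual bookkeeping.

```java
/** Hypothetical sketch of attempt accounting; not Hadoop's actual classes. */
public class AttemptAccounting {

    private final int maxFailedAttempts;
    private int failedAttempts = 0;
    private int killedAttempts = 0;

    public AttemptAccounting(int maxFailedAttempts) {
        if (maxFailedAttempts < 1) {
            throw new IllegalArgumentException("maxFailedAttempts must be >= 1");
        }
        this.maxFailedAttempts = maxFailedAttempts;
    }

    /** An attempt that failed on its own merits counts against the limit. */
    public void recordFailed() {
        failedAttempts++;
    }

    /** An attempt killed by the framework (e.g. node decommissioned) does not. */
    public void recordKilled() {
        killedAttempts++;
    }

    /** The task, and hence its job, is in trouble only after enough real failures. */
    public boolean taskHasFailed() {
        return failedAttempts >= maxFailedAttempts;
    }

    public int failedAttempts() {
        return failedAttempts;
    }

    public int killedAttempts() {
        return killedAttempts;
    }
}
```

Under this assumption, decommissioning a node costs the killed attempts' work but does not bring any task closer to its attempt limit.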
