Posted to common-dev@hadoop.apache.org by "Vivek Ratan (JIRA)" <ji...@apache.org> on 2008/11/12 04:23:44 UTC

[jira] Commented: (HADOOP-4624) CapacityTaskScheduler.MapSchedulingMgr.killTasksFromJob() will not work as expected

    [ https://issues.apache.org/jira/browse/HADOOP-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646787#action_12646787 ] 

Vivek Ratan commented on HADOOP-4624:
-------------------------------------

The code does the right thing. It looks first for maps that are non-local to any node in the cluster, then for the others. It assumes, however, that a job may have non-local maps (i.e., the JobInProgress object's _nonLocalRunningMaps_ structure is not empty) and other running maps (in the _runningMapCache_ structure) at the same time. Amar informs me that the two are mutually exclusive, i.e., for any given job one of the two structures will be empty. So the right thing to do is to update the comment in CapacityTaskScheduler.killTasksFromJob() to reflect this, and to wrap the calls to _job.getNonLocalRunningMaps()_ and _job.getRunningMapCache()_ in an if...then...else block.
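
A rough sketch of what that if...then...else could look like is below. It is not the actual patch: JobSketch is a hypothetical stand-in for JobInProgress, tasks are represented as plain strings, the return types of the two accessors are assumed, and pickMapsToKill() and its tasksToKill parameter are made-up names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for JobInProgress. Only the two accessors discussed
// above are modelled, and their return types here are assumptions.
class JobSketch {
    private final Set<String> nonLocalRunningMaps = new LinkedHashSet<>();
    private final Map<String, Set<String>> runningMapCache = new LinkedHashMap<>();

    Set<String> getNonLocalRunningMaps() { return nonLocalRunningMaps; }
    Map<String, Set<String>> getRunningMapCache() { return runningMapCache; }
}

class KillOrderSketch {
    // Picks up to 'tasksToKill' running maps of the job. Exactly one of the two
    // structures is read, since at most one of them is expected to be non-empty.
    static List<String> pickMapsToKill(JobSketch job, int tasksToKill) {
        List<String> picked = new ArrayList<>();
        if (!job.getNonLocalRunningMaps().isEmpty()) {
            // Maps that are not local to any node (e.g. random-writer).
            addUpTo(picked, job.getNonLocalRunningMaps(), tasksToKill);
        } else {
            // Otherwise the running maps are tracked in the per-node cache.
            for (Set<String> maps : job.getRunningMapCache().values()) {
                addUpTo(picked, maps, tasksToKill);
            }
        }
        return picked;
    }

    private static void addUpTo(List<String> picked, Collection<String> maps, int limit) {
        for (String map : maps) {
            if (picked.size() >= limit) {
                return;
            }
            // Skip duplicates, assuming a map may be listed under more than one node.
            if (!picked.contains(map)) {
                picked.add(map);
            }
        }
    }

    public static void main(String[] args) {
        JobSketch job = new JobSketch();
        job.getRunningMapCache().put("node1", new LinkedHashSet<>(Arrays.asList("map_0", "map_1")));
        job.getRunningMapCache().put("rack1", new LinkedHashSet<>(Arrays.asList("map_1", "map_2")));
        System.out.println(pickMapsToKill(job, 2));
    }
}
```

Because only one branch ever runs, the comment in killTasksFromJob() can describe the two structures as alternatives rather than as a kill order.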

> CapacityTaskScheduler.MapSchedulingMgr.killTasksFromJob() will not work as expected
> -----------------------------------------------------------------------------------
>
>                 Key: HADOOP-4624
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4624
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>            Reporter: Amar Kamat
>
> Once the capacity scheduler decides to kill tasks, it selects running jobs from the queue and calls {{killTasksFromJob()}} on them. It kills tasks in the following order:
> - non-local maps
> - local maps
> _Killing non-local maps:_
> The code here uses {{JobInProgress.getNonLocalRunningMaps()}}. HADOOP-2119 introduced this for handling cases like _random-writer_. Hence this API will return an empty structure if there are reducers in the job, and the code fails to serve its purpose.
