Posted to common-dev@hadoop.apache.org by "Vivek Ratan (JIRA)" <ji...@apache.org> on 2009/01/07 12:30:44 UTC

[jira] Commented: (HADOOP-4988) An earlier fix, for HADOOP-4373, results in a problem with reclaiming capacity when one or more queues have a capacity equal to zero

    [ https://issues.apache.org/jira/browse/HADOOP-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661526#action_12661526 ] 

Vivek Ratan commented on HADOOP-4988:
-------------------------------------

A detailed explanation: in TaskSchedulingMgr.reclaimCapacity, we stop looking for capacity to reclaim if no queue is running over capacity. We determine this by looking at the last queue in the sorted list and checking whether its number of running tasks is <= its guaranteed capacity (gc). If queues with gc=0 are placed at the end of that list, this condition is true for the last queue, and we stop looking for capacity to reclaim on the very first pass, even if other queues are running over capacity.

Queues with gc=0 should be treated the same as queues running exactly at capacity (# of running tasks == gc).
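
To make the failure mode concrete, here is a minimal, self-contained sketch. It is not the actual capacity scheduler code: the class ReclaimSketch, the QueueInfo type, both comparators, and anyQueueOverCapacity are hypothetical names invented for illustration, and the ordering is assumed to be by the ratio of running tasks to gc. It only shows how a "check the last queue" test behaves under the two orderings.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only; none of these names come from the Hadoop source.
class ReclaimSketch {

    // Hypothetical stand-in for the scheduler's per-queue bookkeeping.
    static final class QueueInfo {
        final String name;
        final int gc;       // guaranteed capacity, in slots
        final int running;  // number of running tasks

        QueueInfo(String name, int gc, int running) {
            this.name = name;
            this.gc = gc;
            this.running = running;
        }
    }

    // Assumed ordering after HADOOP-4373: queues with gc=0 always sort last.
    static final Comparator<QueueInfo> ZERO_LAST = (a, b) -> {
        if (a.gc == 0 && b.gc == 0) return 0;
        if (a.gc == 0) return 1;
        if (b.gc == 0) return -1;
        return Double.compare((double) a.running / a.gc,
                              (double) b.running / b.gc);
    };

    // Proposed treatment: a queue with gc=0 counts as running exactly at capacity.
    static double usedRatio(QueueInfo q) {
        return q.gc == 0 ? 1.0 : (double) q.running / q.gc;
    }

    static final Comparator<QueueInfo> ZERO_AT_CAPACITY =
            Comparator.comparingDouble(ReclaimSketch::usedRatio);

    // Mirrors the early-exit test described above: sort the queues, then
    // inspect only the last one to decide whether any queue is over capacity.
    static boolean anyQueueOverCapacity(List<QueueInfo> queues,
                                        Comparator<QueueInfo> order) {
        List<QueueInfo> sorted = new ArrayList<>(queues);
        sorted.sort(order);
        QueueInfo last = sorted.get(sorted.size() - 1);
        return last.running > last.gc;
    }
}
```

With three queues (one over capacity, one under capacity, one with gc=0), the ZERO_LAST ordering puts the gc=0 queue last, so the check reports that nothing is over capacity and reclaiming would stop. Under ZERO_AT_CAPACITY the over-capacity queue sorts last and the check finds it.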

> An earlier fix, for HADOOP-4373, results in a problem with reclaiming capacity when one or more queues have a capacity equal to zero
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4988
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4988
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>            Reporter: Vivek Ratan
>            Priority: Blocker
>
> HADOOP-4373 introduced a fix for queues with guaranteed capacity (gc) equal to zero. Part of the fix was in the queue comparator used to sort queues. Queues with gc=0 were placed at the end. This causes a problem with the code for reclaiming capacity, which assumes that queues are sorted based on free space available and that a queue with gc=0 is no different than a queue which is running at capacity. Because of this, the following problem can arise: if we have a system with at least one queue whose gc=0, we may fail to reclaim capacity for some queues. 
