Posted to mapreduce-issues@hadoop.apache.org by "Robert Joseph Evans (JIRA)" <ji...@apache.org> on 2012/08/23 16:44:42 UTC

[jira] [Commented] (MAPREDUCE-4578) Handle container requests that request more resources than available in the cluster

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440344#comment-13440344 ] 

Robert Joseph Evans commented on MAPREDUCE-4578:
------------------------------------------------

I have thought about this before in relation to the 1.0 branch, for another JIRA that never really came to fruition.  In the situation you described, the only real way to do this is with a timeout.  If there are large nodes that for some reason will "never" become available, the RM has to predict what "never" really means.  A long-lived task, even the AM itself, could be running on one of them, and predicting whether that node will ever free up amounts to the halting problem.  I would suggest that the timeout not be a global setting, because some applications may want to wait longer than others for resources that will "never" show up.
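
A minimal sketch of the per-application timeout idea, in Java.  Every name here (UnsatisfiableRequestTracker, PendingRequest, expire) is hypothetical and only for illustration; none of it is existing ResourceManager or scheduler code.  The point is that each pending request carries its own timeout, chosen by the application, rather than reading one global setting.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch; not part of the YARN ResourceManager API. */
final class UnsatisfiableRequestTracker {

    /** One pending container request, with the timeout its application chose. */
    static final class PendingRequest {
        final String appId;
        final int memoryMb;
        final long submittedAtMs;
        final long timeoutMs; // per-application, not a cluster-wide setting

        PendingRequest(String appId, int memoryMb, long submittedAtMs, long timeoutMs) {
            this.appId = appId;
            this.memoryMb = memoryMb;
            this.submittedAtMs = submittedAtMs;
            this.timeoutMs = timeoutMs;
        }
    }

    private final List<PendingRequest> pending = new ArrayList<>();

    /** Records a request that could not be satisfied immediately. */
    void add(PendingRequest request) {
        pending.add(request);
    }

    /**
     * Removes and returns every request that has waited at least as long as
     * its own timeout.  The caller would then report these back to the
     * application as invalidated.
     */
    List<PendingRequest> expire(long nowMs) {
        List<PendingRequest> expired = new ArrayList<>();
        Iterator<PendingRequest> it = pending.iterator();
        while (it.hasNext()) {
            PendingRequest r = it.next();
            if (nowMs - r.submittedAtMs >= r.timeoutMs) {
                it.remove();
                expired.add(r);
            }
        }
        return expired;
    }

    /** Number of requests still waiting. */
    int pendingCount() {
        return pending.size();
    }
}
```

With this shape, two applications asking for the same oversized container can give up at different times, and the RM never has to decide for itself what "never" means.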
                
> Handle container requests that request more resources than available in the cluster
> -----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4578
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4578
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 0.23.0, 2.0.0-alpha
>            Reporter: Hitesh Shah
>
> In heterogeneous clusters, a simple scheduler check that the allocation request is within the max allocatable range is not enough. 
> If there are large nodes in the cluster which are not available, there may be situations where some allocation requests will never be fulfilled. Need an approach to decide when to invalidate such requests. For application submissions, there will need to be a feedback loop for applications that could not be launched. For running AMs, AllocationResponse may need to be augmented with information about invalidated/cancelled container requests.
