Posted to mapreduce-dev@hadoop.apache.org by "Herman Chen (JIRA)" <ji...@apache.org> on 2012/06/01 23:33:24 UTC
[jira] [Created] (MAPREDUCE-4304) Deadlock where all containers are held by ApplicationMasters should be prevented
Herman Chen created MAPREDUCE-4304:
--------------------------------------
Summary: Deadlock where all containers are held by ApplicationMasters should be prevented
Key: MAPREDUCE-4304
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4304
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2, resourcemanager
Affects Versions: 0.23.1
Reporter: Herman Chen
In my test cluster of 4 NodeManagers, each with only ~1.6G of container memory, when a burst of jobs (e.g. more than 10) is submitted concurrently, it is likely that 4 jobs are accepted and 4 ApplicationMasters are allocated, but the jobs then block each other indefinitely because every ApplicationMaster is waiting to allocate more containers and none are free.
Note that the problem is not limited to a tiny cluster like this one. Whenever jobs are submitted faster than they finish, the cluster may enter a vicious cycle in which more and more containers are locked up by ApplicationMasters.
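The scenario above can be reproduced with a toy model. The following is only a sketch in Python, not ResourceManager code: it assumes each NodeManager has room for exactly one container (so the cluster has 4 "slots"), that every job needs one ApplicationMaster container plus at least one task container, and that a task finishes within a single scheduling round. The function simulate and the parameter max_am_slots are hypothetical names invented for this illustration; the cap on ApplicationMaster slots is one possible way to prevent the deadlock, not necessarily the fix that should be adopted.

```python
from collections import deque


def simulate(num_slots, num_jobs, tasks_per_job=1, max_am_slots=None):
    """Toy scheduler model. Returns (finished_jobs, deadlocked).

    Assumptions (not taken from YARN itself): one container per slot,
    one AM container per job, tasks complete within one round.
    max_am_slots is a hypothetical cap on slots that AMs may hold.
    """
    if max_am_slots is None:
        max_am_slots = num_slots  # no cap: AMs may occupy every slot

    pending = deque(range(num_jobs))  # jobs still waiting for an AM
    running = {}                      # job id -> task containers still needed
    free = num_slots
    finished = 0

    while pending or running:
        # Step 1: start an AM for each pending job while a slot is free
        # and the AM cap has not been reached.
        while pending and free > 0 and len(running) < max_am_slots:
            running[pending.popleft()] = tasks_per_job
            free -= 1

        # Step 2: hand free slots to running AMs for one task each.
        progressed = False
        for job in list(running):
            if free == 0:
                break
            free -= 1            # task container allocated
            running[job] -= 1
            free += 1            # task finishes and releases its slot
            progressed = True
            if running[job] == 0:
                del running[job]
                free += 1        # AM exits and releases its slot
                finished += 1

        if not progressed:
            # Every slot is held by an AM that is waiting for a task
            # container (or nothing can be started): no job can advance.
            return finished, True

    return finished, False


if __name__ == "__main__":
    print(simulate(num_slots=4, num_jobs=10))                  # no AM cap
    print(simulate(num_slots=4, num_jobs=10, max_am_slots=3))  # capped
```

Under these assumptions, the uncapped run stalls with all 4 slots held by ApplicationMasters and no job finished, while reserving a single slot for task containers lets all 10 jobs drain. A real scheduler would have to express such a limit in terms of memory rather than slot counts.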
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira