Posted to yarn-issues@hadoop.apache.org by "Yuqi Wang (JIRA)" <ji...@apache.org> on 2018/03/08 05:42:00 UTC
[jira] [Created] (YARN-8012) Support Unmanaged Container Cleanup
Yuqi Wang created YARN-8012:
-------------------------------
Summary: Support Unmanaged Container Cleanup
Key: YARN-8012
URL: https://issues.apache.org/jira/browse/YARN-8012
Project: Hadoop YARN
Issue Type: New Feature
Components: nodemanager
Affects Versions: 2.7.1
Reporter: Yuqi Wang
Assignee: Yuqi Wang
Fix For: 2.7.1
An *unmanaged container* is a container which is no longer managed by NM. Thus, it cannot be managed by YARN either.
*There are many cases in which a YARN-managed container can become unmanaged, such as:*
# For container resources managed by YARN, such as the container job object and disk data:
** NM service is disabled or removed on the node.
** NM is unable to start up again on the node, for example because configuration or resources it depends on cannot be made ready.
** NM local leveldb store is corrupted or lost, for example due to bad disk sectors.
** NM has bugs, such as wrongly marking a live container as complete.
# For container resources not managed by YARN:
** User breaks processes away from the container job object.
** User creates VMs from the container job object.
** User acquires other resources on the machine that are not managed by YARN, such as producing data outside the container folder.
*Bad impacts of unmanaged containers include:*
# Resources cannot be managed for YARN and the node:
** YARN and node resources leak.
** The container cannot be killed to release YARN resources on the node.
# Container and App killing is not eventually consistent for the user:
** A buggy App can still produce bad external impacts long after the App has been killed.
*Initial patch for review:*
In the initial patch, the unmanaged container cleanup feature on Windows can only clean up the container job object of the unmanaged container. Cleanup of more container resources will be supported, and unit tests will be added once the design is agreed.
The current container will be considered unmanaged when:
# NM is dead:
** The check of whether the container is managed by NM did not succeed within the timeout.
# NM is alive but the container is org.apache.hadoop.yarn.api.records.ContainerState#COMPLETE or not found:
** The container is org.apache.hadoop.yarn.api.records.ContainerState#COMPLETE or not found in the NM container list.
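The two conditions above can be sketched as a single check. This is only an illustration of the decision rule, not the code in the patch: the State enum is a local stand-in for org.apache.hadoop.yarn.api.records.ContainerState, and NmClient, queryState, and isUnmanaged are hypothetical names introduced here. It assumes the NM can be asked for a container's state and that an absent result means the container is not in the NM container list.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class UnmanagedContainerCheck {

    /** Local stand-in for org.apache.hadoop.yarn.api.records.ContainerState. */
    enum State { NEW, RUNNING, COMPLETE }

    /** Hypothetical view of the NM container list; empty means "not found". */
    interface NmClient {
        Optional<State> queryState(String containerId) throws Exception;
    }

    /**
     * Returns true when the container should be considered unmanaged:
     * either the NM could not be queried within the timeout (NM treated as
     * dead), or the NM answered and the container is COMPLETE or not found.
     */
    static boolean isUnmanaged(NmClient nm, String containerId, long timeoutMs) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Optional<State>> task = () -> nm.queryState(containerId);
            Future<Optional<State>> pending = pool.submit(task);
            Optional<State> state = pending.get(timeoutMs, TimeUnit.MILLISECONDS);
            // NM is alive: unmanaged if not found or already COMPLETE.
            return !state.isPresent() || state.get() == State.COMPLETE;
        } catch (InterruptedException e) {
            // Preserve the interrupt flag; the check did not finish in time.
            Thread.currentThread().interrupt();
            return true;
        } catch (Exception e) {
            // Timeout or query failure: the check failed, so NM is treated as dead.
            return true;
        } finally {
            // Interrupts a query that is still running after the timeout.
            pool.shutdownNow();
        }
    }
}
```

In this sketch a container that the NM reports as RUNNING is the only case that counts as managed. Every failure to get an answer is treated as unmanaged, which matches the "NM is dead" rule but means the timeout must be chosen conservatively.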