Posted to yarn-issues@hadoop.apache.org by "tianjuan (JIRA)" <ji...@apache.org> on 2018/06/30 05:12:00 UTC
[jira] [Comment Edited] (YARN-8193) YARN RM hangs abruptly (stops
allocating resources) when running successive applications.
[ https://issues.apache.org/jira/browse/YARN-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16528549#comment-16528549 ]
tianjuan edited comment on YARN-8193 at 6/30/18 5:11 AM:
---------------------------------------------------------
attaching patch for 2.9.0
was (Author: jutia):
attaching patching for 2.9.0
> YARN RM hangs abruptly (stops allocating resources) when running successive applications.
> -----------------------------------------------------------------------------------------
>
> Key: YARN-8193
> URL: https://issues.apache.org/jira/browse/YARN-8193
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Reporter: Zian Chen
> Assignee: Zian Chen
> Priority: Critical
> Fix For: 2.9.0, 3.2.0, 3.1.1
>
> Attachments: YARN-8193-branch-2.9.0-001.patch, YARN-8193.001.patch, YARN-8193.002.patch
>
>
> When running massive queries successively, at some point the RM hangs and stops allocating resources. At the point where the RM hangs, YARN throws a NullPointerException at RegularContainerAllocator.getLocalityWaitFactor.
> There is sufficient space given to yarn.nodemanager.local-dirs, so this is not a node health issue (the RM did not report any node as unhealthy). There is no fixed trigger for the hang (no specific query or operation).
> The problem goes away after restarting the ResourceManager. No NM restart is required.
>
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)