Posted to yarn-issues@hadoop.apache.org by "Ahmed Hussein (Jira)" <ji...@apache.org> on 2020/11/05 17:10:00 UTC

[jira] [Resolved] (YARN-10483) YARN hangs and jobs cannot be submitted; only an RM failover or a restart restores service

     [ https://issues.apache.org/jira/browse/YARN-10483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ahmed Hussein resolved YARN-10483.
----------------------------------
    Release Note: Please create Jiras that are easy for other developers to search and understand.
      Resolution: Information Provided

> YARN hangs and jobs cannot be submitted; only an RM failover or a restart restores service
> ----------------------------------
>
>                 Key: YARN-10483
>                 URL: https://issues.apache.org/jira/browse/YARN-10483
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, capacityscheduler, resourcemanager, RM
>    Affects Versions: 3.1.1
>            Reporter: jufeng li
>            Priority: Blocker
>         Attachments: RM_normal_state.stack, RM_unnormal_state.stack
>
>
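> YARN hangs intermittently, and new jobs cannot be submitted while it is hung. The jstack output shows capacity scheduler threads waiting indefinitely for a lock, while the RM's CPU, memory, network, and disk are all normal. The problem is almost certainly in the capacity scheduler's internal locking. jstack dumps of the RM in the normal state and in the hung state are attached. I hope someone can fix this: the bug is severe and makes production unusable. If nobody responds, I will follow up later.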
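
The diagnosis described above (threads stuck waiting on a lock in a jstack dump) can be approximated with a small script. The following is a minimal sketch, not taken from the issue: the function name `waiters_by_lock` and the sample dump are hypothetical, and the regular expressions assume the usual HotSpot thread-dump layout (a quoted thread name on the header line, and lines such as `- parking to wait for  <0x...>`). It groups thread names by the lock address they are waiting on, so an address with many waiters stands out.

```python
import re
from collections import defaultdict

# Thread header lines start with the quoted thread name, e.g. "Thread-A" #10 prio=5 ...
THREAD_HEADER = re.compile(r'^"(?P<name>.+?)"')
# Wait lines name the contended object by address, e.g.
#   - parking to wait for  <0x00000000c0000001> (a java.util.concurrent.locks...)
WAIT_LINE = re.compile(
    r'-\s+(?:parking to wait for|waiting to lock|waiting on)\s+<(?P<addr>0x[0-9a-fA-F]+)>'
)


def waiters_by_lock(dump_text):
    """Map each lock address to the names of the threads waiting on it."""
    waiters = defaultdict(list)
    current = None  # name of the thread whose stack is being read
    for line in dump_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = header.group("name")
            continue
        wait = WAIT_LINE.search(line)
        if wait and current is not None:
            waiters[wait.group("addr")].append(current)
    return dict(waiters)


# Made-up sample in jstack style; it is NOT taken from the attached dumps.
SAMPLE = "\n".join([
    '"Thread-A" #10 prio=5 os_prio=0 tid=0x01 nid=0x1 waiting on condition [0x0]',
    '   java.lang.Thread.State: WAITING (parking)',
    '\tat sun.misc.Unsafe.park(Native Method)',
    '\t- parking to wait for  <0x00000000c0000001> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)',
    '',
    '"Thread-B" #11 prio=5 os_prio=0 tid=0x02 nid=0x2 waiting on condition [0x0]',
    '   java.lang.Thread.State: WAITING (parking)',
    '\t- parking to wait for  <0x00000000c0000001> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)',
    '',
    '"Thread-C" #12 prio=5 os_prio=0 tid=0x03 nid=0x3 runnable [0x0]',
    '   java.lang.Thread.State: RUNNABLE',
])

if __name__ == "__main__":
    for addr, names in waiters_by_lock(SAMPLE).items():
        print(addr, len(names), names)
```

Running the same function on a dump taken in the normal state and on one taken in the hung state, then comparing which addresses gain waiters, is one way to narrow down the contended lock. It does not by itself identify which thread holds that lock.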


