Posted to yarn-issues@hadoop.apache.org by "Tao Jie (JIRA)" <ji...@apache.org> on 2017/03/03 07:57:45 UTC

[jira] [Commented] (YARN-6042) Dump scheduler and queue state information into FairScheduler DEBUG log

    [ https://issues.apache.org/jira/browse/YARN-6042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893880#comment-15893880 ] 

Tao Jie commented on YARN-6042:
-------------------------------

Hi [~yufeigu], dumping scheduler/queue state is very useful for detecting scheduling problems at run time. It seems to me that you are trying to write the scheduler/queue information to a log file. How about printing this information on the web UI, just as we can get the server stacks through a link?

> Dump scheduler and queue state information into FairScheduler DEBUG log
> -----------------------------------------------------------------------
>
>                 Key: YARN-6042
>                 URL: https://issues.apache.org/jira/browse/YARN-6042
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>            Reporter: Yufei Gu
>            Assignee: Yufei Gu
>         Attachments: YARN-6042.001.patch, YARN-6042.002.patch, YARN-6042.003.patch, YARN-6042.004.patch, YARN-6042.005.patch, YARN-6042.006.patch, YARN-6042.007.patch, YARN-6042.008.patch
>
>
> To make scheduler issues easier to debug, it would be a big improvement to be able to dump the scheduler state into a log on request.
> Dumping the scheduler state at a point in time would allow debugging of a scheduler that is not hung (deadlocked) but is also not assigning containers. Currently we do not have a proper overview of what state the scheduler and the queues are in, and we have to make assumptions or guess.
> The scheduler and queue state needed would include (not exhaustive):
> - instantaneous and steady fair share (app / queue)
> - AM share and resources
> - weight
> - app demand
> - application run state (runnable/non runnable)
> - last time at fair/min share
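
For illustration only, the kind of one-line-per-queue dump described in the list above might look roughly like the sketch below. Everything in it is an assumption: QueueStateDump, QueueState, AppState and all field names are hypothetical stand-ins, not the FairScheduler classes and not the format used by the attached patches. java.util.logging at Level.FINE stands in for the scheduler's DEBUG log so that the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical, self-contained model; these are NOT the FairScheduler classes.
public class QueueStateDump {
    private static final Logger LOG = Logger.getLogger(QueueStateDump.class.getName());

    /** Per-application state: demand and runnable/non-runnable flag. */
    static final class AppState {
        final String appId;
        final long demandMb;
        final boolean runnable;

        AppState(String appId, long demandMb, boolean runnable) {
            this.appId = appId;
            this.demandMb = demandMb;
            this.runnable = runnable;
        }
    }

    /** Per-queue state covering the items listed in the issue description. */
    static final class QueueState {
        final String name;
        final double weight;
        final long instFairShareMb;       // instantaneous fair share
        final long steadyFairShareMb;     // steady fair share
        final long amUsageMb;             // resources used by AMs
        final long maxAmShareMb;          // AM share limit
        final long lastTimeAtFairShareMs;
        final long lastTimeAtMinShareMs;
        final List<AppState> apps = new ArrayList<>();

        QueueState(String name, double weight, long instFairShareMb,
                   long steadyFairShareMb, long amUsageMb, long maxAmShareMb,
                   long lastTimeAtFairShareMs, long lastTimeAtMinShareMs) {
            this.name = name;
            this.weight = weight;
            this.instFairShareMb = instFairShareMb;
            this.steadyFairShareMb = steadyFairShareMb;
            this.amUsageMb = amUsageMb;
            this.maxAmShareMb = maxAmShareMb;
            this.lastTimeAtFairShareMs = lastTimeAtFairShareMs;
            this.lastTimeAtMinShareMs = lastTimeAtMinShareMs;
        }
    }

    /** Renders one queue and its apps as a single key=value line. */
    static String dump(QueueState q) {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format(Locale.ROOT,
            "queue=%s weight=%.1f instFairShareMb=%d steadyFairShareMb=%d"
                + " amUsageMb=%d maxAmShareMb=%d"
                + " lastTimeAtFairShareMs=%d lastTimeAtMinShareMs=%d",
            q.name, q.weight, q.instFairShareMb, q.steadyFairShareMb,
            q.amUsageMb, q.maxAmShareMb,
            q.lastTimeAtFairShareMs, q.lastTimeAtMinShareMs));
        for (AppState a : q.apps) {
            sb.append(String.format(Locale.ROOT,
                " app=%s demandMb=%d runnable=%b", a.appId, a.demandMb, a.runnable));
        }
        return sb.toString();
    }

    /** Builds the dump string only when the debug-level log is enabled. */
    static void logIfDebug(QueueState q) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(dump(q));
        }
    }

    public static void main(String[] args) {
        QueueState q = new QueueState("root.default", 1.0, 4096, 2048, 512, 1024, 1000L, 900L);
        q.apps.add(new AppState("app_1", 3072, true));
        logIfDebug(q);                 // silent unless FINE logging is enabled
        System.out.println(dump(q));   // always print, to show the format
    }
}
```

Guarding the string construction behind the log-level check keeps the dump cheap when debug logging is off. The same dump() result could also be served from a web UI link, as suggested in the comment above.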



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)