Posted to yarn-issues@hadoop.apache.org by "Weiwei Yang (Jira)" <ji...@apache.org> on 2019/08/20 16:09:00 UTC
[jira] [Comment Edited] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

    [ https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911501#comment-16911501 ] 

Weiwei Yang edited comment on YARN-8995 at 8/20/19 4:08 PM:
------------------------------------------------------------

Hi [~zhuqi]/[~Tao Yang]

Thanks for working on this. The patch LGTM. I might be a little picky about the configuration name, but right now it is not straightforward to me.

"The interval of queue size (in thousands) for printing the boom queue event type details."

How about something like the following for the description, if I understand this correctly:

"The threshold used to trigger the logging of event types and counts in RM's main event dispatcher. The default value is 5000, which means RM will print events info each time the queue size cumulatively reaches 5000. Such info can be used to reveal which kinds of events RM is mostly stuck processing, which can help to narrow down certain performance issues."

Also, the config name would be better as something like {{yarn.dispatcher.print-events-info.threshold}}. You don't need to express the value in thousands here, as a number of several thousand is still human-readable.
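
To make the intended behaviour concrete, here is a minimal sketch of how such a threshold could drive the logging. It is not the code from the patch: the class name, the three event types, and the reading of "cumulatively reaches 5000" as "queue size hits a non-zero multiple of the threshold" are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; names and logic are illustrative, not taken from the YARN-8995 patch.
class EventQueueInfoSketch {
  // Default taken from the proposed description (5000 events, not "in thousands").
  static final int DEFAULT_THRESHOLD = 5000;

  // Stand-in event types; a real dispatcher handles many more.
  enum EventType { NODE_UPDATE, APP_ADDED, CONTAINER_EXPIRED }

  /** Assumed trigger: true when the queue size is a non-zero multiple of the threshold. */
  static boolean shouldPrint(int queueSize, int threshold) {
    return queueSize > 0 && queueSize % threshold == 0;
  }

  /** Counts queued events per type; TreeMap keeps the output order stable. */
  static Map<String, Long> countByType(Iterable<EventType> queue) {
    Map<String, Long> counts = new TreeMap<>();
    for (EventType e : queue) {
      counts.merge(e.name(), 1L, Long::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    BlockingQueue<EventType> queue = new LinkedBlockingQueue<>();
    int threshold = 4; // tiny value so the demo triggers quickly
    EventType[] incoming = {
      EventType.NODE_UPDATE, EventType.NODE_UPDATE,
      EventType.APP_ADDED, EventType.NODE_UPDATE
    };
    for (EventType e : incoming) {
      queue.add(e);
      if (shouldPrint(queue.size(), threshold)) {
        // A real dispatcher would use its logger instead of stdout.
        System.out.println("Event queue size " + queue.size()
            + ", event type counts: " + countByType(queue));
      }
    }
  }
}
```

With a plain event count as the value, an operator sets the threshold directly (for example 5000) without having to convert from thousands.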

Does that make sense?

Thanks



> Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics. 
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-8995
>                 URL: https://issues.apache.org/jira/browse/YARN-8995
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: metrics, nodemanager, resourcemanager
>    Affects Versions: 3.2.0, 3.3.0
>            Reporter: zhuqi
>            Assignee: zhuqi
>            Priority: Major
>         Attachments: TestStreamPerf.java, YARN-8995.001.patch, YARN-8995.002.patch, YARN-8995.003.patch, YARN-8995.004.patch, YARN-8995.005.patch, YARN-8995.006.patch, YARN-8995.007.patch, YARN-8995.008.patch
>
>
> In our growing cluster, there are unexpected situations in which some event queues back up and hurt the performance of the cluster, such as the bug in https://issues.apache.org/jira/browse/YARN-5262 . I think it's necessary to log the event types when the event queue size grows too big, and add the information to the metrics. The queue size threshold should be a parameter that can be changed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
