Posted to yarn-issues@hadoop.apache.org by "Varun Vasudev (JIRA)" <ji...@apache.org> on 2015/03/19 15:25:38 UTC

[jira] [Commented] (YARN-2901) Add errors and warning stats to RM, NM web UI

    [ https://issues.apache.org/jira/browse/YARN-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14369371#comment-14369371 ] 

Varun Vasudev commented on YARN-2901:
-------------------------------------

Some more information on the patch: it adds a new log appender that keeps a count of the errors and warnings logged, along with the time period in which they occurred. It also stores the last 500 unique error messages and the last 500 unique warning messages, together with the number of times each one occurred and the time when it occurred. Messages more than 24 hours old are purged from the store automatically. Right now the appender is added in code, since log4j.properties is shared across Hadoop, but if it's useful we could add it to log4j.properties itself.
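
To make the bookkeeping described above concrete, here is a minimal sketch of a bounded store for unique messages. It is not the code from the patch: the class name RecentMessageStore, its methods, and the eviction policy (drop the least recently seen message when the cap is exceeded) are all assumptions made for illustration. A log4j 1.x appender could call record() from its append(LoggingEvent) method, with one store for errors and one for warnings.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

/** Hypothetical store for unique log messages; not taken from the patch. */
final class RecentMessageStore {
    /** Occurrence count and last-seen time for one unique message. */
    static final class Stats {
        long count;
        long lastSeenMillis;
    }

    private final int capacity;       // e.g. 500 unique messages
    private final long maxAgeMillis;  // e.g. 24 hours
    // Insertion-ordered map; a message is re-inserted on every hit,
    // so iteration order runs from least to most recently seen.
    private final LinkedHashMap<String, Stats> entries = new LinkedHashMap<>();

    RecentMessageStore(int capacity, long maxAgeMillis) {
        this.capacity = capacity;
        this.maxAgeMillis = maxAgeMillis;
    }

    /** Records one occurrence of the message at the given time. */
    synchronized void record(String message, long nowMillis) {
        purge(nowMillis);
        Stats stats = entries.remove(message);
        if (stats == null) {
            stats = new Stats();
        }
        stats.count++;
        stats.lastSeenMillis = nowMillis;
        entries.put(message, stats); // moves the message to the newest position
        if (entries.size() > capacity) {
            // Assumed policy: evict the least recently seen message.
            Iterator<String> oldest = entries.keySet().iterator();
            oldest.next();
            oldest.remove();
        }
    }

    /** Drops every message last seen more than maxAgeMillis ago. */
    synchronized void purge(long nowMillis) {
        entries.values().removeIf(s -> nowMillis - s.lastSeenMillis > maxAgeMillis);
    }

    /** Returns how often the message was recorded, or 0 if it is not stored. */
    synchronized long countOf(String message) {
        Stats stats = entries.get(message);
        return stats == null ? 0 : stats.count;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

With this shape, the values mentioned above would correspond to new RecentMessageStore(500, 24L * 60 * 60 * 1000). Passing the timestamp in explicitly keeps the class easy to test without a real clock.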

> Add errors and warning stats to RM, NM web UI
> ---------------------------------------------
>
>                 Key: YARN-2901
>                 URL: https://issues.apache.org/jira/browse/YARN-2901
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager, resourcemanager
>            Reporter: Varun Vasudev
>            Assignee: Varun Vasudev
>         Attachments: Screen Shot 2015-03-19 at 7.40.02 PM.png, apache-yarn-2901.0.patch
>
>
> It would be really useful to have statistics on the number of errors and warnings in the RM and NM web UI. I'm thinking about:
> 1. The number of errors and warnings in the past 5 min/1 hour/12 hours/day
> 2. The top 'n' (20?) most common exceptions in the past 5 min/1 hour/12 hours/day
> By errors and warnings I'm referring to the log level.
> I suspect we can probably achieve this by writing a custom appender? (I'm open to suggestions on alternate mechanisms for implementing this.)
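
One possible way to serve the "past 5 min/1 hour/12 hours/day" counts from the issue description is to bucket events by minute and sum the buckets that fall inside the requested window. The sketch below only illustrates that idea; the class WindowedCounter, its method names, and the one-minute bucket size are assumptions, and the result is accurate only to bucket granularity.

```java
import java.util.TreeMap;

/** Hypothetical per-minute counter; names and bucket size are assumptions. */
final class WindowedCounter {
    private static final long BUCKET_MILLIS = 60_000L;
    // Maps a bucket index (timestamp / BUCKET_MILLIS) to the events seen in that minute.
    private final TreeMap<Long, Integer> buckets = new TreeMap<>();

    /** Counts one event (for example one ERROR log line) at the given time. */
    synchronized void increment(long nowMillis) {
        buckets.merge(Math.floorDiv(nowMillis, BUCKET_MILLIS), 1, Integer::sum);
    }

    /** Sums the events in buckets that overlap the last windowMillis. */
    synchronized long countSince(long nowMillis, long windowMillis) {
        long firstBucket = Math.floorDiv(nowMillis - windowMillis, BUCKET_MILLIS);
        long total = 0;
        for (int count : buckets.tailMap(firstBucket, true).values()) {
            total += count;
        }
        return total;
    }

    /** Discards buckets that lie entirely before the retention cutoff. */
    synchronized void purgeOlderThan(long nowMillis, long maxAgeMillis) {
        long firstKept = Math.floorDiv(nowMillis - maxAgeMillis, BUCKET_MILLIS);
        buckets.headMap(firstKept, false).clear();
    }
}
```

One instance per log level would be enough for the counts; the top 'n' exceptions would need a separate per-message tally.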



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)