Posted to yarn-issues@hadoop.apache.org by "Szilard Nemeth (JIRA)" <ji...@apache.org> on 2018/11/19 09:33:00 UTC

[jira] [Created] (YARN-9035) Allow better troubleshooting of FS container assignments and lack of container assignments

Szilard Nemeth created YARN-9035:
------------------------------------

             Summary: Allow better troubleshooting of FS container assignments and lack of container assignments
                 Key: YARN-9035
                 URL: https://issues.apache.org/jira/browse/YARN-9035
             Project: Hadoop YARN
          Issue Type: Improvement
            Reporter: Szilard Nemeth
            Assignee: Szilard Nemeth


The call chain that starts at {{FairScheduler.attemptScheduling}}, goes through {{FSQueue}} (parent / leaf) {{assignContainer}} and ends in {{FSAppAttempt#assignContainer}} involves many calls and many conditions under which {{Resources.none()}} can be returned, meaning that no container is allocated.
Many of these empty assignments do not come with a debug log statement, so it is very hard to tell which condition led the {{FairScheduler}} to decide not to allocate a container.
On top of that, in many places it is just as difficult to tell why a container was allocated to an app attempt.

The goal is to have a common place (i.e. a class) that does all the logging, so users can conveniently control all the related logs if they are curious why container assignments did (or did not) happen.
Also, it would be handy if readers of the log could easily tell which app attempt a log record was created for. In other words: every log record should include the ID of the application / app attempt, if possible.
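
A minimal sketch of this goal, not the actual patch: one class owns the logger, and every record it builds starts with the app attempt ID. The class name {{AssignmentLog}} and its methods are hypothetical, and {{java.util.logging}} is used only to keep the example self-contained ({{Level.FINE}} stands in for DEBUG).

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical central place for container-assignment log records. */
final class AssignmentLog {
  // A single logger, so one logger setting controls every assignment message.
  static final Logger LOG = Logger.getLogger(AssignmentLog.class.getName());

  private AssignmentLog() {
  }

  /** Builds the record text; the app attempt ID always comes first. */
  static String format(String appAttemptId, String message) {
    return "[" + appAttemptId + "] " + message;
  }

  /** Records why no container was assigned, if debug-level logging is on. */
  static void notAssigned(String appAttemptId, String reason) {
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine(format(appAttemptId, "No container assigned: " + reason));
    }
  }

  /** Records why a container was assigned, if debug-level logging is on. */
  static void assigned(String appAttemptId, String reason) {
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine(format(appAttemptId, "Container assigned: " + reason));
    }
  }
}
```

With a layout like this, filtering the log for a single app attempt ID gives the sequence of decisions taken for that attempt.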

 

Details of implementation: 
As most of the already in-place debug messages were protected by a condition that checks whether the debug level is enabled on loggers, I followed a similar pattern. All the relevant log messages are created with the class {{ResourceAssignment}}. 
This class is a wrapper for the assigned {{Resource}} object and has a single logger, so clients should use its helper methods to create log records. There is a helper method called {{shouldLogReservationActivity}} that checks if DEBUG or TRACE level is activated on the logger.
See the javadoc on this class for further information.
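
The wrapper idea described above could look roughly like the following. This is an illustrative sketch under stated assumptions, not the code of the patch: the class name {{ResourceAssignmentSketch}}, its factory methods and the memory / vcores fields (a stand-in for {{Resource}}) are hypothetical. Only the helper name {{shouldLogReservationActivity}} comes from the description, and here it is mapped to {{java.util.logging}}, where {{Level.FINE}} and anything more verbose play the role of DEBUG and TRACE.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical wrapper around an assigned resource with a single logger. */
final class ResourceAssignmentSketch {
  static final Logger LOG =
      Logger.getLogger(ResourceAssignmentSketch.class.getName());

  // Stand-in for the wrapped Resource object.
  private final long memoryMb;
  private final int vcores;

  private ResourceAssignmentSketch(long memoryMb, int vcores) {
    this.memoryMb = memoryMb;
    this.vcores = vcores;
  }

  /** True if FINE (debug) or a more verbose level is enabled. */
  static boolean shouldLogReservationActivity() {
    return LOG.isLoggable(Level.FINE);
  }

  /** Empty assignment; the reason is logged with the app attempt ID. */
  static ResourceAssignmentSketch none(String appAttemptId, String reason) {
    if (shouldLogReservationActivity()) {
      LOG.fine("[" + appAttemptId + "] Not assigning container: " + reason);
    }
    return new ResourceAssignmentSketch(0, 0);
  }

  /** Non-empty assignment; also logged with the app attempt ID. */
  static ResourceAssignmentSketch assigned(String appAttemptId,
      long memoryMb, int vcores) {
    if (shouldLogReservationActivity()) {
      LOG.fine("[" + appAttemptId + "] Assigned container: memory="
          + memoryMb + " MB, vcores=" + vcores);
    }
    return new ResourceAssignmentSketch(memoryMb, vcores);
  }

  boolean isNone() {
    return memoryMb == 0 && vcores == 0;
  }

  long getMemoryMb() {
    return memoryMb;
  }

  int getVcores() {
    return vcores;
  }
}
```

Because callers can only create an assignment through the factory methods, an empty assignment cannot be returned without passing a reason that ends up in the log.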

{{ResourceAssignment}} is also responsible for adding the app / app attempt ID to every log record (with some exceptions).
A couple of check classes are introduced: they run the checks that a successful container allocation depends on and store the results.
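
To illustrate what such a check class might look like, here is a hedged sketch. The class name {{AllocationChecksSketch}}, its methods and the check names in the usage note are hypothetical and are not taken from the patch.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.BooleanSupplier;

/** Hypothetical holder that runs allocation preconditions and keeps results. */
final class AllocationChecksSketch {
  // Insertion order is kept so results read in the order the checks ran.
  private final Map<String, Boolean> results = new LinkedHashMap<>();

  /** Runs one named check immediately and stores its outcome. */
  AllocationChecksSketch run(String name, BooleanSupplier check) {
    results.put(name, check.getAsBoolean());
    return this;
  }

  /** True only if no stored check failed. */
  boolean allPassed() {
    return !results.containsValue(Boolean.FALSE);
  }

  /** Name of the first failed check, useful as the logged reason. */
  Optional<String> firstFailure() {
    return results.entrySet().stream()
        .filter(entry -> !entry.getValue())
        .map(Map.Entry::getKey)
        .findFirst();
  }

  /** Read-only view of all stored results. */
  Map<String, Boolean> getResults() {
    return Collections.unmodifiableMap(results);
  }
}
```

A caller could chain checks such as {{run("queueWithinLimit", ...)}} and {{run("nodeHasCapacity", ...)}}, then pass {{firstFailure()}} to the logging class as the reason for an empty assignment.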



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)