Posted to yarn-issues@hadoop.apache.org by "Sandy Ryza (JIRA)" <ji...@apache.org> on 2013/03/29 23:55:16 UTC

[jira] [Commented] (YARN-366) Add a tracing async dispatcher to simplify debugging

    [ https://issues.apache.org/jira/browse/YARN-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13617838#comment-13617838 ] 

Sandy Ryza commented on YARN-366:
---------------------------------

Thanks for the review, Vinod.  I uploaded a new patch that incorporates your comments.  The patch now also includes intermediate methods in the trace: if handler1.handle() calls childMethod1(), which calls childMethod2(), which dispatches an event, then handle(), childMethod1(), and childMethod2() will all appear in the trace for that event.
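
To illustrate the intermediate-method behavior, here is a minimal sketch of one way the frames between a handler and the dispatch call could be captured. It is not taken from the patch: the class name DispatchSiteTraceSketch, the dispatcherLoop() stand-in, and the choice to read the frames from a Throwable at dispatch time are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from the YARN-366 patch.
final class DispatchSiteTraceSketch {
    // Method names recorded at the most recent dispatch, innermost first.
    static final List<String> captured = new ArrayList<>();

    // Stand-in for the dispatcher thread's loop invoking a handler.
    static void dispatcherLoop() { handle(); }

    // The call chain from the comment: handle -> childMethod1 -> childMethod2 -> dispatch.
    static void handle() { childMethod1(); }
    static void childMethod1() { childMethod2(); }
    static void childMethod2() { dispatch(); }

    // Records every frame between this call and the dispatcher loop.
    static void dispatch() {
        StackTraceElement[] frames = new Throwable().getStackTrace();
        // frames[0] is dispatch() itself, so start at 1 and walk outward
        // until the frame of the (assumed) dispatcher loop is reached.
        for (int i = 1; i < frames.length; i++) {
            if (frames[i].getMethodName().equals("dispatcherLoop")) {
                break;
            }
            captured.add(frames[i].getMethodName());
        }
    }
}
```

Calling DispatchSiteTraceSketch.dispatcherLoop() should leave childMethod2, childMethod1, and handle in captured, in that order. A real dispatcher would attach these frames to the dispatched event rather than to a static list.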
                
> Add a tracing async dispatcher to simplify debugging
> ----------------------------------------------------
>
>                 Key: YARN-366
>                 URL: https://issues.apache.org/jira/browse/YARN-366
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager, resourcemanager
>    Affects Versions: 2.0.2-alpha
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>         Attachments: YARN-366-1.patch, YARN-366.patch
>
>
> Exceptions thrown in YARN/MR code with asynchronous event handling do not contain informative stack traces, as all handle() methods sit directly under the dispatcher thread's loop.
> This makes errors hard to debug for anyone not intimately familiar with the code, because it is difficult to see which chain of events caused a particular outcome.
> I propose adding an AsyncDispatcher that instruments events with tracing information.  Whenever an event is dispatched while another event is being handled, the dispatcher would annotate the new event with a pointer to its parent.  When the dispatcher catches an exception, it could then reconstruct a "stack" trace of the chain of events that led to it and log that chain.
> This would be an experimental feature, off by default, unless extensive testing showed that it did not have a significant performance impact.
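
The parent-pointer idea in the description can be sketched as follows. This is a simplified, single-threaded illustration under stated assumptions, not the YARN AsyncDispatcher API: TracingDispatcherSketch, TracedEvent, register(), dispatch(), drain(), and chain() are hypothetical names, and events are identified by plain strings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical sketch; a real dispatcher would run drain() on its own thread.
final class TracingDispatcherSketch {
    // An event annotated with the event that was being handled when it was dispatched.
    static final class TracedEvent {
        final String name;
        final TracedEvent parent; // null when dispatched outside any handler

        TracedEvent(String name, TracedEvent parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    private final Queue<TracedEvent> queue = new ArrayDeque<>();
    private final Map<String, Consumer<TracingDispatcherSketch>> handlers = new HashMap<>();
    private TracedEvent current; // event currently being handled, if any
    final List<String> log = new ArrayList<>();

    void register(String name, Consumer<TracingDispatcherSketch> handler) {
        handlers.put(name, handler);
    }

    // Enqueues an event, recording the in-flight event (if any) as its parent.
    void dispatch(String name) {
        queue.add(new TracedEvent(name, current));
    }

    // Handles queued events in order; on failure, logs the chain of parent events.
    void drain() {
        TracedEvent event;
        while ((event = queue.poll()) != null) {
            current = event;
            try {
                Consumer<TracingDispatcherSketch> handler = handlers.get(event.name);
                if (handler != null) {
                    handler.accept(this);
                }
            } catch (RuntimeException ex) {
                log.add(ex.getMessage() + " in chain: " + chain(event));
            } finally {
                current = null;
            }
        }
    }

    // Follows parent pointers from the failing event back to the root event.
    static String chain(TracedEvent event) {
        StringBuilder sb = new StringBuilder();
        for (TracedEvent t = event; t != null; t = t.parent) {
            if (sb.length() > 0) {
                sb.append(" <- ");
            }
            sb.append(t.name);
        }
        return sb.toString();
    }
}
```

If handler A dispatches B, B dispatches C, and C throws an exception with the message "boom", the log entry reads "boom in chain: C <- B <- A". The only extra cost per event in this sketch is one reference field, although a parent pointer also keeps the whole ancestor chain reachable until the last descendant has been handled.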

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira