Posted to issues@tez.apache.org by "Mike Liddell (JIRA)" <ji...@apache.org> on 2013/07/15 21:22:49 UTC

[jira] [Commented] (TEZ-287) Pass terminationCause information back to client-side (access via DAGClient.ApplicationReport)

    [ https://issues.apache.org/jira/browse/TEZ-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708850#comment-13708850 ] 

Mike Liddell commented on TEZ-287:
----------------------------------

This is a first step toward better post-mortem diagnostics.  The patch only passes the basic DAGTerminationCause and VertexTerminationCause back to the client side.  Note that this information is only available while the AppMaster is still running, so its usefulness is limited.

Another JIRA will be required to improve how much info is available via the RM before investing much further here.
                
> Pass terminationCause information back to client-side (access via DAGClient.ApplicationReport)
> ----------------------------------------------------------------------------------------------
>
>                 Key: TEZ-287
>                 URL: https://issues.apache.org/jira/browse/TEZ-287
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Mike Liddell
>            Assignee: Mike Liddell
>         Attachments: TEZ-287.1.patch
>
>
> Need to determine how much to send:
>  - DAG termination cause
>  - vertex termination cause
>  - termination cause for all tasks
> Perhaps logic such as:
>  - don't send task termination causes (too much volume with a high noise ratio)
>  - however, any Vertex with VertexTerminationCause=OWN_TASK_FAILURE could include a failingTaskCount and a single failed-task ID to aid post-mortem.  (It is probably of interest to the user whether a few tasks of many failed or all tasks failed.  When there are only a few failing tasks, it might be helpful to have an ID on hand to start the diagnosis.)
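
The selection logic proposed in the description could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in TEZ-287.1.patch: the class TerminationSummarySketch, the types VertexCause, TaskResult and VertexSummary, and the method summarize are all hypothetical names, and only the value OWN_TASK_FAILURE comes from the issue text.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; none of these types are actual Tez classes.
final class TerminationSummarySketch {

    // Illustrative causes. Only OWN_TASK_FAILURE is named in the issue text.
    enum VertexCause { OWN_TASK_FAILURE, OTHER }

    // Minimal stand-in for a finished task (assumed shape).
    static final class TaskResult {
        final String taskId;
        final boolean failed;

        TaskResult(String taskId, boolean failed) {
            this.taskId = taskId;
            this.failed = failed;
        }
    }

    // What would be reported to the client for one vertex.
    static final class VertexSummary {
        final VertexCause cause;
        final int totalTaskCount;
        final int failingTaskCount;
        final Optional<String> sampleFailedTaskId;

        VertexSummary(VertexCause cause, int totalTaskCount,
                      int failingTaskCount, Optional<String> sampleFailedTaskId) {
            this.cause = cause;
            this.totalTaskCount = totalTaskCount;
            this.failingTaskCount = failingTaskCount;
            this.sampleFailedTaskId = sampleFailedTaskId;
        }
    }

    // Per-task causes are never sent. Only for OWN_TASK_FAILURE do we add
    // a failing-task count and the ID of the first failed task encountered.
    static VertexSummary summarize(VertexCause cause, List<TaskResult> tasks) {
        if (cause != VertexCause.OWN_TASK_FAILURE) {
            // Task details are deliberately omitted for other causes.
            return new VertexSummary(cause, tasks.size(), 0, Optional.empty());
        }
        int failing = 0;
        String sample = null;
        for (TaskResult t : tasks) {
            if (t.failed) {
                failing++;
                if (sample == null) {
                    sample = t.taskId;
                }
            }
        }
        return new VertexSummary(cause, tasks.size(), failing,
                                 Optional.ofNullable(sample));
    }

    public static void main(String[] args) {
        List<TaskResult> tasks = Arrays.asList(
                new TaskResult("task_0", false),
                new TaskResult("task_1", true),
                new TaskResult("task_2", false));
        VertexSummary s = summarize(VertexCause.OWN_TASK_FAILURE, tasks);
        System.out.println(s.failingTaskCount + " of " + s.totalTaskCount
                + " tasks failed; sample: " + s.sampleFailedTaskId.orElse("none"));
    }
}
```

Reporting the total alongside the failing count lets a client distinguish "a few of many failed" from "all failed", and the single sample ID gives a starting point for diagnosis without sending every task's cause.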

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira