Posted to dev@oozie.apache.org by "Srikanth Sundarrajan (JIRA)" <ji...@apache.org> on 2015/09/25 08:11:04 UTC

[jira] [Commented] (OOZIE-2258) Introducing a new counter in the instrumentation log to distinguish between the reasons for launcher failure

    [ https://issues.apache.org/jira/browse/OOZIE-2258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907653#comment-14907653 ] 

Srikanth Sundarrajan commented on OOZIE-2258:
---------------------------------------------

{code}
@@ -1411,12 +1414,20 @@ public class JavaActionExecutor extends ActionExecutor {
                             if (exMsg != null) {
                                 LOG.warn("Launcher exception: {0}{E}{1}", exMsg, exStackTrace);
                             }
+                            else {
+                                childJobKill = true;
+                            }
{code}

I am not sure this change is in the right place. Would it be possible to add a test?

A more fundamental question: how do we intend to use this?

> Introducing a new counter in the instrumentation log to distinguish between the reasons for launcher failure
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-2258
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2258
>             Project: Oozie
>          Issue Type: Improvement
>            Reporter: Narayan Periwal
>            Assignee: Narayan Periwal
>         Attachments: OOZIE-2258-v0.patch, OOZIE-2258-v1.patch
>
>
> Whether the launcher job fails because of a child job failure or because of an exception in the launcher job itself, in both cases the "counters:jobs:killed" counter is updated in the instrumentation log. Hence, we cannot tell whether the launcher failure was caused by a child job failing. So, we can introduce a new counter "kill" under the group "childjobs" that will help us distinguish launcher failures caused by child job failures.
> Let me know if there is already any other way by which we can distinguish this.
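
For illustration only, the counting scheme described in the issue could be sketched as below. This is a stand-alone sketch, not Oozie code: the class `LauncherFailureCounters` and its methods are hypothetical, and it does not use Oozie's instrumentation classes. It assumes, as the diff above suggests, that a missing launcher exception message (`exMsg == null`) means the failure came from the child job. The group and counter names ("jobs"/"killed", "childjobs"/"kill") are taken from the issue description.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for an instrumentation counter store; not an Oozie class.
public class LauncherFailureCounters {
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

    // Increment the counter identified by group and name.
    public void incr(String group, String name) {
        counters.computeIfAbsent(group + ":" + name, k -> new AtomicLong()).incrementAndGet();
    }

    // Read a counter; counters that were never incremented read as 0.
    public long get(String group, String name) {
        AtomicLong c = counters.get(group + ":" + name);
        return c == null ? 0L : c.get();
    }

    // Record one launcher failure. "jobs:killed" is always updated;
    // "childjobs:kill" is updated only when there is no launcher exception
    // message (assumption: that case indicates a child job failure).
    public void recordLauncherFailure(String exMsg) {
        incr("jobs", "killed");
        if (exMsg == null) {
            incr("childjobs", "kill");
        }
    }

    public static void main(String[] args) {
        LauncherFailureCounters c = new LauncherFailureCounters();
        c.recordLauncherFailure("launcher threw an exception"); // launcher's own failure
        c.recordLauncherFailure(null);                          // child job failure
        System.out.println("jobs:killed = " + c.get("jobs", "killed"));
        System.out.println("childjobs:kill = " + c.get("childjobs", "kill"));
    }
}
```

With counters kept this way, the number of launcher failures not attributed to child jobs would be the difference between the two values.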



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)