Posted to dev@oozie.apache.org by "Robert Kanter (JIRA)" <ji...@apache.org> on 2014/05/31 00:38:02 UTC

[jira] [Commented] (OOZIE-1864) Improve chid job id aggregation logic

    [ https://issues.apache.org/jira/browse/OOZIE-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014311#comment-14014311 ] 

Robert Kanter commented on OOZIE-1864:
--------------------------------------

Another way to get the child jobs is to use the YARN application tag support added in OOZIE-1722.  The Oozie server can simply ask YARN for the jobs carrying the tag that the Oozie server already knows.  However, this only works with Hadoop 2.4.0 and later.
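
As a rough illustration of the tag-based lookup, the sketch below builds a query against the ResourceManager REST endpoint /ws/v1/cluster/apps and pulls the application IDs out of a response body. It assumes the applicationTags query parameter is available in the Hadoop version in use. The host, port, tag value, and sample response are placeholders written for this example, and the helper names are hypothetical; this is not how the Oozie server itself does it.

```python
import json
from urllib.parse import urlencode


def build_apps_by_tag_url(rm_base_url, tag):
    """Build a ResourceManager REST query for applications carrying a tag."""
    query = urlencode({"applicationTags": tag})
    return f"{rm_base_url.rstrip('/')}/ws/v1/cluster/apps?{query}"


def parse_app_ids(response_body):
    """Extract application IDs from a cluster-apps JSON response."""
    payload = json.loads(response_body)
    apps = payload.get("apps") or {}  # "apps" is null when nothing matches
    return [app["id"] for app in apps.get("app", [])]


# Placeholder values, not taken from a real cluster.
url = build_apps_by_tag_url("http://rm.example.com:8088", "oozie-action-tag")
sample = '{"apps": {"app": [{"id": "application_1400000000000_0001", "state": "RUNNING"}]}}'
child_ids = parse_app_ids(sample)
```

The appeal of this approach is that the server does not depend on the launcher having written anything before it died; the tag alone is enough to find the children.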

> Improve chid job id aggregation logic
> -------------------------------------
>
>                 Key: OOZIE-1864
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1864
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Purshotam Shah
>
> Improve child job ID aggregation logic
> Current child job ID aggregation logic:
> Once the launcher job finishes submitting the child job (or jobs, in the case of Pig), it writes the job IDs to a file.
> On the Oozie server side, we collect the child IDs in two ways:
> 1. As soon as we submit the launcher job, we check whether it has terminated. If it has, we read the child IDs from the file and populate the DB.  Once a kill command is issued, we kill all child jobs.
> 2. We have a timer task (ActionCheckerService) that keeps checking the status of all running actions; if the launcher job has terminated, it updates the DB with the child IDs.
> A job end notification is rejected if the action is not running.
> Assume that the launcher is killed after it has submitted a child job.
> The child job will never be killed.
> To fix this, we should do the following:
> 1. If Oozie receives a job end notification and the launcher job was killed, collect all child jobs and kill any that are not already killed.
> 2. Have a better way to collect child job IDs. The launcher job can call callbackServlet (maybe periodically) to update the child job IDs. This could be useful for Pig jobs. In the current scenario, we report child jobs only when the launcher job completes.
>  
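
The two proposed fixes in the description can be sketched with a small in-memory model. Everything here (Action, FakeCluster, report_children, on_job_end) is a hypothetical stand-in made up for illustration, not Oozie code; it only shows that if the launcher reports child IDs while it is still running, a later job end notification for a killed launcher has enough information to clean up the children.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """Hypothetical stand-in for an Oozie action record."""
    launcher_id: str
    child_ids: set = field(default_factory=set)


class FakeCluster:
    """Hypothetical in-memory stand-in for the Hadoop cluster."""

    def __init__(self):
        self.state = {}  # job ID -> "RUNNING" or "KILLED"

    def kill(self, job_id):
        self.state[job_id] = "KILLED"


def report_children(action, child_ids):
    """Proposal 2: the launcher reports child IDs while it is still running."""
    action.child_ids.update(child_ids)


def on_job_end(action, cluster):
    """Proposal 1: if the launcher was killed, kill any child still alive."""
    if cluster.state.get(action.launcher_id) != "KILLED":
        return []
    orphans = sorted(c for c in action.child_ids
                     if cluster.state.get(c) != "KILLED")
    for child in orphans:
        cluster.kill(child)
    return orphans


cluster = FakeCluster()
cluster.state.update(
    {"launcher_1": "RUNNING", "child_1": "RUNNING", "child_2": "RUNNING"})
action = Action("launcher_1")
report_children(action, ["child_1", "child_2"])  # periodic callback
cluster.kill("launcher_1")                       # launcher dies mid-flight
killed = on_job_end(action, cluster)
```

Without the report_children call, child_ids would still be empty when the launcher is killed, which is the orphaned-child scenario the description is about.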


