Posted to dev@oozie.apache.org by "Purshotam Shah (JIRA)" <ji...@apache.org> on 2014/05/30 23:13:02 UTC

[jira] [Updated] (OOZIE-1864) Improve child job id aggregation logic

     [ https://issues.apache.org/jira/browse/OOZIE-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Purshotam Shah updated OOZIE-1864:
----------------------------------

    Description: 
Current child job ID aggregation logic:
Once the launcher job finishes submitting its child job (or jobs, in the case of
Pig), it writes the child job IDs to a file.

On the Oozie server side, we collect child IDs in two ways:
1. As soon as we submit a launcher job, we check whether it has terminated.
If it has, we read the child IDs from the file and store them in the DB. Once
a kill command is issued, we kill all child jobs.
2. We have a timer task (ActionCheckerService) that keeps checking the
status of all running actions; if a launcher job has terminated, it updates
the DB with the child IDs.

A job end notification is rejected if the action is not running.

Assume that the launcher is killed after it has submitted a child job.
The child job will never be killed.
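
The failure can be modelled in a few lines of Python. This is only a sketch of one reading of the scenario, not Oozie code: the Action class, the function names, the status strings and the dict standing in for the cluster are all hypothetical, and the ordering (kill issued first, notification arriving afterwards) is an assumption about how the action ends up not running.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """Hypothetical stand-in for an action row in the Oozie DB."""
    status: str = "RUNNING"
    child_ids: list = field(default_factory=list)


def kill_action(action, cluster):
    """Kill only the child jobs whose IDs have already reached the DB."""
    for job_id in action.child_ids:
        cluster[job_id] = "KILLED"
    action.status = "KILLED"


def on_job_end_notification(action, child_id_file):
    """Model of the current rule: reject unless the action is still RUNNING."""
    if action.status != "RUNNING":
        return False  # rejected, so the IDs in the file are never read
    action.child_ids = list(child_id_file)
    return True


def simulate_orphaned_child():
    """Launcher submitted job_1, wrote its ID to the file, then was killed."""
    action = Action()
    cluster = {"job_1": "RUNNING"}   # hypothetical cluster state
    child_id_file = ["job_1"]        # written by the launcher, not yet in the DB
    kill_action(action, cluster)     # DB has no child IDs, nothing is killed
    accepted = on_job_end_notification(action, child_id_file)
    return accepted, cluster["job_1"]
```

Under these assumptions the notification is rejected and job_1 is left running, which is the orphaned child job described above.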


To fix this, we should do the following:

1. If Oozie receives a job end notification and the launcher job was killed,
collect all child jobs and kill any that are not already killed.
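
A minimal Python sketch of this first proposal follows. The function name, the status strings and the dict standing in for the cluster are hypothetical. Skipping jobs that have already succeeded or failed is an added assumption; the proposal itself only says to skip jobs that are already killed.

```python
# Assumption: jobs in any of these states no longer need to be killed.
TERMINAL_STATES = {"KILLED", "SUCCEEDED", "FAILED"}


def handle_job_end(launcher_status, child_id_file, cluster):
    """Return the child job IDs killed in response to a job end notification.

    launcher_status: final state of the launcher job (hypothetical string).
    child_id_file:   child job IDs the launcher wrote before it died.
    cluster:         dict mapping job ID to state, standing in for the cluster.
    """
    if launcher_status != "KILLED":
        return []  # normal completion is handled by the existing path
    killed = []
    for job_id in child_id_file:
        if cluster.get(job_id) not in TERMINAL_STATES:
            cluster[job_id] = "KILLED"
            killed.append(job_id)
    return killed
```

The key difference from the current behaviour is that the child IDs are read and acted on even though the action itself is no longer running.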

2. Have a better way to collect child job IDs. The launcher job can call
callbackServlet (perhaps periodically) to update the child job IDs. This could
be useful for Pig jobs. Currently we report child jobs only when the launcher
job completes.
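
The second proposal could look roughly like the Python sketch below. ChildIdReporter and its send callback are hypothetical names; in Oozie the callback would be an HTTP call to the callback servlet, which is not modelled here. The launcher would call report() on a timer or after each submission, and only IDs not yet sent go out.

```python
from typing import Callable, List


class ChildIdReporter:
    """Hypothetical launcher-side helper that reports child job IDs incrementally."""

    def __init__(self, send: Callable[[List[str]], None]):
        self._send = send        # stand-in for the call to the callback servlet
        self._reported = set()   # IDs the server has already been told about

    def report(self, submitted: List[str]) -> List[str]:
        """Send any IDs in `submitted` that have not been reported yet."""
        new_ids = []
        for job_id in submitted:
            if job_id not in self._reported and job_id not in new_ids:
                new_ids.append(job_id)
        if new_ids:
            self._send(new_ids)
            self._reported.update(new_ids)
        return new_ids
```

With a Pig script that launches several jobs over time, each call reports only the jobs submitted since the previous call, so the server learns about child jobs before the launcher completes.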

> Improve child job id aggregation logic
> --------------------------------------
>
>                 Key: OOZIE-1864
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1864
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Purshotam Shah



--
This message was sent by Atlassian JIRA
(v6.2#6252)