Posted to dev@oozie.apache.org by "Purshotam Shah (JIRA)" <ji...@apache.org> on 2014/05/29 03:12:02 UTC

[jira] [Commented] (OOZIE-1778) Rollback option for XCommand

    [ https://issues.apache.org/jira/browse/OOZIE-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011926#comment-14011926 ] 

Purshotam Shah commented on OOZIE-1778:
---------------------------------------

{quote}
Wouldn't StatusTransitService take care of this scenario, since the bundle's pending is set? There's also ActionCheckerService running periodically for this purpose. We should enhance it for bundle-action recovery; currently it only does wf-action and coord-action checks.
{quote}

I don't think StatusTransitService updates the bundle action's pending flag. It only resets the bundle's pending flag based on the pending status of its bundle actions.

ActionCheckerService checks the status of running Hadoop jobs and updates the action/wf status. The use case is different here.
Why wait 5 min and do the update/rollback as part of a service if we can do the same thing as part of the XCommand?
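
A rough sketch of the idea, under stated assumptions: BundleAction, Command and ChildCommand below are hypothetical stand-ins rather than Oozie's real classes, and handleFailure is the proposed hook, not an existing XCommand method. It only shows a child command clearing its parent's pending flag as soon as its own pre-check fails, instead of leaving that to a periodic service.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-ins for illustration only; these are not Oozie's real classes.
final class RollbackSketch {

    /** Parent-side state: a bundle action whose pending flag is set before the child runs. */
    static final class BundleAction {
        private boolean pending;
        void setPending() { pending = true; }
        void resetPending() { pending = false; }
        boolean isPending() { return pending; }
    }

    /** Simplified command template; handleFailure is the proposed (assumed) rollback hook. */
    static abstract class Command<T> implements Callable<T> {
        protected abstract void verifyPrecondition() throws Exception;
        protected abstract T execute() throws Exception;

        /** Default: nothing to roll back. */
        protected void handleFailure(Throwable cause) { }

        @Override
        public final T call() throws Exception {
            try {
                verifyPrecondition();
                return execute();
            }
            catch (Exception e) {
                handleFailure(e); // roll back right away, then report the failure
                throw e;
            }
        }
    }

    /** Child command that clears the parent's pending flag when it cannot run. */
    static final class ChildCommand extends Command<String> {
        private final BundleAction parent;
        private final boolean preconditionHolds;

        ChildCommand(BundleAction parent, boolean preconditionHolds) {
            this.parent = parent;
            this.preconditionHolds = preconditionHolds;
        }

        @Override
        protected void verifyPrecondition() {
            if (!preconditionHolds) {
                throw new IllegalStateException("precondition failed");
            }
        }

        @Override
        protected String execute() {
            // Simplification: on success the child's normal path updates the parent.
            parent.resetPending();
            return "done";
        }

        @Override
        protected void handleFailure(Throwable cause) {
            parent.resetPending();
        }
    }

    public static void main(String[] args) {
        BundleAction action = new BundleAction();
        action.setPending(); // the parent command marks the action pending
        try {
            new ChildCommand(action, false).call(); // the child fails its pre-check
        }
        catch (Exception e) {
            System.out.println("child failed: " + e.getMessage());
        }
        System.out.println("pending after failure: " + action.isPending());
    }
}
```

In this sketch the failure still propagates to the caller; the hook only guarantees that the parent is not left in pending.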

> Rollback option for XCommand
> ----------------------------
>
>                 Key: OOZIE-1778
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1778
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Purshotam Shah
>
> Currently, if we issue a command at the bundle level, it sets pending on the bundle action and issues a child command.
> If the child command succeeds, all is good. But if the child command fails at the pre-check or while acquiring the lock, there is no way to update the parent.
> In this scenario, the bundle action will remain in pending and cause unexpected behavior.
> We should do something like:
> {code:java}
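> // In XCommand (simplified)
> public final T call() throws CommandException {
>     T ret = null;
>     try {
>         eagerVerifyPrecondition();
>         acquireLockCron.start();
>         acquireLock();
>         acquireLockCron.stop();
>         loadState();
>         verifyPrecondition();
>         ret = execute();
>     }
>     catch (Throwable e) {
>         // Proposed hook: lets the command roll back state, e.g. the parent's pending flag
>         handleFailure();
>     }
>     return ret;
> }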
> {code}


