Posted to dev@oozie.apache.org by "Shwetha G S (JIRA)" <ji...@apache.org> on 2014/01/16 11:09:21 UTC

[jira] [Commented] (OOZIE-885) A race condition can cause the workflow/coordinator to run even after the bundle job is killed

    [ https://issues.apache.org/jira/browse/OOZIE-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13873217#comment-13873217 ] 

Shwetha G S commented on OOZIE-885:
-----------------------------------

[~virag], I'm wondering why this check was added in RecoveryService.runBundleRecovery():
{noformat}
                    if (baction.getCoordId() == null) {
                        log.error("CoordId is null for Bundle action " + baction.getBundleActionId());
                        continue;
                    }
{noformat}
BundleStartXCommand() creates a row in bundle action (with a null coord id) and queues CoordSubmitXCommand. If CoordSubmitXCommand is lost, the recovery service should pick up this bundle action and queue CoordSubmitXCommand again. But this if condition skips the bundle action when its coord id is null. How does recovery of a bundle action work with this check in place? What am I missing here?
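
To make the concern concrete, here is a minimal, self-contained sketch of the scenario. It is not Oozie code: BundleRecoverySketch, BundleAction, recoverable() and the ids are hypothetical stand-ins, and the only thing taken from the snippet above is the null check on the coord id. It assumes the recovery pass simply walks the pending bundle actions and requeues a submit for each one it does not skip.

```java
import java.util.ArrayList;
import java.util.List;

public class BundleRecoverySketch {

    /** Hypothetical stand-in for a bundle action row; not Oozie's real bean. */
    static final class BundleAction {
        final String bundleActionId;
        final String coordId; // stays null until the coordinator submit succeeds

        BundleAction(String bundleActionId, String coordId) {
            this.bundleActionId = bundleActionId;
            this.coordId = coordId;
        }
    }

    /**
     * Returns the ids of the pending actions a recovery pass would act on.
     * With skipNullCoordId set, actions whose coord id is null are skipped,
     * mirroring the check quoted above.
     */
    static List<String> recoverable(List<BundleAction> pending, boolean skipNullCoordId) {
        List<String> picked = new ArrayList<>();
        for (BundleAction action : pending) {
            if (skipNullCoordId && action.coordId == null) {
                continue; // the lost submit is never requeued for this action
            }
            picked.add(action.bundleActionId);
        }
        return picked;
    }

    public static void main(String[] args) {
        List<BundleAction> pending = new ArrayList<>();
        // Row created at bundle start; the coordinator submit was lost.
        pending.add(new BundleAction("bundle-1@coordA", null));
        // Row whose coordinator was submitted, so a coord id is present.
        pending.add(new BundleAction("bundle-1@coordB", "coord-B"));

        System.out.println(recoverable(pending, true));  // [bundle-1@coordB]
        System.out.println(recoverable(pending, false)); // [bundle-1@coordA, bundle-1@coordB]
    }
}
```

Under these assumptions, the action whose submit was lost is exactly the one the check filters out, which is the case I would expect recovery to handle.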

> A race condition can cause the workflow/coordinator to run even after the bundle job is killed
> ----------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-885
>                 URL: https://issues.apache.org/jira/browse/OOZIE-885
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Virag Kothari
>            Assignee: Virag Kothari
>             Fix For: 3.3.0
>
>         Attachments: OOZIE-885-v2.patch, OOZIE-885.patch
>
>
> Steps to reproduce:
> 1) Start the bundle job with a bunch of coordinators
> 2) Immediately kill it
> Observation:
> Some coordinators still keep on running
> Reason:
> Bundle cannot kill a coordinator unless a coord-id is associated to it.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)