Posted to dev@oozie.apache.org by "Mona Chitnis (JIRA)" <ji...@apache.org> on 2014/07/16 21:18:08 UTC

[jira] [Created] (OOZIE-1938) Fork-join job does not execute join node sometimes during HA failover

Mona Chitnis created OOZIE-1938:
-----------------------------------

             Summary: Fork-join job does not execute join node sometimes during HA failover
                 Key: OOZIE-1938
                 URL: https://issues.apache.org/jira/browse/OOZIE-1938
             Project: Oozie
          Issue Type: Bug
          Components: HA
    Affects Versions: trunk
            Reporter: Mona Chitnis
             Fix For: trunk


Reported by Michelle Chiang (Yahoo Oozie QE)

Scenario: (2 Oozie HA servers)
21:38:56 submit job from Oozie client
21:41:42 shut down server1
21:46:52 shut down server2
21:47:30 start server1
22:15:05 start server2

The last fork path ended at 21:52:53.
22:36:48 the job is still RUNNING and has not moved to the join node.
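
To make the gaps in this timeline explicit, here is a small sketch that computes the intervals from the timestamps above. It assumes all times fall on the same day; the helper name seconds_between is made up for illustration and is not part of Oozie.

```python
from datetime import datetime

def seconds_between(start, end):
    """Return whole seconds from start to end, both 'HH:MM:SS' on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())

# Window in which both servers were down (server2 stopped -> server1 started).
both_down = seconds_between("21:46:52", "21:47:30")

# Time the job stayed RUNNING after the last fork path ended.
stuck_after_last_fork = seconds_between("21:52:53", "22:36:48")

# Time both servers had been up again when the job was last checked.
both_up_at_check = seconds_between("22:15:05", "22:36:48")

print(both_down, stuck_after_last_fork, both_up_at_check)
```

By this arithmetic both servers were down for 38 seconds, the last fork path ended while only server1 was running, and the join had still not run about 44 minutes later, of which more than 21 minutes were with both servers up.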

Digging into the logs, the locking part seems to work fine: forked action processing is distributed between the two servers when both are running, and it continues when one of them is down. The open question is why even RecoveryService fails to pick up the job after all the forks have completed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)