Posted to dev@oozie.apache.org by "Virag Kothari (JIRA)" <ji...@apache.org> on 2012/09/07 21:57:07 UTC

[jira] [Created] (OOZIE-989) Testcases failing intermittently where coordinator jobs are in catchup mode

Virag Kothari created OOZIE-989:
-----------------------------------

             Summary: Testcases failing intermittently where coordinator jobs are in catchup mode
                 Key: OOZIE-989
                 URL: https://issues.apache.org/jira/browse/OOZIE-989
             Project: Oozie
          Issue Type: Bug
            Reporter: Virag Kothari


When coordinator jobs are in catchup mode, the CoordMaterializeTriggerService may pick up those jobs and start materializing new actions for them. This may conflict with a test case that is forcing an action to be added.

For example, most of the test cases contain something like the code below, where a coordinator job and a coordinator action are added.
{code}
int actionNum = 1;
CoordinatorJobBean job = addRecordToCoordJobTable(CoordinatorJob.Status.RUNNING, false, false);
CoordinatorActionBean action = addRecordToCoordActionTable(job.getId(), actionNum, CoordinatorAction.Status.WAITING, "coord-action-get.xml", 0);
{code}

The CoordMaterializeTriggerService may pick up the RUNNING coord job and start adding actions for it. This causes 'addRecordToCoordActionTable' to fail because the action has already been inserted into the DB.
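
To make the conflict concrete, here is a minimal, self-contained sketch of the race. It does not use any Oozie classes: the map standing in for the coordinator action table, the `insertAction` helper, and the sample job ID are all hypothetical, and the sketch assumes an action is keyed by job ID plus action number. It only illustrates why the test's insert can fail when a background service has already inserted the same action.

```java
import java.util.HashMap;
import java.util.Map;

public class CatchupRaceSketch {
    // Hypothetical stand-in for the coordinator action table, keyed by action ID.
    static final Map<String, String> ACTION_TABLE = new HashMap<>();

    // Inserts a row and rejects duplicates, the way a primary-key constraint would.
    static void insertAction(String jobId, int actionNum, String status) {
        String actionId = jobId + "@" + actionNum; // assumed key format
        if (ACTION_TABLE.putIfAbsent(actionId, status) != null) {
            throw new IllegalStateException("action already exists: " + actionId);
        }
    }

    // Returns true if the test's own insert fails.
    public static boolean testInsertFails(boolean serviceRanFirst) {
        ACTION_TABLE.clear();
        String jobId = "0000000-sample-C"; // hypothetical job ID
        if (serviceRanFirst) {
            // The background service materializes action 1 before the test does.
            insertAction(jobId, 1, "WAITING");
        }
        try {
            // The test case forces the same action into the table.
            insertAction(jobId, 1, "WAITING");
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("service ran first, test insert fails: " + testInsertFails(true));
        System.out.println("service did not run, test insert fails: " + testInsertFails(false));
    }
}
```

Whether the service runs before the test's insert depends on timing, which would explain why the failures are intermittent.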

Below are some links where this is happening:

https://builds.apache.org/job/oozie-trunk-precommit-build/71/testReport/junit/org.apache.oozie.executor.jpa/TestCoordActionsPendingFalseStatusCountGetJPAExecutor/testCoordActionPendingFalseStatusCountGet/

https://builds.apache.org/job/oozie-trunk-precommit-build/86/testReport/junit/org.apache.oozie.executor.jpa/TestCoordJobGetActionsJPAExecutor/testCoordActionGet/

https://builds.apache.org/job/oozie-trunk-precommit-build/85/testReport/junit/org.apache.oozie.executor.jpa/TestCoordJobGetReadyActionsJPAExecutor/testCoordActionGet/

Also, most of the log information for these failing test cases is lost because LocalOozie is used to start the services. LocalOozie should not be used in test cases unless required.



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (OOZIE-989) Testcases failing intermittently where coordinator jobs are in catchup mode

Posted by "Robert Kanter (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OOZIE-989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452415#comment-13452415 ] 

Robert Kanter commented on OOZIE-989:
-------------------------------------

The first option sounds easier to do, but the second option sounds like a better fix. With the second option, though, we'd need to make sure that any new or modified tests use the new method.
                

[jira] [Commented] (OOZIE-989) Testcases failing intermittently where coordinator jobs are in catchup mode

Posted by "Virag Kothari (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OOZIE-989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450930#comment-13450930 ] 

Virag Kothari commented on OOZIE-989:
-------------------------------------

There are a couple of ways to solve the problem:

1) Exclude CoordMaterializeTriggerService from all test cases where a coord job and a coord action are added.

2) Modify the test cases so that coordinator jobs are created with a start time in the future, so that the materialization trigger service doesn't pick up those jobs. This can be done by adding a method to XDataTestCase, such as 'addRecordToCoordJobTableInFuture', which sets the start time in the future.

The 2nd option may be better, as all the services can be kept running.
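
A rough sketch of the idea behind the 2nd option follows. It is not the actual XDataTestCase or service code: the `CoordJob` class, `newJobInFuture`, and `pickForMaterialization` are hypothetical stand-ins, and the selection rule (pick a job when its start time falls within a lookahead window from now) is an assumption about how a materialization trigger might choose jobs. Under that assumption, the future offset must exceed the lookahead window for the job to be left alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Date;
import java.util.List;

public class FutureStartSketch {
    // Hypothetical, simplified coordinator job record.
    public static final class CoordJob {
        public final String id;
        public final Date startTime;

        public CoordJob(String id, Date startTime) {
            this.id = id;
            this.startTime = startTime;
        }
    }

    // Creates a job whose start time lies offsetMillis after 'now'.
    public static CoordJob newJobInFuture(String id, Date now, long offsetMillis) {
        return new CoordJob(id, new Date(now.getTime() + offsetMillis));
    }

    // Assumed selection rule: pick jobs starting at or before now + lookahead.
    public static List<String> pickForMaterialization(List<CoordJob> jobs, Date now,
                                                      long lookaheadMillis) {
        Date cutoff = new Date(now.getTime() + lookaheadMillis);
        List<String> picked = new ArrayList<>();
        for (CoordJob job : jobs) {
            if (!job.startTime.after(cutoff)) {
                picked.add(job.id);
            }
        }
        return picked;
    }

    public static void main(String[] args) {
        Date now = new Date();
        long oneHour = 60L * 60L * 1000L;
        List<CoordJob> jobs = Arrays.asList(
                new CoordJob("catchup-job", new Date(now.getTime() - oneHour)),
                newJobInFuture("future-job", now, 2 * oneHour));
        // Only the job that is already due is selected; the future job is skipped.
        System.out.println(pickForMaterialization(jobs, now, oneHour));
    }
}
```

With this approach the test keeps every service running and still owns the actions it inserts, at the cost that each test has to call the future-dated helper rather than the existing one.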



                