Posted to dev@oozie.apache.org by "Virag Kothari (JIRA)" <ji...@apache.org> on 2012/06/23 01:49:42 UTC
[jira] [Created] (OOZIE-885) A race condition can cause the
workflow/coordinator to run even after the bundle job is killed
Virag Kothari created OOZIE-885:
-----------------------------------
Summary: A race condition can cause the workflow/coordinator to run even after the bundle job is killed
Key: OOZIE-885
URL: https://issues.apache.org/jira/browse/OOZIE-885
Project: Oozie
Issue Type: Bug
Reporter: Virag Kothari
Assignee: Virag Kothari
Fix For: trunk, 3.2.1
Steps to reproduce:
1) Start the bundle job with a bunch of coordinators
2) Immediately kill it
Observation:
Some coordinators keep running.
Reason:
The bundle cannot kill a coordinator unless a coord-id is already associated with it.
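To illustrate the reason above, here is a minimal Java sketch of the race. It is not Oozie code: BundleAction, coordId and killBundle are hypothetical names chosen for this example, and the model assumes only what the description states, namely that a kill can act only on coordinators whose coord-id has already been recorded.

```java
import java.util.ArrayList;
import java.util.List;

public class BundleKillRaceSketch {

    // Hypothetical stand-in for the bundle's record of one coordinator.
    static final class BundleAction {
        final String coordName;
        String coordId;   // assumption: stays null until the coordinator submission is recorded
        boolean killed;

        BundleAction(String coordName, String coordId) {
            this.coordName = coordName;
            this.coordId = coordId;
        }
    }

    // Naive kill: it can only act on entries that already carry a coord-id.
    // Returns the names of the coordinators it could not reach.
    static List<String> killBundle(List<BundleAction> actions) {
        List<String> missed = new ArrayList<>();
        for (BundleAction action : actions) {
            if (action.coordId == null) {
                // Nothing to address the kill to; this coordinator may start later.
                missed.add(action.coordName);
            } else {
                action.killed = true;
            }
        }
        return missed;
    }

    public static void main(String[] args) {
        List<BundleAction> actions = new ArrayList<>();
        actions.add(new BundleAction("coord-A", "hypothetical-id-1")); // id already recorded
        actions.add(new BundleAction("coord-B", null));                // id not recorded yet
        System.out.println("Missed by kill: " + killBundle(actions));
    }
}
```

In this model, any coordinator that is still waiting for its ID when the kill arrives is skipped, which matches the observation that some coordinators keep running.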
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (OOZIE-885) A race condition can cause the
workflow/coordinator to run even after the bundle job is killed
Posted by "Mohammad Kamrul Islam (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/OOZIE-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406181#comment-13406181 ]
Mohammad Kamrul Islam commented on OOZIE-885:
---------------------------------------------
Please revisit the new try {} catch block added in the RecoveryService class.
[jira] [Commented] (OOZIE-885) A race condition can cause the
workflow/coordinator to run even after the bundle job is killed
Posted by "Virag Kothari (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/OOZIE-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421058#comment-13421058 ]
Virag Kothari commented on OOZIE-885:
-------------------------------------
It's already fixed in https://issues.apache.org/jira/browse/OOZIE-904. The test cases in trunk shouldn't fail.
[jira] [Updated] (OOZIE-885) A race condition can cause the
workflow/coordinator to run even after the bundle job is killed
Posted by "Virag Kothari (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/OOZIE-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Virag Kothari updated OOZIE-885:
--------------------------------
Attachment: OOZIE-885-v2.patch
The try/catch structure was changed in the wf, coord and bundle recovery.
The comment in bundlestatusupdate was corrected.
[jira] [Commented] (OOZIE-885) A race condition can cause the
workflow/coordinator to run even after the bundle job is killed
Posted by "Mohammad Kamrul Islam (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/OOZIE-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406721#comment-13406721 ]
Mohammad Kamrul Islam commented on OOZIE-885:
---------------------------------------------
+1
[jira] [Resolved] (OOZIE-885) A race condition can cause the
workflow/coordinator to run even after the bundle job is killed
Posted by "Robert Kanter (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/OOZIE-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Kanter resolved OOZIE-885.
---------------------------------
Resolution: Fixed
Sorry about that; git hadn't pulled all of the latest changes for some reason.
[jira] [Reopened] (OOZIE-885) A race condition can cause the
workflow/coordinator to run even after the bundle job is killed
Posted by "Robert Kanter (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/OOZIE-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Kanter reopened OOZIE-885:
---------------------------------
It looks like this patch broke a test in TestBundleStartXCommand:
{code}
-------------------------------------------------------------------------------
Test set: org.apache.oozie.command.bundle.TestBundleStartXCommand
-------------------------------------------------------------------------------
Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 153.229 sec <<< FAILURE!
testBundleStartNegative2(org.apache.oozie.command.bundle.TestBundleStartXCommand) Time elapsed: 0.004 sec <<< FAILURE!
junit.framework.AssertionFailedError: expected:<FAILED> but was:<RUNNING>
at junit.framework.Assert.fail(Assert.java:50)
at junit.framework.Assert.failNotEquals(Assert.java:287)
at junit.framework.Assert.assertEquals(Assert.java:67)
at junit.framework.Assert.assertEquals(Assert.java:74)
at org.apache.oozie.command.bundle.TestBundleStartXCommand.testBundleStartNegative2(TestBundleStartXCommand.java:220)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:243)
at junit.framework.TestSuite.run(TestSuite.java:238)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
at org.apache.maven.surefire.junitcore.ClassDemarcatingRunner.run(ClassDemarcatingRunner.java:58)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:24)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)
{code}
It runs successfully on the commit before this one in trunk.
[jira] [Updated] (OOZIE-885) A race condition can cause the
workflow/coordinator to run even after the bundle job is killed
Posted by "Virag Kothari (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/OOZIE-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Virag Kothari updated OOZIE-885:
--------------------------------
Attachment: OOZIE-885.patch
BundleStatusCommand modified to handle the case where coord-id is null.
Recovery service modified to log the error message correctly.
Testing:
Patch verified by Y! QE
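The patch itself is attached to the issue and is not reproduced in this thread, so the following Java sketch is only one hypothetical way a status update could account for a null coord-id. All names (BundleAction, requestKill, onCoordIdAssigned, killRequested) are assumptions made for illustration and are not taken from Oozie.

```java
public class NullCoordIdGuardSketch {

    enum Status { PREP, RUNNING, KILLED }

    // Hypothetical stand-in for the bundle's record of one coordinator.
    static final class BundleAction {
        String coordId;               // may still be null when a kill arrives
        Status status = Status.PREP;
        boolean killRequested;        // remembers a kill that could not be applied yet
    }

    // Records the kill; applies it immediately only if a coord-id is present.
    static void requestKill(BundleAction action) {
        action.killRequested = true;
        if (action.coordId != null) {
            action.status = Status.KILLED;
        }
    }

    // Called once the coordinator's ID becomes known.
    // A kill that arrived earlier is honoured here instead of being lost.
    static void onCoordIdAssigned(BundleAction action, String coordId) {
        action.coordId = coordId;
        action.status = action.killRequested ? Status.KILLED : Status.RUNNING;
    }

    public static void main(String[] args) {
        BundleAction action = new BundleAction();
        requestKill(action);                              // kill arrives first
        onCoordIdAssigned(action, "hypothetical-id-1");   // ID arrives afterwards
        System.out.println("Final status: " + action.status);
    }
}
```

The design choice in this sketch is to remember the kill request rather than drop it when the ID is missing, so the order in which the kill and the ID assignment arrive no longer matters.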