You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@oozie.apache.org by András Piros <an...@cloudera.com> on 2017/06/29 16:35:23 UTC

Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/
-----------------------------------------------------------

Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.


Repository: oozie-git


Description
-------

https://issues.apache.org/jira/browse/OOZIE-2854


Diffs
-----

  core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
  core/src/main/java/org/apache/oozie/ErrorCode.java 662e1edc9c4b23b3606c751bf5ed4b531ee7ac62 
  core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
  core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
  core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
  core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
  core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/DBOperationRetryHandler.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/MySQLRetryPredicate.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/NestedRetryUtil.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION 
  core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
  core/src/main/resources/oozie-default.xml c60a4581a84d4c67a1ac1cf3dfdc252b85ccd01c 
  core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
  core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
  core/src/test/java/org/apache/oozie/util/db/TestDBOperationRetryHandler.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
  minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
  minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION 
  minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION 
  minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
  minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
  minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
  minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION 
  minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
  pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 


Diff: https://reviews.apache.org/r/60544/diff/1/


Testing
-------

Tests covered in code:

Unit tests
==========

* testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures

Integration tests
=================

* using the {{MiniOozieTestCase}} framework
* fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
* following workflow scenarios:
* a very simple one consisting only of a `<start/>` and an `<end/>` node
* a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
* the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes

Test cases run:
```
mvn clean test -Dtest=TestDBOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService
```

Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.


Thanks,

András Piros


Re: Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

Posted by András Piros <an...@cloudera.com>.

> On June 30, 2017, 2:27 p.m., Peter Bacsko wrote:
> > core/src/main/java/org/apache/oozie/service/JPAService.java
> > Lines 267 (patched)
> > <https://reviews.apache.org/r/60544/diff/1/?file=1767096#file1767096line270>
> >
> >     Do we need this? Usually this is done only once, when Oozie starts. If there's a problem during startup IMO it's good to know about it and fail.
> >     
> >     Let's see other people's opinion.

For the unit tests' sake, and when there is a DB outage just in the moment when Oozie server restart / is redeployed, it's a must. The usual retry policies and logging apply for those cases as well - we surely will know when there is a temporary problem the retry logic can solve, or a permanent one.


> On June 30, 2017, 2:27 p.m., Peter Bacsko wrote:
> > core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java
> > Lines 43 (patched)
> > <https://reviews.apache.org/r/60544/diff/1/?file=1767102#file1767102line43>
> >
> >     Can't we move this under src/test ?

`FailingMySQLDriverWrapper` uses it, and it is needed for on-the-fly failure injection runtime.


> On June 30, 2017, 2:27 p.m., Peter Bacsko wrote:
> > core/src/main/java/org/apache/oozie/util/db/NestedRetryUtil.java
> > Lines 26 (patched)
> > <https://reviews.apache.org/r/60544/diff/1/?file=1767106#file1767106line26>
> >
> >     Let's add comments here and explain *properly* what this class does & why.

Renamed class and methods, as well as created test class that covers the functionality. Hope that helps.


> On June 30, 2017, 2:27 p.m., Peter Bacsko wrote:
> > core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java
> > Lines 29 (patched)
> > <https://reviews.apache.org/r/60544/diff/1/?file=1767107#file1767107line29>
> >
> >     Does this work properly? The exception that we receive is always JPAExecutorException so these exceptions has to be extracted from it. I don't see any logic (or calls to getAllExceptions()) to do that.

Reimplemented and created new test cases to be sure the blacklisting works properly.


> On June 30, 2017, 2:27 p.m., Peter Bacsko wrote:
> > core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java
> > Lines 28 (patched)
> > <https://reviews.apache.org/r/60544/diff/1/?file=1767108#file1767108line28>
> >
> >     Another test code - can we move this to src/test ?

We need this here because we want to inject `PersistenceException` instances also at runtime, for functional and stress test purposes.


> On June 30, 2017, 2:27 p.m., Peter Bacsko wrote:
> > core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java
> > Lines 43 (patched)
> > <https://reviews.apache.org/r/60544/diff/1/?file=1767108#file1767108line43>
> >
> >     I don't think we need such a generic logic to throw exceptions or do we? For testing purposes, it's more than enough to throw a specific exception I guess.

It's reusable and we may end up adding more `Exception` subclasses for functional testing purposes, e.g. for another DB type.


- András


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/#review179358
-----------------------------------------------------------


On June 29, 2017, 4:35 p.m., András Piros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60544/
> -----------------------------------------------------------
> 
> (Updated June 29, 2017, 4:35 p.m.)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/OOZIE-2854
> 
> 
> Diffs
> -----
> 
>   core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
>   core/src/main/java/org/apache/oozie/ErrorCode.java 662e1edc9c4b23b3606c751bf5ed4b531ee7ac62 
>   core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
>   core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
>   core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
>   core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
>   core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/DBOperationRetryHandler.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/MySQLRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/NestedRetryUtil.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION 
>   core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
>   core/src/main/resources/oozie-default.xml c60a4581a84d4c67a1ac1cf3dfdc252b85ccd01c 
>   core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
>   core/src/test/java/org/apache/oozie/util/db/TestDBOperationRetryHandler.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
>   minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
>   minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
>   minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
>   minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION 
>   minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
>   pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 
> 
> 
> Diff: https://reviews.apache.org/r/60544/diff/1/
> 
> 
> Testing
> -------
> 
> Tests covered in code:
> 
> Unit tests
> ==========
> 
> * testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures
> 
> Integration tests
> =================
> 
> * using the {{MiniOozieTestCase}} framework
> * fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
> * following workflow scenarios:
> * a very simple one consisting only of a `<start/>` and an `<end/>` node
> * a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
> * the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes
> 
> Test cases run:
> ```
> mvn clean test -Dtest=TestDBOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService
> ```
> 
> Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.
> 
> 
> Thanks,
> 
> András Piros
> 
>


Re: Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

Posted by Peter Bacsko <pb...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/#review179358
-----------------------------------------------------------




core/pom.xml
Lines 42 (patched)
<https://reviews.apache.org/r/60544/#comment254019>

    White space?



core/src/main/java/org/apache/oozie/ErrorCode.java
Lines 110 (patched)
<https://reviews.apache.org/r/60544/#comment254020>

    You can delete this. Originally this had a purporse but not anymore.



core/src/main/java/org/apache/oozie/service/JPAService.java
Lines 267 (patched)
<https://reviews.apache.org/r/60544/#comment254022>

    Do we need this? Usually this is done only once, when Oozie starts. If there's a problem during startup IMO it's good to know about it and fail.
    
    Let's see other people's opinion.



core/src/main/java/org/apache/oozie/util/db/DBOperationRetryHandler.java
Lines 44 (patched)
<https://reviews.apache.org/r/60544/#comment254034>

    This is not relevant anymore



core/src/main/java/org/apache/oozie/util/db/DBOperationRetryHandler.java
Lines 84 (patched)
<https://reviews.apache.org/r/60544/#comment254023>

    Not sure if we have to log this on ERROR level.



core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java
Lines 43 (patched)
<https://reviews.apache.org/r/60544/#comment254024>

    Can't we move this under src/test ?



core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java
Lines 157-161 (patched)
<https://reviews.apache.org/r/60544/#comment254026>

    It's cleaner if you add all these literals to sets then perform the startsWith() and contains() operations on the set elements in a for loop.



core/src/main/java/org/apache/oozie/util/db/MySQLRetryPredicate.java
Lines 27 (patched)
<https://reviews.apache.org/r/60544/#comment254035>

    Do we still need this class?



core/src/main/java/org/apache/oozie/util/db/NestedRetryUtil.java
Lines 26 (patched)
<https://reviews.apache.org/r/60544/#comment254033>

    Let's add comments here and explain *properly* what this class does & why.



core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java
Lines 29 (patched)
<https://reviews.apache.org/r/60544/#comment254028>

    Does this work properly? The exception that we receive is always JPAExecutorException so these exceptions has to be extracted from it. I don't see any logic (or calls to getAllExceptions()) to do that.



core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java
Lines 34-68 (patched)
<https://reviews.apache.org/r/60544/#comment254027>

    Create a Set<Class<?>> with all the classes mentioned here and process it accordingly in a loop to have smaller code.



core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java
Lines 28 (patched)
<https://reviews.apache.org/r/60544/#comment254029>

    Another test code - can we move this to src/test ?



core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java
Lines 43 (patched)
<https://reviews.apache.org/r/60544/#comment254030>

    I don't think we need such a generic logic to throw exceptions or do we? For testing purposes, it's more than enough to throw a specific exception I guess.



core/src/test/java/org/apache/oozie/test/XTestCase.java
Lines 872 (patched)
<https://reviews.apache.org/r/60544/#comment254031>

    Nice refactor:)



minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java
Lines 37 (patched)
<https://reviews.apache.org/r/60544/#comment254032>

    Could you add a short comment that describes what we test here exactly?


- Peter Bacsko


On jún. 29, 2017, 4:35 du, András Piros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60544/
> -----------------------------------------------------------
> 
> (Updated jún. 29, 2017, 4:35 du)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/OOZIE-2854
> 
> 
> Diffs
> -----
> 
>   core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
>   core/src/main/java/org/apache/oozie/ErrorCode.java 662e1edc9c4b23b3606c751bf5ed4b531ee7ac62 
>   core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
>   core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
>   core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
>   core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
>   core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/DBOperationRetryHandler.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/MySQLRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/NestedRetryUtil.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION 
>   core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
>   core/src/main/resources/oozie-default.xml c60a4581a84d4c67a1ac1cf3dfdc252b85ccd01c 
>   core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
>   core/src/test/java/org/apache/oozie/util/db/TestDBOperationRetryHandler.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
>   minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
>   minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
>   minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
>   minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION 
>   minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
>   pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 
> 
> 
> Diff: https://reviews.apache.org/r/60544/diff/1/
> 
> 
> Testing
> -------
> 
> Tests covered in code:
> 
> Unit tests
> ==========
> 
> * testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures
> 
> Integration tests
> =================
> 
> * using the {{MiniOozieTestCase}} framework
> * fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
> * following workflow scenarios:
> * a very simple one consisting only of a `<start/>` and an `<end/>` node
> * a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
> * the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes
> 
> Test cases run:
> ```
> mvn clean test -Dtest=TestDBOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService
> ```
> 
> Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.
> 
> 
> Thanks,
> 
> András Piros
> 
>


Re: Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

Posted by András Piros <an...@cloudera.com>.

> On July 4, 2017, 1:16 p.m., Attila Sasvari wrote:
> > core/pom.xml
> > Lines 39 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768464#file1768464line39>
> >
> >     Do not add this dependency twice:
> >     https://github.com/apache/oozie/blob/378f294c2cd92ac0d505201857f03746ce2e58ac/core/pom.xml#L743

Setting it to `compile` scope because we need `FailingMySQLDriverWrapper` to work.


> On July 4, 2017, 1:16 p.m., Attila Sasvari wrote:
> > core/src/main/java/org/apache/oozie/service/JPAService.java
> > Line 413 (original), 496 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768467#file1768467line499>
> >
> >     Use ``&& !updateQueryList.isEmpty()`` instead of size() >0

Using `CollectionUtils.isNotEmpty()`.


> On July 4, 2017, 1:16 p.m., Attila Sasvari wrote:
> > core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java
> > Lines 71 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768468#file1768468line71>
> >
> >     This a new named query that it is only used in ``XTestCase``, and there is no check is performed on it in any tests; it is is used in ``cleanUpDBTablesInternal()``. In other words, no REST API requests are using it for any tasks.
> >     
> >     As I see there are other named queries that are not used for meaningful purposes (i.e. they are only used in tests). It would be nice to go through all named queries and eliminate the "dead" ones. For example "GET_ACTIONS" is only used in tests.

Good idea, there is a lot more to do when speaking of constans / named queries / other JPA primitives. Will fire up a separate JIRA.


> On July 4, 2017, 1:16 p.m., Attila Sasvari wrote:
> > core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java
> > Lines 41 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768470#file1768470line41>
> >
> >     If driverClassName is null, we can return earlier throwing a SQLNestedException with appropriate message. This way it will not be needed to check a second time driverClassName is null.

I would keep the fix in line w/ [the original DBCP one](https://github.com/apache/commons-dbcp/blob/DBCP_1_4_x_BRANCH/src/java/org/apache/commons/dbcp/BasicDataSource.java#L1588-L1660).


> On July 4, 2017, 1:16 p.m., Attila Sasvari wrote:
> > core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java
> > Lines 42 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768470#file1768470line42>
> >
> >     1 try with multiple catch block should be enough

I would keep the fix in line w/ [the original DBCP one](https://github.com/apache/commons-dbcp/blob/DBCP_1_4_x_BRANCH/src/java/org/apache/commons/dbcp/BasicDataSource.java#L1588-L1660).


> On July 4, 2017, 1:16 p.m., Attila Sasvari wrote:
> > core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java
> > Lines 347 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768472#file1768472line347>
> >
> >     nit: OozieDBCLI also uses the tablenames. We could extract these into a central place perhaps in a separate Jira?

Good idea, there is a lot more to do when speaking of constans / named queries / other JPA primitives. Will fire up a separate JIRA.


> On July 4, 2017, 1:16 p.m., Attila Sasvari wrote:
> > core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java
> > Lines 31 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768475#file1768475line31>
> >
> >     No need for this annotation; if  RETRY_ATTEMPT_COUNTER is package private, then TestOperationRetryHandler will access it. Or you can restrict visibility of RETRY_ATTEMPT_COUNTER to private.

Excerpt from the Javadoc of the class `com.google.common.annotations.VisibleForTesting`:
```
An annotation that indicates that the visibility of a type or member has been relaxed to make the code testable.
```

So the field `OperationRetryHandler.NESTED_RETRY_ATTEMPT_COUNTER` would be private, but for sake of testability its visibility is relaxed to package protected - hence the annotation.


> On July 4, 2017, 1:16 p.m., Attila Sasvari wrote:
> > minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java
> > Lines 112 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768490#file1768490line114>
> >
> >     this is not necessary

From an IDE(A) perspective, it's good to see why your test actually breaks. And this `assumeFalse()` makes the root cause visible. From Maven, yes, `JAVA_HOME` can be considered as always set, so no issues there.

Without adding the additional workflow property having `JAVA_HOME` as value, the test will fail in an IDE(A):
```
junit.framework.AssertionFailedError: 
Expected :SUCCEEDED
Actual   :RUNNING
 <Click to see difference>


	at junit.framework.Assert.fail(Assert.java:57)
	at junit.framework.Assert.failNotEquals(Assert.java:329)
	at junit.framework.Assert.assertEquals(Assert.java:78)
	at junit.framework.Assert.assertEquals(Assert.java:86)
	at junit.framework.TestCase.assertEquals(TestCase.java:253)
	at org.apache.oozie.test.TestWorkflow.runWorkflowFromFile(TestWorkflow.java:167)
	at org.apache.oozie.test.TestWorkflow.testParallelFsAndShellWorkflowCompletesSuccessfully(TestWorkflow.java:117)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at junit.framework.TestCase.runTest(TestCase.java:176)
	at junit.framework.TestCase.runBare(TestCase.java:141)
	at junit.framework.TestResult$1.protect(TestResult.java:122)
	at junit.framework.TestResult.runProtected(TestResult.java:142)
	at junit.framework.TestResult.run(TestResult.java:125)
	at junit.framework.TestCase.run(TestCase.java:129)
	at junit.framework.TestSuite.runTest(TestSuite.java:255)
	at junit.framework.TestSuite.run(TestSuite.java:250)
	at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
	at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
	at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
	at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
```


- András


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/#review179556
-----------------------------------------------------------


On July 3, 2017, 3:15 p.m., András Piros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60544/
> -----------------------------------------------------------
> 
> (Updated July 3, 2017, 3:15 p.m.)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/OOZIE-2854
> 
> 
> Diffs
> -----
> 
>   core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
>   core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
>   core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
>   core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
>   core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
>   core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RetryAttemptCounter.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION 
>   core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
>   core/src/main/resources/oozie-default.xml c60a4581a84d4c67a1ac1cf3dfdc252b85ccd01c 
>   core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
>   core/src/test/java/org/apache/oozie/util/db/TestOozieDmlStatementPredicate.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestOperationRetryHandler.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestRetryAttemptCounter.java PRE-CREATION 
>   minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
>   minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
>   minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
>   minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
>   minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION 
>   minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
>   pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 
> 
> 
> Diff: https://reviews.apache.org/r/60544/diff/2/
> 
> 
> Testing
> -------
> 
> Tests covered in code:
> 
> Unit tests
> ==========
> 
> * testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures
> 
> Integration tests
> =================
> 
> * using the `MiniOozieTestCase` framework
> * fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
> * following workflow scenarios:
> * a very simple one consisting only of a `<start/>` and an `<end/>` node
> * a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
> * the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes
> 
> Test cases run:
> ```
> mvn clean test -Dtest=TestOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService,TestRetryAttemptCounter
> ```
> 
> Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.
> 
> 
> Thanks,
> 
> András Piros
> 
>


Re: Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

Posted by Attila Sasvari <as...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/#review179556
-----------------------------------------------------------



Please document the feature and its configuration in the twiki files.


core/pom.xml
Lines 39 (patched)
<https://reviews.apache.org/r/60544/#comment254321>

    Do not add this dependency twice:
    https://github.com/apache/oozie/blob/378f294c2cd92ac0d505201857f03746ce2e58ac/core/pom.xml#L743



core/src/main/java/org/apache/oozie/service/JPAService.java
Line 273 (original), 330 (patched)
<https://reviews.apache.org/r/60544/#comment254347>

    - Extract method
    ```
    if (em.getTransaction().isActive()) {
    ...
      em.getTransaction().commit();
    }
    ```
    
    - Extract constant "org.apache.oozie.command.SkipCommitFaultInjection"



core/src/main/java/org/apache/oozie/service/JPAService.java
Lines 351 (patched)
<https://reviews.apache.org/r/60544/#comment254322>

    remove dead comnent



core/src/main/java/org/apache/oozie/service/JPAService.java
Line 333 (original), 399 (patched)
<https://reviews.apache.org/r/60544/#comment254375>

    use ``commitTransaction(em)``



core/src/main/java/org/apache/oozie/service/JPAService.java
Lines 486 (patched)
<https://reviews.apache.org/r/60544/#comment254365>

    this comment is not necessary



core/src/main/java/org/apache/oozie/service/JPAService.java
Lines 490 (patched)
<https://reviews.apache.org/r/60544/#comment254366>

    I am not sure if ``em.getTransaction().isActive()`` will ever be evaluated true.
    
    However, if so and rollback is not successfull, we should return and increase actual retry count (rollback is a kind of retry attempt).



core/src/main/java/org/apache/oozie/service/JPAService.java
Line 413 (original), 496 (patched)
<https://reviews.apache.org/r/60544/#comment254364>

    Use ``&& !updateQueryList.isEmpty()`` instead of size() >0



core/src/main/java/org/apache/oozie/service/JPAService.java
Line 499 (original), 599 (patched)
<https://reviews.apache.org/r/60544/#comment254346>

    Log that no results were returned like in ``executeGet`` to keep consistency



core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java
Lines 71 (patched)
<https://reviews.apache.org/r/60544/#comment254369>

    This a new named query that it is only used in ``XTestCase``, and there is no check is performed on it in any tests; it is is used in ``cleanUpDBTablesInternal()``. In other words, no REST API requests are using it for any tasks.
    
    As I see there are other named queries that are not used for meaningful purposes (i.e. they are only used in tests). It would be nice to go through all named queries and eliminate the "dead" ones. For example "GET_ACTIONS" is only used in tests.



core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java
Lines 33 (patched)
<https://reviews.apache.org/r/60544/#comment254352>

    Could you add DBCP-333 to reveal what the known bug is?



core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java
Lines 41 (patched)
<https://reviews.apache.org/r/60544/#comment254350>

    If driverClassName is null, we can return earlier throwing a SQLNestedException with appropriate message. This way it will not be needed to check a second time driverClassName is null.



core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java
Lines 42 (patched)
<https://reviews.apache.org/r/60544/#comment254349>

    1 try with multiple catch block should be enough



core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java
Lines 347 (patched)
<https://reviews.apache.org/r/60544/#comment254360>

    nit: OozieDBCLI also uses the tablenames. We could extract these into a central place perhaps in a separate Jira?



core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java
Lines 31 (patched)
<https://reviews.apache.org/r/60544/#comment254329>

    No need for this annotation; if  RETRY_ATTEMPT_COUNTER is package private, then TestOperationRetryHandler will access it. Or you can restrict visibility of RETRY_ATTEMPT_COUNTER to private.



core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java
Lines 74 (patched)
<https://reviews.apache.org/r/60544/#comment254362>

    As a user, this error message would confuse me. 
    ``Retry attempts have been exhausted.`` / ``Database operation retry limit has been exceeded.`` would be more descriptive in my opinion.



core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java
Lines 86 (patched)
<https://reviews.apache.org/r/60544/#comment254363>

    nit: 
    "Exception is not on blacklist" or something would be better than "Exception does not apply"



core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java
Lines 101 (patched)
<https://reviews.apache.org/r/60544/#comment254382>

    nit: sleepBeforeRetry or something would be more expressive / better describe what this function does.



core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java
Lines 102 (patched)
<https://reviews.apache.org/r/60544/#comment254377>

    e is never used



core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java
Lines 24 (patched)
<https://reviews.apache.org/r/60544/#comment254344>

    Do not use wildcard import.



core/src/test/java/org/apache/oozie/test/XTestCase.java
Lines 872 (patched)
<https://reviews.apache.org/r/60544/#comment254323>

    What is the added value of ``E extends Object`` here? E already extends Object.



minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java
Lines 109 (patched)
<https://reviews.apache.org/r/60544/#comment254325>

    there is no need to divide by 1000 to determine if second is even



minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java
Lines 112 (patched)
<https://reviews.apache.org/r/60544/#comment254326>

    this is not necessary


- Attila Sasvari


On July 3, 2017, 3:15 p.m., András Piros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60544/
> -----------------------------------------------------------
> 
> (Updated July 3, 2017, 3:15 p.m.)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/OOZIE-2854
> 
> 
> Diffs
> -----
> 
>   core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
>   core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
>   core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
>   core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
>   core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
>   core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RetryAttemptCounter.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION 
>   core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
>   core/src/main/resources/oozie-default.xml c60a4581a84d4c67a1ac1cf3dfdc252b85ccd01c 
>   core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
>   core/src/test/java/org/apache/oozie/util/db/TestOozieDmlStatementPredicate.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestOperationRetryHandler.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestRetryAttemptCounter.java PRE-CREATION 
>   minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
>   minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
>   minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
>   minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
>   minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION 
>   minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
>   pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 
> 
> 
> Diff: https://reviews.apache.org/r/60544/diff/2/
> 
> 
> Testing
> -------
> 
> Tests covered in code:
> 
> Unit tests
> ==========
> 
> * testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures
> 
> Integration tests
> =================
> 
> * using the `MiniOozieTestCase` framework
> * fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
> * following workflow scenarios:
> * a very simple one consisting only of a `<start/>` and an `<end/>` node
> * a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
> * the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes
> 
> Test cases run:
> ```
> mvn clean test -Dtest=TestOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService,TestRetryAttemptCounter
> ```
> 
> Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.
> 
> 
> Thanks,
> 
> András Piros
> 
>


Re: Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

Posted by Peter Bacsko <pb...@cloudera.com>.

> On júl. 4, 2017, 1:02 du, Peter Bacsko wrote:
> > core/src/main/java/org/apache/oozie/util/db/RetryAttemptCounter.java
> > Lines 28 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768477#file1768477line28>
> >
> >     Let's rewrite this explanation and discuss f2f.

My idea:

"This class tracks nested OperationRetryHandler calls. Some JPAExecutor implementations call other JPAExecutors. This results in two (or possibly more) OperationRetryHandler.executeWithRetry() calls. If the innermost retry handler has exhausted all attempts and re-throws the exception, then the outer handler catches it and would re-start the JPA operation again. In order to avoid this, RetryHandlers must communicate with each other on the same thread by incrementing/decrementing the nesting level and signalling whether the maximum number of attempts have been reached. 

We use thread locals because RetryHandlers might be called from different threads in parallel. If the nesting level is 0, it's important to reset the "retryAttemptsExhausted" back to false since this variable is re-used in the thread pool."


- Peter


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/#review179570
-----------------------------------------------------------


On júl. 3, 2017, 3:15 du, András Piros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60544/
> -----------------------------------------------------------
> 
> (Updated júl. 3, 2017, 3:15 du)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/OOZIE-2854
> 
> 
> Diffs
> -----
> 
>   core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
>   core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
>   core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
>   core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
>   core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
>   core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RetryAttemptCounter.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION 
>   core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
>   core/src/main/resources/oozie-default.xml c60a4581a84d4c67a1ac1cf3dfdc252b85ccd01c 
>   core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
>   core/src/test/java/org/apache/oozie/util/db/TestOozieDmlStatementPredicate.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestOperationRetryHandler.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestRetryAttemptCounter.java PRE-CREATION 
>   minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
>   minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
>   minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
>   minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
>   minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION 
>   minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
>   pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 
> 
> 
> Diff: https://reviews.apache.org/r/60544/diff/2/
> 
> 
> Testing
> -------
> 
> Tests covered in code:
> 
> Unit tests
> ==========
> 
> * testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures
> 
> Integration tests
> =================
> 
> * using the `MiniOozieTestCase` framework
> * fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
> * following workflow scenarios:
> * a very simple one consisting only of a `<start/>` and an `<end/>` node
> * a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
> * the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes
> 
> Test cases run:
> ```
> mvn clean test -Dtest=TestOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService,TestRetryAttemptCounter
> ```
> 
> Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.
> 
> 
> Thanks,
> 
> András Piros
> 
>


Re: Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

Posted by András Piros <an...@cloudera.com>.

> On July 4, 2017, 1:02 p.m., Peter Bacsko wrote:
> > core/src/main/java/org/apache/oozie/service/JPAService.java
> > Line 272 (original), 329 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768467#file1768467line332>
> >
> >     I really don't like this code! It's been here for a long time, but it's very smelly. 
> >     
> >     Should we remove this? Could pls check if we have tests that take advantage of this fault injection>

Unfortunately there are lots of tests based on `SkipCommitFaultInjection` system property:
```
TestActionFailover
TestBatchQueryExecutor
TestBundleJobsDeleteJPAExecutor
TestCoordActionsDeleteJPAExecutor
TestCoordJobsDeleteJPAExecutor
TestWorkflowJobsDeleteJPAExecutor
TestSLACalculationJPAExecutor
```
so I'm letting it stay :-(


> On July 4, 2017, 1:02 p.m., Peter Bacsko wrote:
> > core/src/main/java/org/apache/oozie/service/JPAService.java
> > Line 334 (original), 400 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768467#file1768467line403>
> >
> >     Check this injection usage

Unfortunately there are lots of tests based on `SkipCommitFaultInjection` system property:
```
TestActionFailover
TestBatchQueryExecutor
TestBundleJobsDeleteJPAExecutor
TestCoordActionsDeleteJPAExecutor
TestCoordJobsDeleteJPAExecutor
TestWorkflowJobsDeleteJPAExecutor
TestSLACalculationJPAExecutor
```
so I'm letting it stay :-(


> On July 4, 2017, 1:02 p.m., Peter Bacsko wrote:
> > core/src/main/java/org/apache/oozie/service/JPAService.java
> > Line 432 (original), 515 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768467#file1768467line518>
> >
> >     Just like above - check if this is worth keeping

Unfortunately there are lots of tests based on `SkipCommitFaultInjection` system property:
```
TestActionFailover
TestBatchQueryExecutor
TestBundleJobsDeleteJPAExecutor
TestCoordActionsDeleteJPAExecutor
TestCoordJobsDeleteJPAExecutor
TestWorkflowJobsDeleteJPAExecutor
TestSLACalculationJPAExecutor
```
so I'm letting it stay :-(


> On July 4, 2017, 1:02 p.m., Peter Bacsko wrote:
> > core/src/main/java/org/apache/oozie/util/db/RetryAttemptCounter.java
> > Lines 28 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768477#file1768477line28>
> >
> >     Let's rewrite this explanation and discuss f2f.
> 
> Peter Bacsko wrote:
>     My idea:
>     
>     "This class tracks nested OperationRetryHandler calls. Some JPAExecutor implementations call other JPAExecutors. This results in two (or possibly more) OperationRetryHandler.executeWithRetry() calls. If the innermost retry handler has exhausted all attempts and re-throws the exception, then the outer handler catches it and would re-start the JPA operation again. In order to avoid this, RetryHandlers must communicate with each other on the same thread by incrementing/decrementing the nesting level and signalling whether the maximum number of attempts have been reached. 
>     
>     We use thread locals because RetryHandlers might be called from different threads in parallel. If the nesting level is 0, it's important to reset the "retryAttemptsExhausted" back to false since this variable is re-used in the thread pool."

Thanks, incorporated the class javadocs.


> On July 4, 2017, 1:02 p.m., Peter Bacsko wrote:
> > minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java
> > Lines 80 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768488#file1768488line80>
> >
> >     This assertion will not be evaluated by JUnit because it's called on a different thread. Instead modify a volatile boolean flag that something wasn't quite right.

Using a `private volatile Exception` here.


- András


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/#review179570
-----------------------------------------------------------


On July 3, 2017, 3:15 p.m., András Piros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60544/
> -----------------------------------------------------------
> 
> (Updated July 3, 2017, 3:15 p.m.)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/OOZIE-2854
> 
> 
> Diffs
> -----
> 
>   core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
>   core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
>   core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
>   core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
>   core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
>   core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RetryAttemptCounter.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION 
>   core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
>   core/src/main/resources/oozie-default.xml c60a4581a84d4c67a1ac1cf3dfdc252b85ccd01c 
>   core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
>   core/src/test/java/org/apache/oozie/util/db/TestOozieDmlStatementPredicate.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestOperationRetryHandler.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestRetryAttemptCounter.java PRE-CREATION 
>   minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
>   minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
>   minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
>   minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
>   minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION 
>   minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
>   pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 
> 
> 
> Diff: https://reviews.apache.org/r/60544/diff/2/
> 
> 
> Testing
> -------
> 
> Tests covered in code:
> 
> Unit tests
> ==========
> 
> * testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures
> 
> Integration tests
> =================
> 
> * using the `MiniOozieTestCase` framework
> * fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
> * following workflow scenarios:
> * a very simple one consisting only of a `<start/>` and an `<end/>` node
> * a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
> * the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes
> 
> Test cases run:
> ```
> mvn clean test -Dtest=TestOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService,TestRetryAttemptCounter
> ```
> 
> Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.
> 
> 
> Thanks,
> 
> András Piros
> 
>


Re: Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

Posted by Peter Bacsko <pb...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/#review179570
-----------------------------------------------------------




core/src/main/java/org/apache/oozie/service/JPAService.java
Line 272 (original), 329 (patched)
<https://reviews.apache.org/r/60544/#comment254367>

    I really don't like this code! It's been here for a long time, but it's very smelly. 
    
    Should we remove this? Could pls check if we have tests that take advantage of this fault injection>



core/src/main/java/org/apache/oozie/service/JPAService.java
Line 334 (original), 400 (patched)
<https://reviews.apache.org/r/60544/#comment254372>

    Check this injection usage



core/src/main/java/org/apache/oozie/service/JPAService.java
Line 432 (original), 515 (patched)
<https://reviews.apache.org/r/60544/#comment254368>

    Just like above - check if this is worth keeping



core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java
Lines 27 (patched)
<https://reviews.apache.org/r/60544/#comment254371>

    Could you add comments about this predicate class? Why we need it, when it applies, etc. 
    
    Something like:
    
    "A predicate which applies when a given exception (or its causes) are NOT blacklisted. Blacklisted exceptions in this class do not indicate a network failure, therefore no retry should take place"



core/src/main/java/org/apache/oozie/util/db/RetryAttemptCounter.java
Lines 28 (patched)
<https://reviews.apache.org/r/60544/#comment254376>

    Let's rewrite this explanation and discuss f2f.



core/src/test/java/org/apache/oozie/test/XTestCase.java
Lines 872 (patched)
<https://reviews.apache.org/r/60544/#comment254370>

    Just <E> instead of <E extends Object>



minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java
Lines 54 (patched)
<https://reviews.apache.org/r/60544/#comment254373>

    If we modify the log4j setup, don't we have to restore it in teardown?



minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java
Lines 80 (patched)
<https://reviews.apache.org/r/60544/#comment254374>

    This assertion will not be evaluated by JUnit because it's called on a different thread. Instead modify a volatile boolean flag that something wasn't quite right.


- Peter Bacsko


On júl. 3, 2017, 3:15 du, András Piros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60544/
> -----------------------------------------------------------
> 
> (Updated júl. 3, 2017, 3:15 du)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/OOZIE-2854
> 
> 
> Diffs
> -----
> 
>   core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
>   core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
>   core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
>   core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
>   core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
>   core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RetryAttemptCounter.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION 
>   core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
>   core/src/main/resources/oozie-default.xml c60a4581a84d4c67a1ac1cf3dfdc252b85ccd01c 
>   core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
>   core/src/test/java/org/apache/oozie/util/db/TestOozieDmlStatementPredicate.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestOperationRetryHandler.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestRetryAttemptCounter.java PRE-CREATION 
>   minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
>   minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
>   minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
>   minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
>   minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION 
>   minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
>   pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 
> 
> 
> Diff: https://reviews.apache.org/r/60544/diff/2/
> 
> 
> Testing
> -------
> 
> Tests covered in code:
> 
> Unit tests
> ==========
> 
> * testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures
> 
> Integration tests
> =================
> 
> * using the `MiniOozieTestCase` framework
> * fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
> * following workflow scenarios:
> * a very simple one consisting only of a `<start/>` and an `<end/>` node
> * a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
> * the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes
> 
> Test cases run:
> ```
> mvn clean test -Dtest=TestOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService,TestRetryAttemptCounter
> ```
> 
> Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.
> 
> 
> Thanks,
> 
> András Piros
> 
>


Re: Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

Posted by Peter Bacsko <pb...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/#review179775
-----------------------------------------------------------




core/src/main/java/org/apache/oozie/service/JPAService.java
Line 27 (original), 28 (patched)
<https://reviews.apache.org/r/60544/#comment254607>

    Import



core/src/main/java/org/apache/oozie/util/db/RetryAttemptState.java
Lines 42-48 (patched)
<https://reviews.apache.org/r/60544/#comment254608>

    I think this part can be removed. The commens above should be enough.



core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java
Lines 26 (patched)
<https://reviews.apache.org/r/60544/#comment254610>

    Import



core/src/test/java/org/apache/oozie/util/db/TestRetryAttemptState.java
Lines 47-62 (patched)
<https://reviews.apache.org/r/60544/#comment254611>

    This test is very tricky... in this form, it doesn't test much - you change values on different threads and validate the values on the test's thread, where no state changes happen. 
    
    If you insist on validating the proper thread-local behavior, I suggest:
    
    1) Use only a single thread where you increment the nesting level plus set the exhausted flag
    2) In the test, you assert that inProgressCount = 0 and isExhausted = false.


- Peter Bacsko


On júl. 6, 2017, 9:13 de, András Piros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60544/
> -----------------------------------------------------------
> 
> (Updated júl. 6, 2017, 9:13 de)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/OOZIE-2854
> 
> 
> Diffs
> -----
> 
>   core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
>   core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
>   core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
>   core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
>   core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
>   core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RetryAttemptState.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION 
>   core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
>   core/src/main/resources/oozie-default.xml 5629a897b4a1834554a1a5f7a21984ea857ed133 
>   core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
>   core/src/test/java/org/apache/oozie/command/SkipCommitFaultInjection.java  
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
>   core/src/test/java/org/apache/oozie/util/db/TestOozieDmlStatementPredicate.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestOperationRetryHandler.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestRetryAttemptState.java PRE-CREATION 
>   docs/src/site/twiki/AG_Install.twiki 2125442c273661d54f3566cbe41ead822c1f7baa 
>   minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
>   minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
>   minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
>   minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
>   minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION 
>   minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
>   pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 
> 
> 
> Diff: https://reviews.apache.org/r/60544/diff/5/
> 
> 
> Testing
> -------
> 
> Tests covered in code:
> 
> Unit tests
> ==========
> 
> * testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures
> 
> Integration tests
> =================
> 
> * using the `MiniOozieTestCase` framework
> * fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
> * following workflow scenarios:
> * a very simple one consisting only of a `<start/>` and an `<end/>` node
> * a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
> * the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes
> 
> Test cases run:
> ```
> mvn clean test -Dtest=TestOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService,TestRetryAttemptState
> ```
> 
> Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.
> 
> 
> Thanks,
> 
> András Piros
> 
>


Re: Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

Posted by Peter Bacsko <pb...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/#review179784
-----------------------------------------------------------




minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java
Lines 94 (patched)
<https://reviews.apache.org/r/60544/#comment254628>

    You have to wait until all calls to getResultListAndExecuteUpdateOnWorkflowBeans() complete. At this point the are possibly still running. I suggest using a CountDownLatch (which counts down from 10) for this purpose.


- Peter Bacsko


On júl. 6, 2017, 2:58 du, András Piros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60544/
> -----------------------------------------------------------
> 
> (Updated júl. 6, 2017, 2:58 du)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/OOZIE-2854
> 
> 
> Diffs
> -----
> 
>   core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
>   core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
>   core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
>   core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
>   core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
>   core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RetryAttemptState.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION 
>   core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
>   core/src/main/resources/oozie-default.xml 5629a897b4a1834554a1a5f7a21984ea857ed133 
>   core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
>   core/src/test/java/org/apache/oozie/command/SkipCommitFaultInjection.java  
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
>   core/src/test/java/org/apache/oozie/util/db/TestOozieDmlStatementPredicate.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestOperationRetryHandler.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestRetryAttemptState.java PRE-CREATION 
>   docs/src/site/twiki/AG_Install.twiki 2125442c273661d54f3566cbe41ead822c1f7baa 
>   minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
>   minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
>   minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
>   minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
>   minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION 
>   minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
>   pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 
> 
> 
> Diff: https://reviews.apache.org/r/60544/diff/6/
> 
> 
> Testing
> -------
> 
> Tests covered in code:
> 
> Unit tests
> ==========
> 
> * testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures
> 
> Integration tests
> =================
> 
> * using the `MiniOozieTestCase` framework
> * fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
> * following workflow scenarios:
> * a very simple one consisting only of a `<start/>` and an `<end/>` node
> * a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
> * the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes
> 
> Test cases run:
> ```
> mvn clean test -Dtest=TestOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService,TestRetryAttemptState
> ```
> 
> Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.
> 
> 
> Thanks,
> 
> András Piros
> 
>


Re: Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

Posted by András Piros <an...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/
-----------------------------------------------------------

(Updated July 6, 2017, 4:08 p.m.)


Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.


Changes
-------

One more thing.


Repository: oozie-git


Description
-------

https://issues.apache.org/jira/browse/OOZIE-2854


Diffs (updated)
-----

  core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
  core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
  core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
  core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
  core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
  core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/RetryAttemptState.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION 
  core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
  core/src/main/resources/oozie-default.xml 5629a897b4a1834554a1a5f7a21984ea857ed133 
  core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
  core/src/test/java/org/apache/oozie/command/SkipCommitFaultInjection.java  
  core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
  core/src/test/java/org/apache/oozie/util/db/TestOozieDmlStatementPredicate.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestOperationRetryHandler.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestRetryAttemptState.java PRE-CREATION 
  docs/src/site/twiki/AG_Install.twiki 2125442c273661d54f3566cbe41ead822c1f7baa 
  minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
  minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION 
  minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION 
  minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
  minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
  minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
  minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION 
  minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
  pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 


Diff: https://reviews.apache.org/r/60544/diff/7/

Changes: https://reviews.apache.org/r/60544/diff/6-7/


Testing
-------

Tests covered in code:

Unit tests
==========

* testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures

Integration tests
=================

* using the `MiniOozieTestCase` framework
* fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
* following workflow scenarios:
* a very simple one consisting only of a `<start/>` and an `<end/>` node
* a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
* the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes

Test cases run:
```
mvn clean test -Dtest=TestOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService,TestRetryAttemptState
```

Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.


Thanks,

András Piros


Re: Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

Posted by András Piros <an...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/
-----------------------------------------------------------

(Updated July 6, 2017, 2:58 p.m.)


Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.


Changes
-------

One more round on review comments.


Repository: oozie-git


Description
-------

https://issues.apache.org/jira/browse/OOZIE-2854


Diffs (updated)
-----

  core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
  core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
  core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
  core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
  core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
  core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/RetryAttemptState.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION 
  core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
  core/src/main/resources/oozie-default.xml 5629a897b4a1834554a1a5f7a21984ea857ed133 
  core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
  core/src/test/java/org/apache/oozie/command/SkipCommitFaultInjection.java  
  core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
  core/src/test/java/org/apache/oozie/util/db/TestOozieDmlStatementPredicate.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestOperationRetryHandler.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestRetryAttemptState.java PRE-CREATION 
  docs/src/site/twiki/AG_Install.twiki 2125442c273661d54f3566cbe41ead822c1f7baa 
  minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
  minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION 
  minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION 
  minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
  minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
  minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
  minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION 
  minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
  pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 


Diff: https://reviews.apache.org/r/60544/diff/6/

Changes: https://reviews.apache.org/r/60544/diff/5-6/


Testing
-------

Tests covered in code:

Unit tests
==========

* testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures

Integration tests
=================

* using the `MiniOozieTestCase` framework
* fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
* following workflow scenarios:
* a very simple one consisting only of a `<start/>` and an `<end/>` node
* a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
* the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes

Test cases run:
```
mvn clean test -Dtest=TestOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService,TestRetryAttemptState
```

Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.


Thanks,

András Piros


Re: Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

Posted by András Piros <an...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/
-----------------------------------------------------------

(Updated July 6, 2017, 9:13 a.m.)


Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.


Changes
-------

Addressing remaining review comments, as well as documenting the feature.


Repository: oozie-git


Description
-------

https://issues.apache.org/jira/browse/OOZIE-2854


Diffs (updated)
-----

  core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
  core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
  core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
  core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
  core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
  core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/RetryAttemptState.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION 
  core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
  core/src/main/resources/oozie-default.xml 5629a897b4a1834554a1a5f7a21984ea857ed133 
  core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
  core/src/test/java/org/apache/oozie/command/SkipCommitFaultInjection.java  
  core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
  core/src/test/java/org/apache/oozie/util/db/TestOozieDmlStatementPredicate.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestOperationRetryHandler.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestRetryAttemptState.java PRE-CREATION 
  docs/src/site/twiki/AG_Install.twiki 2125442c273661d54f3566cbe41ead822c1f7baa 
  minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
  minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION 
  minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION 
  minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
  minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
  minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
  minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION 
  minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
  pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 


Diff: https://reviews.apache.org/r/60544/diff/5/

Changes: https://reviews.apache.org/r/60544/diff/4-5/


Testing
-------

Tests covered in code:

Unit tests
==========

* testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures

Integration tests
=================

* using the `MiniOozieTestCase` framework
* fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
* following workflow scenarios:
* a very simple one consisting only of a `<start/>` and an `<end/>` node
* a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
* the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes

Test cases run:
```
mvn clean test -Dtest=TestOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService,TestRetryAttemptState
```

Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.


Thanks,

András Piros


Re: Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

Posted by András Piros <an...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/
-----------------------------------------------------------

(Updated July 6, 2017, 9:12 a.m.)


Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.


Repository: oozie-git


Description
-------

https://issues.apache.org/jira/browse/OOZIE-2854


Diffs
-----

  core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
  core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
  core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
  core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
  core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
  core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/RetryAttemptState.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION 
  core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
  core/src/main/resources/oozie-default.xml 5629a897b4a1834554a1a5f7a21984ea857ed133 
  core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
  core/src/test/java/org/apache/oozie/command/SkipCommitFaultInjection.java  
  core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
  core/src/test/java/org/apache/oozie/util/db/TestOozieDmlStatementPredicate.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestOperationRetryHandler.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestRetryAttemptState.java PRE-CREATION 
  minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
  minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION 
  minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION 
  minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
  minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
  minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
  minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION 
  minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
  pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 


Diff: https://reviews.apache.org/r/60544/diff/4/


Testing (updated)
-------

Tests covered in code:

Unit tests
==========

* testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures

Integration tests
=================

* using the `MiniOozieTestCase` framework
* fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
* following workflow scenarios:
* a very simple one consisting only of a `<start/>` and an `<end/>` node
* a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
* the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes

Test cases run:
```
mvn clean test -Dtest=TestOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService,TestRetryAttemptState
```

Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.


Thanks,

András Piros


Re: Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

Posted by András Piros <an...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/
-----------------------------------------------------------

(Updated July 5, 2017, 5:45 p.m.)


Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.


Changes
-------

Addressing latest review comments. Still need to document in TWiki.


Repository: oozie-git


Description
-------

https://issues.apache.org/jira/browse/OOZIE-2854


Diffs (updated)
-----

  core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
  core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
  core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
  core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
  core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
  core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/RetryAttemptState.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION 
  core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
  core/src/main/resources/oozie-default.xml 5629a897b4a1834554a1a5f7a21984ea857ed133 
  core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
  core/src/test/java/org/apache/oozie/command/SkipCommitFaultInjection.java  
  core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
  core/src/test/java/org/apache/oozie/util/db/TestOozieDmlStatementPredicate.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestOperationRetryHandler.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestRetryAttemptState.java PRE-CREATION 
  minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
  minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION 
  minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION 
  minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
  minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
  minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
  minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION 
  minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
  pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 


Diff: https://reviews.apache.org/r/60544/diff/4/

Changes: https://reviews.apache.org/r/60544/diff/3-4/


Testing
-------

Tests covered in code:

Unit tests
==========

* testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures

Integration tests
=================

* using the `MiniOozieTestCase` framework
* fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
* following workflow scenarios:
* a very simple one consisting only of a `<start/>` and an `<end/>` node
* a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
* the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes

Test cases run:
```
mvn clean test -Dtest=TestOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService,TestRetryAttemptCounter
```

Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.


Thanks,

András Piros


Re: Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

Posted by András Piros <an...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/
-----------------------------------------------------------

(Updated July 5, 2017, 3:44 p.m.)


Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.


Changes
-------

Addressing latest review comments.


Repository: oozie-git


Description
-------

https://issues.apache.org/jira/browse/OOZIE-2854


Diffs (updated)
-----

  core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
  core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
  core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
  core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
  core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
  core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
  core/src/main/resources/oozie-default.xml 5629a897b4a1834554a1a5f7a21984ea857ed133 
  core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
  core/src/test/java/org/apache/oozie/command/SkipCommitFaultInjection.java 158fbfafd2ed4ebc741c4d33252d92e5bb77bb63 
  core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
  minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
  minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
  minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
  minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
  minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
  pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 


Diff: https://reviews.apache.org/r/60544/diff/3/

Changes: https://reviews.apache.org/r/60544/diff/2-3/


Testing
-------

Tests covered in code:

Unit tests
==========

* testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures

Integration tests
=================

* using the `MiniOozieTestCase` framework
* fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
* following workflow scenarios:
* a very simple one consisting only of a `<start/>` and an `<end/>` node
* a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
* the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes

Test cases run:
```
mvn clean test -Dtest=TestOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService,TestRetryAttemptCounter
```

Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.


Thanks,

András Piros


Re: Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

Posted by András Piros <an...@cloudera.com>.

> On July 4, 2017, 1:28 p.m., Peter Bacsko wrote:
> > core/src/main/java/org/apache/oozie/util/db/RetryAttemptCounter.java
> > Lines 41 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768477#file1768477line41>
> >
> >     Let's think about the naming.. we don't just count with this class, we also store state information. If you were not happy with NestedRetryUti, we can call it NestedRetryHandler or anyhing.

Renamed to `RetryAttemptState`, and renamed most of the methods inside. I don't like the words `Util(s)` and `Helper(s)` in variable or class names.


- András


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/#review179575
-----------------------------------------------------------


On July 5, 2017, 5:45 p.m., András Piros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60544/
> -----------------------------------------------------------
> 
> (Updated July 5, 2017, 5:45 p.m.)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/OOZIE-2854
> 
> 
> Diffs
> -----
> 
>   core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
>   core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
>   core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
>   core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
>   core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
>   core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RetryAttemptState.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION 
>   core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
>   core/src/main/resources/oozie-default.xml 5629a897b4a1834554a1a5f7a21984ea857ed133 
>   core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
>   core/src/test/java/org/apache/oozie/command/SkipCommitFaultInjection.java  
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
>   core/src/test/java/org/apache/oozie/util/db/TestOozieDmlStatementPredicate.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestOperationRetryHandler.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestRetryAttemptState.java PRE-CREATION 
>   minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
>   minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
>   minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
>   minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
>   minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION 
>   minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
>   pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 
> 
> 
> Diff: https://reviews.apache.org/r/60544/diff/4/
> 
> 
> Testing
> -------
> 
> Tests covered in code:
> 
> Unit tests
> ==========
> 
> * testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures
> 
> Integration tests
> =================
> 
> * using the `MiniOozieTestCase` framework
> * fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
> * following workflow scenarios:
> * a very simple one consisting only of a `<start/>` and an `<end/>` node
> * a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
> * the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes
> 
> Test cases run:
> ```
> mvn clean test -Dtest=TestOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService,TestRetryAttemptCounter
> ```
> 
> Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.
> 
> 
> Thanks,
> 
> András Piros
> 
>


Re: Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

Posted by Peter Bacsko <pb...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/#review179575
-----------------------------------------------------------




core/src/main/java/org/apache/oozie/service/JPAService.java
Lines 267 (patched)
<https://reviews.apache.org/r/60544/#comment254383>

    Class<?> instead of Class



core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java
Lines 32 (patched)
<https://reviews.apache.org/r/60544/#comment254384>

    The name RETRY_ATTEMPT_COUNTER is misleading. It actually counts the number of nested retry calls.



core/src/main/java/org/apache/oozie/util/db/RetryAttemptCounter.java
Lines 41 (patched)
<https://reviews.apache.org/r/60544/#comment254385>

    Let's think about the naming.. we don't just count with this class, we also store state information. If you were not happy with NestedRetryUti, we can call it NestedRetryHandler or anyhing.



core/src/main/java/org/apache/oozie/util/db/RetryAttemptCounter.java
Lines 62 (patched)
<https://reviews.apache.org/r/60544/#comment254386>

    No need to use synchronized


- Peter Bacsko


On júl. 3, 2017, 3:15 du, András Piros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60544/
> -----------------------------------------------------------
> 
> (Updated júl. 3, 2017, 3:15 du)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/OOZIE-2854
> 
> 
> Diffs
> -----
> 
>   core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
>   core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
>   core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
>   core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
>   core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
>   core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RetryAttemptCounter.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION 
>   core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
>   core/src/main/resources/oozie-default.xml c60a4581a84d4c67a1ac1cf3dfdc252b85ccd01c 
>   core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
>   core/src/test/java/org/apache/oozie/util/db/TestOozieDmlStatementPredicate.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestOperationRetryHandler.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/db/TestRetryAttemptCounter.java PRE-CREATION 
>   minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
>   minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION 
>   minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
>   minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
>   minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
>   minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION 
>   minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
>   pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 
> 
> 
> Diff: https://reviews.apache.org/r/60544/diff/2/
> 
> 
> Testing
> -------
> 
> Tests covered in code:
> 
> Unit tests
> ==========
> 
> * testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures
> 
> Integration tests
> =================
> 
> * using the `MiniOozieTestCase` framework
> * fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
> * following workflow scenarios:
> * a very simple one consisting only of a `<start/>` and an `<end/>` node
> * a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
> * the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes
> 
> Test cases run:
> ```
> mvn clean test -Dtest=TestOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService,TestRetryAttemptCounter
> ```
> 
> Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.
> 
> 
> Thanks,
> 
> András Piros
> 
>


Re: Review Request 60544: OOZIE-2854 Oozie should handle transient database problems

Posted by András Piros <an...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/
-----------------------------------------------------------

(Updated July 3, 2017, 3:15 p.m.)


Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.


Changes
-------

Updated diff to reflect to review comments.


Repository: oozie-git


Description
-------

https://issues.apache.org/jira/browse/OOZIE-2854


Diffs (updated)
-----

  core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48 
  core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324 
  core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146 
  core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2 
  core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec 
  core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/RetryAttemptCounter.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION 
  core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922 
  core/src/main/resources/oozie-default.xml c60a4581a84d4c67a1ac1cf3dfdc252b85ccd01c 
  core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc 
  core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5 
  core/src/test/java/org/apache/oozie/util/db/TestOozieDmlStatementPredicate.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestOperationRetryHandler.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/util/db/TestRetryAttemptCounter.java PRE-CREATION 
  minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc 
  minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION 
  minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION 
  minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd 
  minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9 
  minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a 
  minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION 
  minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de 
  pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214 


Diff: https://reviews.apache.org/r/60544/diff/2/

Changes: https://reviews.apache.org/r/60544/diff/1-2/


Testing (updated)
-------

Tests covered in code:

Unit tests
==========

* testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures

Integration tests
=================

* using the `MiniOozieTestCase` framework
* fixing it so that also asynchronous workflow applications (the ones that use `CallableQueueService`) can be run
* following workflow scenarios:
* a very simple one consisting only of a `<start/>` and an `<end/>` node
* a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
* the ultimate one consisting of a `<decision/>` node, and two branches of an `<fs/>` and an asynchronous `<shell/>` nodes

Test cases run:
```
mvn clean test -Dtest=TestOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService,TestRetryAttemptCounter
```

Functional and stress tests performed on a 4-node MySQL cluster. MySQL daemon has been stopped / killed / restarted several times. Also firewall rules have been modified temporarily to simulate network outages.


Thanks,

András Piros