Posted to dev@atlas.apache.org by Ashutosh Mestry <am...@hortonworks.com> on 2018/01/24 23:54:32 UTC

Review Request 65327: Fix: Repository Unit Tests: Occasional Seemingly Random Failures

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65327/
-----------------------------------------------------------

Review request for atlas, Apoorv Naik and Madhan Neethiraj.


Bugs: ATLAS-2378
    https://issues.apache.org/jira/browse/ATLAS-2378


Repository: atlas


Description
-------

**Approach for Analysis**
- Running tests for extended periods did not yield a predictable pattern.
- Exposing exceptions in the setup process helped me progress more quickly.
 
**Analysis & Findings**
- Spurious errors occur during test execution because of Solr, ZooKeeper (ZK), and the connectivity between them.
- _AtlasJanusGraphDatabase.cleanup_ occasionally fails because of an exception during BerkeleyDB shutdown. In short, our clean-up of the temporary folders within the data directory seldom succeeds.
- The way TestNG invokes the tests causes occasional problems during setup.

**Fixes/Updates/Refactoring**
- Added guard conditions to AtlasJanusGraphDatabase.cleanup; this reduced the number of exceptions.
- Exposed exceptions (as opposed to swallowing them). This gives us a good starting point for analysis.
- Added TestNG’s _SkipException_ where we encounter setup failures. This avoids reporting incorrect test results and lets builds proceed.
- Removed the DBSandboxer class, as it caused more problems with DB cleanup.
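
The guard-and-expose idea in the first two items could look roughly like the sketch below. It is illustrative only and is not the actual AtlasJanusGraphDatabase.cleanup code: the class GuardedCleanup and the method deleteIfPresent are hypothetical names, and the sketch assumes the clean-up target is a plain temporary directory on disk.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not the code from the patch.
public final class GuardedCleanup {
    private GuardedCleanup() {}

    /**
     * Deletes a temporary directory tree if it exists.
     * Returns true if the tree was deleted, false if a guard skipped the work.
     * I/O failures are rethrown with context instead of being swallowed.
     */
    public static boolean deleteIfPresent(Path dir) {
        // Guard conditions: nothing to do for a null, missing, or non-directory path.
        if (dir == null || !Files.isDirectory(dir)) {
            return false;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order so children are deleted before their parents.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    // Expose the failure rather than hiding it.
                    throw new UncheckedIOException("Failed to delete " + p, e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException("Failed to walk " + dir, e);
        }
        return true;
    }
}
```

Returning false for a missing directory keeps repeated clean-up calls harmless, while rethrowing I/O failures keeps them visible in the test output.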

**Possible Further Improvement**
- Consolidate the clean-up into a single class.


Diffs
-----

  graphdb/common/src/test/java/org/apache/atlas/graph/GraphSandboxUtil.java b8a9a49a 
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraphDatabase.java f91226b8 
  pom.xml 5c4b4a7b 
  repository/pom.xml bb4d1eb4 
  repository/src/test/java/org/apache/atlas/DBSandboxer.java f4f099a6 
  repository/src/test/java/org/apache/atlas/TestModules.java df299cef 
  repository/src/test/java/org/apache/atlas/query/DSLQueriesTest.java c44eea3b 
  repository/src/test/java/org/apache/atlas/repository/impexp/ImportServiceTestUtils.java 72895125 
  repository/src/test/java/org/apache/atlas/services/MetricsServiceTest.java ca05cbeb 


Diff: https://reviews.apache.org/r/65327/diff/1/


Testing
-------

**Build**
Used the following command line for repeated execution. Ran it for between 2 and 4 hours while monitoring for errors.
```
while sleep 3; do mvn -f repository/pom.xml clean install | grep -v "INFO\|WARNING\|Running"; done
```


Thanks,

Ashutosh Mestry


Re: Review Request 65327: Fix: Repository Unit Tests: Occasional Seemingly Random Failures

Posted by Sarath Subramanian <sa...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65327/#review196186
-----------------------------------------------------------




repository/src/test/java/org/apache/atlas/query/DSLQueriesTest.java
Lines 97 (patched)
<https://reviews.apache.org/r/65327/#comment275731>

    This waitOrBailout looks incorrect: for a failed DSL query it sleeps and moves on to the next query. Shouldn't it retry the same query after the sleep? Or, if it moves on to the next query, why is the sleep needed? Please review.
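
A minimal sketch of the retry-the-same-query behaviour asked about above, assuming a bounded number of attempts with a fixed sleep between them. RetrySameOperation and its parameters are hypothetical names; they are not part of the patch or of waitOrBailout.

```java
import java.util.concurrent.Callable;

// Hypothetical helper illustrating "retry the same operation after a sleep".
public final class RetrySameOperation {
    private RetrySameOperation() {}

    /**
     * Calls the operation until it succeeds or maxAttempts is reached,
     * sleeping between attempts. The last failure is rethrown.
     */
    public static <T> T retry(Callable<T> operation, int maxAttempts, long sleepMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // Same operation every time; we do not move on to a different one.
                return operation.call();
            } catch (Exception e) {
                last = e;
                // Sleep only if another attempt will follow.
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMillis);
                }
            }
        }
        throw last;
    }
}
```

With this shape, the sleep has a clear purpose: it gives the backing services time to settle before the same query is tried again.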



repository/src/test/java/org/apache/atlas/services/MetricsServiceTest.java
Lines 72 (patched)
<https://reviews.apache.org/r/65327/#comment275729>

    Line 72 is the same as line 71; please delete the duplicate.


- Sarath Subramanian




Re: Review Request 65327: Fix: Repository Unit Tests: Occasional Seemingly Random Failures

Posted by Madhan Neethiraj <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65327/#review196356
-----------------------------------------------------------


Ship it!

- Madhan Neethiraj




Re: Review Request 65327: Fix: Repository Unit Tests: Occasional Seemingly Random Failures

Posted by Ashutosh Mestry <am...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65327/
-----------------------------------------------------------

(Updated Jan. 26, 2018, 1:18 a.m.)


Review request for atlas, Apoorv Naik and Madhan Neethiraj.


Changes
-------

Updates include:
- New class _SetupTestVerifier_ added.
- Updated _DSLQueriesTest_ to use the new verifier class.
- Addressed review comments.


Bugs: ATLAS-2378
    https://issues.apache.org/jira/browse/ATLAS-2378


Repository: atlas


Description
-------

**Approach for Analysis**
- Running tests for extended periods did not yield a predictable pattern.
- Exposing exceptions in the setup process helped me progress more quickly.
 
**Analysis & Findings**
- Spurious errors occur during test execution because of Solr, ZooKeeper (ZK), and the connectivity between them.
- _AtlasJanusGraphDatabase.cleanup_ occasionally fails because of an exception during BerkeleyDB shutdown. In short, our clean-up of the temporary folders within the data directory seldom succeeds.
- The way TestNG invokes the tests causes occasional problems during setup.

**Fixes/Updates/Refactoring**
- Added guard conditions to AtlasJanusGraphDatabase.cleanup; this reduced the number of exceptions.
- Exposed exceptions (as opposed to swallowing them). This gives us a good starting point for analysis.
- Added TestNG’s _SkipException_ where we encounter setup failures. This avoids reporting incorrect test results and lets builds proceed.
- Removed the DBSandboxer class, as it caused more problems with DB cleanup.

**Possible Further Improvement**
- Consolidate the clean-up into a single class.


Diffs (updated)
-----

  graphdb/common/src/test/java/org/apache/atlas/graph/GraphSandboxUtil.java b8a9a49a 
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraphDatabase.java f91226b8 
  pom.xml 5c4b4a7b 
  repository/pom.xml bb4d1eb4 
  repository/src/test/java/org/apache/atlas/DBSandboxer.java f4f099a6 
  repository/src/test/java/org/apache/atlas/TestModules.java df299cef 
  repository/src/test/java/org/apache/atlas/query/DSLQueriesTest.java c44eea3b 
  repository/src/test/java/org/apache/atlas/repository/impexp/ImportServiceTestUtils.java 72895125 
  repository/src/test/java/org/apache/atlas/services/MetricsServiceTest.java ca05cbeb 


Diff: https://reviews.apache.org/r/65327/diff/2/

Changes: https://reviews.apache.org/r/65327/diff/1-2/


Testing
-------

**Build**
Used the following command line for repeated execution. Ran it for between 2 and 4 hours while monitoring for errors.
```
while sleep 3; do mvn -f repository/pom.xml clean install | grep -v "INFO\|WARNING\|Running"; done
```


Thanks,

Ashutosh Mestry