Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2019/08/27 18:09:51 UTC

[GitHub] [hadoop] adoroszlai opened a new pull request #1358: HDDS-2045. Partially started compose cluster left running

URL: https://github.com/apache/hadoop/pull/1358
 
 
   ## What changes were proposed in this pull request?
   
   If any container in the sample cluster [fails to start](https://github.com/elek/ozone-ci/blob/5c64f77f3ab64aed0826d8f40991fe621f843efd/pr/pr-hdds-2026-p4f6m/acceptance/output.log#L24), all successfully started containers are left running.  This [prevents](https://github.com/elek/ozone-ci/blob/5c64f77f3ab64aed0826d8f40991fe621f843efd/pr/pr-hdds-2026-p4f6m/acceptance/output.log#L59) any further acceptance tests from completing normally.  This is only a minor inconvenience, since the acceptance test run as a whole fails either way.
   
   This change makes sure the cluster is stopped if startup fails.
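
   A minimal sketch of the intended control flow, not the actual patch: `start_docker_env` and `wait_for_datanodes` are the functions named in this description, while `stop_docker_env`, `run_cluster_tests`, `CLUSTER_STATE` and the `FAKE_*_FAILURE` variables are hypothetical names introduced only for illustration.  The compose calls are replaced by stubs so the logic can be run without Docker.

   ```shell
   #!/usr/bin/env bash
   # Sketch only: stubs stand in for the real compose commands.
   # stop_docker_env, run_cluster_tests, CLUSTER_STATE and FAKE_*_FAILURE
   # are assumptions for illustration, not names taken from the patch.

   CLUSTER_STATE="down"

   start_docker_env() {
     # Real script: bring the compose cluster up.
     # The state is set before returning, so a failure here models a
     # partially started cluster.
     CLUSTER_STATE="up"
     return "${FAKE_START_FAILURE:-0}"
   }

   wait_for_datanodes() {
     # Real script: poll until the datanodes are registered to the SCM.
     return "${FAKE_WAIT_FAILURE:-0}"
   }

   stop_docker_env() {
     # Real script: stop and remove the containers and the network.
     CLUSTER_STATE="down"
   }

   run_cluster_tests() {
     # Stop the cluster if either startup step fails, so that a partially
     # started cluster is never left running.
     if ! { start_docker_env && wait_for_datanodes; }; then
       stop_docker_env
       return 1
     fi
     echo "cluster is ready, tests would run here"
     stop_docker_env
   }
   ```

   Setting `FAKE_WAIT_FAILURE=1` before calling `run_cluster_tests` in this sketch simulates a failed startup: the function returns non-zero and `CLUSTER_STATE` ends up as `down`.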
   
   https://issues.apache.org/jira/browse/HDDS-2045
   
   ## How was this patch tested?
   
   Temporarily added fake failures in `start_docker_env` and `wait_for_datanodes`, and verified that the cluster is stopped:
   
   ```
   $ ./test.sh
   Removing network ozone_default
   WARNING: Network ozone_default not found.
   Creating network "ozone_default" with the default driver
   Creating ozone_scm_1      ... done
   Creating ozone_datanode_1 ... done
   Creating ozone_datanode_2 ... done
   Creating ozone_datanode_3 ... done
   Creating ozone_om_1       ... done
   0 datanode is up and healthy (until now)
   Stopping ozone_datanode_1 ... done
   Stopping ozone_datanode_3 ... done
   Stopping ozone_om_1       ... done
   Stopping ozone_datanode_2 ... done
   Stopping ozone_scm_1      ... done
   Removing ozone_datanode_1 ... done
   Removing ozone_datanode_3 ... done
   Removing ozone_om_1       ... done
   Removing ozone_datanode_2 ... done
   Removing ozone_scm_1      ... done
   Removing network ozone_default
   ```
   
   Verified that the test succeeds without the fake failure.
   
   ```
   $ ./test.sh
   Removing network ozone_default
   WARNING: Network ozone_default not found.
   Creating network "ozone_default" with the default driver
   Creating ozone_scm_1      ... done
   Creating ozone_om_1       ... done
   Creating ozone_datanode_1 ... done
   Creating ozone_datanode_2 ... done
   Creating ozone_datanode_3 ... done
   0 datanode is up and healthy (until now)
   3 datanodes are up and registered to the scm
   ==============================================================================
   ozone-auditparser
   ==============================================================================
   ozone-auditparser.Auditparser :: Smoketest ozone cluster startup
   ==============================================================================
   Initiating freon to generate data                                     | PASS |
   ------------------------------------------------------------------------------
   Testing audit parser                                                  | PASS |
   ------------------------------------------------------------------------------
   ozone-auditparser.Auditparser :: Smoketest ozone cluster startup      | PASS |
   2 critical tests, 2 passed, 0 failed
   2 tests total, 2 passed, 0 failed
   ==============================================================================
   ozone-auditparser                                                     | PASS |
   2 critical tests, 2 passed, 0 failed
   2 tests total, 2 passed, 0 failed
   ==============================================================================
   Output:  /tmp/smoketest/ozone/result/robot-ozone-ozone-auditparser-om.xml
   ==============================================================================
   ozone-basic :: Smoketest ozone cluster startup
   ==============================================================================
   Check webui static resources                                          | PASS |
   ------------------------------------------------------------------------------
   Start freon testing                                                   | PASS |
   ------------------------------------------------------------------------------
   ozone-basic :: Smoketest ozone cluster startup                        | PASS |
   2 critical tests, 2 passed, 0 failed
   2 tests total, 2 passed, 0 failed
   ==============================================================================
   Output:  /tmp/smoketest/ozone/result/robot-ozone-ozone-basic-scm.xml
   Stopping ozone_datanode_1 ... done
   Stopping ozone_datanode_3 ... done
   Stopping ozone_datanode_2 ... done
   Stopping ozone_om_1       ... done
   Stopping ozone_scm_1      ... done
   Removing ozone_datanode_1 ... done
   Removing ozone_datanode_3 ... done
   Removing ozone_datanode_2 ... done
   Removing ozone_om_1       ... done
   Removing ozone_scm_1      ... done
   Removing network ozone_default
   ```

