You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/04/21 15:22:02 UTC

[GitHub] [pulsar] lhotari opened a new pull request #10309: [Tests] Fix cpp build flakiness by enabling gtest retries, also improve logging in CI

lhotari opened a new pull request #10309:
URL: https://github.com/apache/pulsar/pull/10309


   ### Motivation
   
   The cpp/python build is very flaky in CI. It's hard to investigate the problem when there isn't much log output or the output is hard to access. 
   
   ### Modifications
   
   - enable gtest retries by passing `--retry_failed=1` parameter
   - adjust number of parallel workers based on the number of CPU cores available. Use a formula of `min(2 * number of cores, 10)`
   - capture test logs to files and upload the archive so that it can be downloaded as a zip file if the build fails
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on pull request #10309: [Tests] Fix cpp build flakiness by enabling gtest retries, also improve logging in CI

Posted by GitBox <gi...@apache.org>.
lhotari commented on pull request #10309:
URL: https://github.com/apache/pulsar/pull/10309#issuecomment-824226669


   /pulsarbot run-failure-checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on pull request #10309: [Tests] Fix cpp build flakiness by enabling gtest retries, also improve logging in CI

Posted by GitBox <gi...@apache.org>.
lhotari commented on pull request #10309:
URL: https://github.com/apache/pulsar/pull/10309#issuecomment-824150049


   @merlimat @aahmed-se please review


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] BewareMyPower commented on a change in pull request #10309: [Tests] Fix cpp build flakiness by enabling gtest retries, also improve logging in CI

Posted by GitBox <gi...@apache.org>.
BewareMyPower commented on a change in pull request #10309:
URL: https://github.com/apache/pulsar/pull/10309#discussion_r683399213



##########
File path: pulsar-client-cpp/docker-tests.sh
##########
@@ -58,4 +60,26 @@ done
 
 # Start 2 Pulsar standalone instances (one with TLS and one without)
 # and execute the tests
-$DOCKER_CMD bash -c "cd /pulsar/pulsar-client-cpp && ./run-unit-tests.sh ${tests}"
+set +e
+DISABLE_COLOR_OUTPUT=""
+if [ "$GTEST_COLOR" = "no" ]; then
+  DISABLE_COLOR_OUTPUT="| cat"
+fi
+$DOCKER_CMD bash -c "cd /pulsar/pulsar-client-cpp && ./run-unit-tests.sh ${tests} $DISABLE_COLOR_OUTPUT"
+RES=$?
+if [ $RES -ne 0 ]; then

Review comment:
       This check would never success. The error code returned by `$?` is the docker container's exit code, not the command's exit code. Even if `run-unit-tests.sh` exited with a non-zero code, the `RES` would still and always be 0.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on a change in pull request #10309: [Tests] Fix cpp build flakiness by enabling gtest retries, also improve logging in CI

Posted by GitBox <gi...@apache.org>.
lhotari commented on a change in pull request #10309:
URL: https://github.com/apache/pulsar/pull/10309#discussion_r683715524



##########
File path: pulsar-client-cpp/docker-tests.sh
##########
@@ -58,4 +60,26 @@ done
 
 # Start 2 Pulsar standalone instances (one with TLS and one without)
 # and execute the tests
-$DOCKER_CMD bash -c "cd /pulsar/pulsar-client-cpp && ./run-unit-tests.sh ${tests}"
+set +e
+DISABLE_COLOR_OUTPUT=""
+if [ "$GTEST_COLOR" = "no" ]; then
+  DISABLE_COLOR_OUTPUT="| cat"
+fi
+$DOCKER_CMD bash -c "cd /pulsar/pulsar-client-cpp && ./run-unit-tests.sh ${tests} $DISABLE_COLOR_OUTPUT"

Review comment:
       @BewareMyPower yes that's true that `| cat` causes the issue. it would require `set -o pipefail;` for the exit code to propagate when there's `| cat` . 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] BewareMyPower commented on a change in pull request #10309: [Tests] Fix cpp build flakiness by enabling gtest retries, also improve logging in CI

Posted by GitBox <gi...@apache.org>.
BewareMyPower commented on a change in pull request #10309:
URL: https://github.com/apache/pulsar/pull/10309#discussion_r683556882



##########
File path: pulsar-client-cpp/docker-tests.sh
##########
@@ -58,4 +60,26 @@ done
 
 # Start 2 Pulsar standalone instances (one with TLS and one without)
 # and execute the tests
-$DOCKER_CMD bash -c "cd /pulsar/pulsar-client-cpp && ./run-unit-tests.sh ${tests}"
+set +e
+DISABLE_COLOR_OUTPUT=""
+if [ "$GTEST_COLOR" = "no" ]; then
+  DISABLE_COLOR_OUTPUT="| cat"
+fi
+$DOCKER_CMD bash -c "cd /pulsar/pulsar-client-cpp && ./run-unit-tests.sh ${tests} $DISABLE_COLOR_OUTPUT"
+RES=$?
+if [ $RES -ne 0 ]; then

Review comment:
       Yeah, it's a little weird that when I run the `docker run` with `run-unit-tests.sh` that failed, then print `echo $?`, it printed `1`. However, when I run `docker-tests.sh` and added a `echo $RES` after line 69, the tests failed but it printed `0`. I'll continue to dig into this issue.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on a change in pull request #10309: [Tests] Fix cpp build flakiness by enabling gtest retries, also improve logging in CI

Posted by GitBox <gi...@apache.org>.
lhotari commented on a change in pull request #10309:
URL: https://github.com/apache/pulsar/pull/10309#discussion_r683543682



##########
File path: pulsar-client-cpp/docker-tests.sh
##########
@@ -58,4 +60,26 @@ done
 
 # Start 2 Pulsar standalone instances (one with TLS and one without)
 # and execute the tests
-$DOCKER_CMD bash -c "cd /pulsar/pulsar-client-cpp && ./run-unit-tests.sh ${tests}"
+set +e
+DISABLE_COLOR_OUTPUT=""
+if [ "$GTEST_COLOR" = "no" ]; then
+  DISABLE_COLOR_OUTPUT="| cat"
+fi
+$DOCKER_CMD bash -c "cd /pulsar/pulsar-client-cpp && ./run-unit-tests.sh ${tests} $DISABLE_COLOR_OUTPUT"
+RES=$?
+if [ $RES -ne 0 ]; then

Review comment:
       Based on documentation, docker run will return the exit code. Docs: https://docs.docker.com/engine/reference/run/#exit-status




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on a change in pull request #10309: [Tests] Fix cpp build flakiness by enabling gtest retries, also improve logging in CI

Posted by GitBox <gi...@apache.org>.
lhotari commented on a change in pull request #10309:
URL: https://github.com/apache/pulsar/pull/10309#discussion_r683722629



##########
File path: pulsar-client-cpp/docker-tests.sh
##########
@@ -58,4 +60,26 @@ done
 
 # Start 2 Pulsar standalone instances (one with TLS and one without)
 # and execute the tests
-$DOCKER_CMD bash -c "cd /pulsar/pulsar-client-cpp && ./run-unit-tests.sh ${tests}"
+set +e
+DISABLE_COLOR_OUTPUT=""
+if [ "$GTEST_COLOR" = "no" ]; then
+  DISABLE_COLOR_OUTPUT="| cat"
+fi
+$DOCKER_CMD bash -c "cd /pulsar/pulsar-client-cpp && ./run-unit-tests.sh ${tests} $DISABLE_COLOR_OUTPUT"

Review comment:
       fixed by PR #11575




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] BewareMyPower commented on a change in pull request #10309: [Tests] Fix cpp build flakiness by enabling gtest retries, also improve logging in CI

Posted by GitBox <gi...@apache.org>.
BewareMyPower commented on a change in pull request #10309:
URL: https://github.com/apache/pulsar/pull/10309#discussion_r683626563



##########
File path: pulsar-client-cpp/docker-tests.sh
##########
@@ -58,4 +60,26 @@ done
 
 # Start 2 Pulsar standalone instances (one with TLS and one without)
 # and execute the tests
-$DOCKER_CMD bash -c "cd /pulsar/pulsar-client-cpp && ./run-unit-tests.sh ${tests}"
+set +e
+DISABLE_COLOR_OUTPUT=""
+if [ "$GTEST_COLOR" = "no" ]; then
+  DISABLE_COLOR_OUTPUT="| cat"
+fi
+$DOCKER_CMD bash -c "cd /pulsar/pulsar-client-cpp && ./run-unit-tests.sh ${tests} $DISABLE_COLOR_OUTPUT"

Review comment:
       The real cause of broken CI is here. `DISABLE_COLOR_OUTPUT` is `| cat` by default, then the command becomes
   
   ```bash
   $DOCKER_CMD bash -c "cd /pulsar/pulsar-client-cpp && ./run-unit-tests.sh ${tests} | cat"
   ```
   
   The exit code is what `cat` exits, so it's always 0. However, if you changed it to
   
   ```bash
   $DOCKER_CMD bash -c "cd /pulsar/pulsar-client-cpp && ./run-unit-tests.sh ${tests}"
   ```
   
   The exit code would become 1, which is what `./run-unit-tests.sh` exits.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] aahmed-se merged pull request #10309: [Tests] Fix cpp build flakiness by enabling gtest retries, also improve logging in CI

Posted by GitBox <gi...@apache.org>.
aahmed-se merged pull request #10309:
URL: https://github.com/apache/pulsar/pull/10309


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org