Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/03/01 11:40:22 UTC

[GitHub] [pulsar] lhotari opened a new pull request #14509: [Build/Tests] Make the test JVM exit if OOME occurs

lhotari opened a new pull request #14509:
URL: https://github.com/apache/pulsar/pull/14509


   ### Motivation
   
   - OOMEs can make the build take a very long time to complete, or hang it outright.
     It's better to fail fast in tests when an OOME occurs.
   
   - example: https://github.com/apache/pulsar/runs/5371747143?check_suite_focus=true
     - [thread dump](https://jstack.review/?https://gist.github.com/lhotari/63f57f4ce94cc23caaa7833587210971#tda_1_sync_0x00000000c3affce0) 
     - It looks like the ZooKeeper client got into a bad state because of the OOME, but I'm not sure about this.
   
   ### Modifications
   
   - Add `-XX:+ExitOnOutOfMemoryError` to the test JVM arguments.
   - Disable `ClientDeduplicationFailureTest.testClientDeduplicationCorrectnessWithFailure` completely, since it leads to OOMEs under certain conditions.
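
One way to confirm that the flag actually reached a forked test JVM is to inspect the runtime's input arguments. The following is only a minimal sketch: `OomeFlagCheck` and `exitsOnOome` are hypothetical names that are not part of the Pulsar code base, and the "last occurrence wins" handling of repeated flags is an assumption about HotSpot's behavior.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper (not part of Pulsar): reports whether a list of JVM
// arguments enables ExitOnOutOfMemoryError.
final class OomeFlagCheck {
    static final String ENABLED = "-XX:+ExitOnOutOfMemoryError";
    static final String DISABLED = "-XX:-ExitOnOutOfMemoryError";

    // Assumption: when a boolean -XX flag is repeated, the last occurrence
    // takes effect, so later arguments override earlier ones here.
    static boolean exitsOnOome(List<String> jvmArgs) {
        boolean enabled = false;
        for (String arg : jvmArgs) {
            if (ENABLED.equals(arg)) {
                enabled = true;
            } else if (DISABLED.equals(arg)) {
                enabled = false;
            }
        }
        return enabled;
    }

    public static void main(String[] args) {
        // Arguments the current JVM was started with (excludes main-method args).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("ExitOnOutOfMemoryError enabled: " + exitsOnOome(jvmArgs));
    }
}
```

Running `main` inside the test JVM prints whether the option was passed on its command line. The check only looks at the argument strings, so a flag supplied by some other mechanism may not be detected.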


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari merged pull request #14509: [Build/Tests] Make the test JVM exit if OOME occurs

Posted by GitBox <gi...@apache.org>.
lhotari merged pull request #14509:
URL: https://github.com/apache/pulsar/pull/14509


   





[GitHub] [pulsar] lhotari commented on pull request #14509: [Build/Tests] Make the test JVM exit if OOME occurs

Posted by GitBox <gi...@apache.org>.
lhotari commented on pull request #14509:
URL: https://github.com/apache/pulsar/pull/14509#issuecomment-1055488552


   > What happens in case of an OOM with this flag? Is surefire reporting the name of the broker test? Otherwise it will be very hard to understand why CI failed.
   
   The OOME does get logged in the surefire report files. IIRC, the exit code was 3 for the terminated test JVM.
   It's actually even harder to understand why CI failed when the JVM keeps running, since an OOME can leave some components in a broken state.





[GitHub] [pulsar] michaeljmarshall commented on pull request #14509: [Build/Tests] Make the test JVM exit if OOME occurs

Posted by GitBox <gi...@apache.org>.
michaeljmarshall commented on pull request #14509:
URL: https://github.com/apache/pulsar/pull/14509#issuecomment-1055690856


   Just created an issue to track the fact that we disabled a test: https://github.com/apache/pulsar/issues/14523.





[GitHub] [pulsar] lhotari commented on pull request #14509: [Build/Tests] Make the test JVM exit if OOME occurs

Posted by GitBox <gi...@apache.org>.
lhotari commented on pull request #14509:
URL: https://github.com/apache/pulsar/pull/14509#issuecomment-1055886029


   Here's an explanation and a fix for some of the recent OOMEs in tests: #14524.





[GitHub] [pulsar] lhotari edited a comment on pull request #14509: [Build/Tests] Make the test JVM exit if OOME occurs

Posted by GitBox <gi...@apache.org>.
lhotari edited a comment on pull request #14509:
URL: https://github.com/apache/pulsar/pull/14509#issuecomment-1055488552


   > What happens in case of an OOM with this flag? Is surefire reporting the name of the broker test? Otherwise it will be very hard to understand why CI failed.
   
   The OOME does get logged in the surefire report files. IIRC, the exit code was 3 for the terminated test JVM. This is visible in the console output.
   It's actually even harder to understand why CI failed when the JVM keeps running, since an OOME can leave some components in a broken state.

