Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/03/25 19:06:05 UTC

[GitHub] [hadoop-ozone] sodonnel commented on issue #719: HDDS-3270. Allow safemode listeners to be notified when some precheck rules pass

sodonnel commented on issue #719: HDDS-3270. Allow safemode listeners to be notified when some precheck rules pass
URL: https://github.com/apache/hadoop-ozone/pull/719#issuecomment-604029510
 
 
   Getting an intermittent unit test failure on:
   
   ```
   [ERROR] Errors: 
   [ERROR]   TestDeadNodeHandler.testOnMessage:156 ? Timeout timeout: after 120000 millis
   ```
   
   However, it passes locally and sometimes passes on GitHub.
   
   The acceptance test failure is:
   
   ```
   Output:  /tmp/smoketest/ozonesecure-mr/result/robot-ozonesecure-mr-ozonesecure-mr-kinit-hadoop-rm.xml
   ==============================================================================
   ozonesecure-mr-mapreduce :: Execute MR jobs                                   
   ==============================================================================
   Execute PI calculation                                                | FAIL |
   1 != 0
   ------------------------------------------------------------------------------
   Execute WordCount                                                     | FAIL |
   1 != 0
   ------------------------------------------------------------------------------
   ozonesecure-mr-mapreduce :: Execute MR jobs                           | FAIL |
   ```
   
   This passes locally. The actual cause seems to be some sort of disk space issue:
   
   ```
   2020-03-25 16:36:06 INFO  Job:1574 - The url to track the job: http://rm:8088/proxy/application_1585154084804_0002/
   2020-03-25 16:36:06 INFO  Job:1619 - Running job: job_1585154084804_0002
   2020-03-25 16:36:08 INFO  Job:1640 - Job job_1585154084804_0002 running in uber mode : false
   2020-03-25 16:36:08 INFO  Job:1647 -  map 0% reduce 0%
   2020-03-25 16:36:09 INFO  Job:1660 - Job job_1585154084804_0002 failed with state FAILED due to: Application application_1585154084804_0002 failed 2 times due to AM Container for appattempt_1585154084804_0002_000002 exited with  exitCode: -1000
   Failing this attempt.Diagnostics: [2020-03-25 16:36:08.025]No space available in any of the local directories.
   For more detailed output, check the application tracking page: http://rm:8088/cluster/app/application_1585154084804_0002 Then click on links to logs of each attempt.
   . Failing the application.
   ```
   
   The IT-Client failure seems intermittent too:
   
   ```
   [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 174.231 s <<< FAILURE! - in org.apache.hadoop.ozone.client.rpc.TestCommitWatcher
   [ERROR] testReleaseBuffers(org.apache.hadoop.ozone.client.rpc.TestCommitWatcher)  Time elapsed: 106.835 s  <<< ERROR!
   java.util.concurrent.ExecutionException: org.apache.ratis.protocol.AlreadyClosedException: SlidingWindow$Client:client-17BDABCAD80C->RAFT is closed.
   	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
   	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
   	at org.apache.hadoop.ozone.client.rpc.TestCommitWatcher.testReleaseBuffers(TestCommitWatcher.java:208)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   ```
