Posted to dev@kafka.apache.org by "Tony Reix (JIRA)" <ji...@apache.org> on 2015/02/20 09:54:11 UTC

[jira] [Created] (KAFKA-1970) Several tests are not stable

Tony Reix created KAFKA-1970:
--------------------------------

             Summary: Several tests are not stable
                 Key: KAFKA-1970
                 URL: https://issues.apache.org/jira/browse/KAFKA-1970
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 0.8.1
         Environment: Several:
- RHEL7.1/x86_64  <=== My reference
- RHEL7.0/x86_64
- Ubuntu /x86_64
- RHEL7.1/PPC64LE
- RHEL7.0/PPC64BE
- OpenJDK 1.7
- IBM JVM 1.7
            Reporter: Tony Reix


I'm porting Kafka 0.8.1 to RHEL 7.1/PPC64LE.
Since the tests looked unstable, I've run them on several environments in order to get a wider view.

I'm using:
 -  ./gradlew build -x signArchives
or:
 -  ./gradlew test -x signArchives


Results seem to show:
  - Tests are unstable everywhere (very few on your Ubuntu/x86_64 test env)
  - IBM JVM shows some more issues than OpenJDK
  - Sometimes, tests are not launched, for no apparent reason.
        But not on my reference environment (RHEL7.1/x86_64/OpenJDK)


- OpenJDK :                          Test run results:
  - dorado-vm2 - RHEL7.0/x86_64 :
                                - 238 tests completed, 82 failed
                                - 238 tests completed, 94 failed
                                - BUILD SUCCESSFUL
  - dorado-vm3 - Ubuntu /x86_64 :
                                - BUILD SUCCESSFUL
  - soe01x     - RHEL7.1/x86_64 :
                                - 238 tests completed, 4 failed  x 2 times
  - soe07-vm1  - RHEL7.1/PPC64LE:
                                - BUILD SUCCESSFUL
                                - 238 tests completed, 2 failed
                                - 238 tests completed, 3 failed


- IBM JVM :                         Test run results:
  - dorado-vm2 - RHEL7.0/x86_64 :
                                - 1 failed + Tests Blocked
                                - BUILD SUCCESSFUL
  - soe01x     - RHEL7.1/x86_64 :
                                - 238 tests completed, 6 failed
                                - 238 tests completed, 4 failed  x 3 times
                                - 238 tests completed, 5 failed
  - soe07-vm1  - RHEL7.1/PPC64LE:
                                - 238 tests completed, 1 failed
                                - BUILD SUCCESSFUL
  - laurel6    - RHEL7.0/PPC64BE:
                                - 238 tests completed, 1 failed
                                - BUILD SUCCESSFUL

=========================================================

I think that these tests are unstable:

kafka.server.LogRecoveryTest      > testHWCheckpointNoFailuresMultipleLogSegments
kafka.server.LogRecoveryTest      > testHWCheckpointWithFailuresMultipleLogSegments
kafka.admin.DeleteTopicTest       > testAutoCreateAfterDeleteTopic
kafka.admin.DeleteTopicTest       > testPreferredReplicaElectionDuringDeleteTopic
kafka.server.RequestPurgatoryTest > testRequestExpiry
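
A list like this one could be derived from saved test logs with a small helper. This is only a sketch: it assumes that each failing test appears in the Gradle output as a line of the form "<class> > <test> FAILED", and the function name failure_histogram is made up for illustration.

```shell
# failure_histogram: read test output on stdin and print, for each test
# reported as FAILED, how many times it failed (most frequent first).
# Assumption: failing tests appear as "<class> > <test> FAILED" lines.
failure_histogram() {
    grep ' FAILED' |            # keep only the lines reporting a failure
        sed 's/ FAILED.*$//' |  # drop the status, keep "<class> > <test>"
        sort |
        uniq -c |               # count identical test names
        sort -rn                # highest failure count first
}
```

For example, with result files named as further down in this report: cat gradlew.build.IBMJVM.res* | failure_histogram
Tests that fail in only some of the runs are the candidates for "unstable".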

=========================================================

These tests fail often (always on my reference environment (RHEL7.1/x86_64/OpenJDK), but not on Ubuntu):
kafka.server.LogOffsetTest        > testEmptyLogsGetOffsets
kafka.server.LogOffsetTest        > testGetOffsetsBeforeLatestTime
kafka.server.LogOffsetTest        > testGetOffsetsBeforeEarliestTime
kafka.server.LogOffsetTest        > testGetOffsetsBeforeNow 

=========================================================

As an example of random failures, on my reference environment (RHEL7.1/x86_64/OpenJDK), the test:
  kafka.server.LogRecoveryTest > testHWCheckpointNoFailuresMultipleLogSegments
failed 2 times out of 12.

=========================================================

On Ubuntu/x86_64/OpenJDK, out of 3 runs of:
   ./gradlew test -x signArchives
I've got:
 - 3 Full success
 - 1 launch that did NOT run the tests

=========================================================

Still on x86_64/OpenJDK, I'm surprised to always get 4 failures with RHEL 7.1 and none on Ubuntu.
Is there some issue with RHEL 7.1 and/or Java?

=========================================================

On RHEL 7.1 / PPC64LE / IBM JVM, I see wide instability.

I've run the tests 12 times.
Counting the "FAILED" tests in each result file with:
 for i in 1 2 10 11 12 13 14 15 16 17 18 19; do N=`grep FAILED gradlew.build.IBMJVM.res$i | wc -l`; echo $i":"$N; done
gave:

1:3
2:0
10:0
11:3
12:4
13:49
14:3
15:4
16:0
17:0
18:0
19:0

Doing the same for "PASSED" tests, I got:
1:372
2:238
10:0
11:372
12:236
13:191
14:237
15:236
16:238
17:0
18:0
19:0

Showing:
 - 4 launches did NOT run the tests
 - 2 launches were SUCCESSFUL
 - for the others, there were 3, 4 or 49 FAILED tests
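
The classification above can be sketched as a small script. This is an illustration only: the function name classify_run is made up, and it assumes that a result file with neither "FAILED" nor "PASSED" lines means the tests were not launched.

```shell
# classify_run FAILED_COUNT PASSED_COUNT
# Assumption: zero FAILED and zero PASSED lines means the tests did not run.
classify_run() {
    failed=$1
    passed=$2
    if [ "$failed" -eq 0 ] && [ "$passed" -eq 0 ]; then
        echo "not run"
    elif [ "$failed" -eq 0 ]; then
        echo "success"
    else
        echo "failed"
    fi
}

# Classify every result file that exists in the current directory.
for i in 1 2 10 11 12 13 14 15 16 17 18 19; do
    f=gradlew.build.IBMJVM.res$i
    [ -f "$f" ] || continue             # skip missing result files
    failed=$(grep -c FAILED "$f")       # number of lines containing FAILED
    passed=$(grep -c PASSED "$f")       # number of lines containing PASSED
    echo "$i: $(classify_run "$failed" "$passed") (failed=$failed, passed=$passed)"
done
```

Applied to the counts listed above, runs 10, 17, 18 and 19 come out as "not run", and runs 2 and 16 as "success".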

=========================================================

Conclusions: I think that it would be useful for the Kafka project:
- to run tests with the IBM JVM in addition to OpenJDK.
- to run tests on a Linux distribution other than Ubuntu: RHEL.
- to check (by running it many times) that the following test is stable in your standard test environment:
   kafka.server.LogRecoveryTest > testHWCheckpointNoFailuresMultipleLogSegments
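
One possible way to do the "many times" check is a small wrapper that repeats a command and counts the failing runs. This is a rough sketch, not a tested procedure: count_failures is a made-up helper name, and it assumes that the command exits with a non-zero status when a test fails.

```shell
# count_failures N CMD [ARGS...]
# Run CMD N times and print how many runs ended with a non-zero exit status.
# Assumption: the command's exit status reflects test failures.
count_failures() {
    runs=$1
    shift
    failures=0
    n=0
    while [ "$n" -lt "$runs" ]; do
        # Discard the output; only the exit status is of interest here.
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        n=$((n + 1))
    done
    echo "$failures"
}
```

For example: count_failures 12 ./gradlew test -x signArchives
This repeats the whole suite. To repeat only the test named above, the command would need whatever single-test selection the build supports, which is not shown here.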

On my side, there are other causes of instability in my specific environments (PPC64) that I have to study.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)