Posted to jira@kafka.apache.org by "Luke Chen (Jira)" <ji...@apache.org> on 2021/06/04 13:21:00 UTC

[jira] [Created] (KAFKA-12892) InvalidACLException thrown in tests caused jenkins build unstable

Luke Chen created KAFKA-12892:
---------------------------------

             Summary: InvalidACLException thrown in tests caused jenkins build unstable
                 Key: KAFKA-12892
                 URL: https://issues.apache.org/jira/browse/KAFKA-12892
             Project: Kafka
          Issue Type: Bug
            Reporter: Luke Chen
         Attachments: image-2021-06-04-21-05-57-222.png

In KAFKA-12866, we fixed the issue that Kafka required ZK root access even when using a chroot. But since that PR was merged (build #183), the trunk build keeps failing in at least one test group (mostly JDK 15 and Scala 2.13). The build output says nothing useful:
{code:java}
> Task :core:integrationTest FAILED
[2021-06-04T03:19:18.974Z] 
[2021-06-04T03:19:18.974Z] FAILURE: Build failed with an exception.
[2021-06-04T03:19:18.974Z] 
[2021-06-04T03:19:18.974Z] * What went wrong:
[2021-06-04T03:19:18.974Z] Execution failed for task ':core:integrationTest'.
[2021-06-04T03:19:18.974Z] > Process 'Gradle Test Executor 128' finished with non-zero exit value 1
[2021-06-04T03:19:18.974Z]   This problem might be caused by incorrect test process configuration.
[2021-06-04T03:19:18.974Z]   Please refer to the test execution section in the User Manual at https://docs.gradle.org/7.0.2/userguide/java_testing.html#sec:test_execution
{code}
 

After investigation, I found that the tests fail because many `InvalidACLException` errors are thrown during the tests, e.g.:

 
{code:java}
GssapiAuthenticationTest > testServerNotFoundInKerberosDatabase() FAILED
[2021-06-04T02:25:45.419Z]     org.apache.zookeeper.KeeperException$InvalidACLException: KeeperErrorCode = InvalidACL for /config/topics/__consumer_offsets
[2021-06-04T02:25:45.419Z]         at org.apache.zookeeper.KeeperException.create(KeeperException.java:128)
[2021-06-04T02:25:45.419Z]         at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
[2021-06-04T02:25:45.419Z]         at kafka.zookeeper.AsyncResponse.maybeThrow(ZooKeeperClient.scala:583)
[2021-06-04T02:25:45.419Z]         at kafka.zk.KafkaZkClient.createRecursive(KafkaZkClient.scala:1729)
[2021-06-04T02:25:45.419Z]         at kafka.zk.KafkaZkClient.createOrSet$1(KafkaZkClient.scala:366)
[2021-06-04T02:25:45.419Z]         at kafka.zk.KafkaZkClient.setOrCreateEntityConfigs(KafkaZkClient.scala:376)
[2021-06-04T02:25:45.419Z]         at kafka.zk.AdminZkClient.createTopicWithAssignment(AdminZkClient.scala:109)
[2021-06-04T02:25:45.419Z]         at kafka.zk.AdminZkClient.createTopic(AdminZkClient.scala:60)
[2021-06-04T02:25:45.419Z]         at kafka.utils.TestUtils$.$anonfun$createTopic$1(TestUtils.scala:357)
[2021-06-04T02:25:45.419Z]         at kafka.utils.TestUtils$.createTopic(TestUtils.scala:848)
[2021-06-04T02:25:45.419Z]         at kafka.utils.TestUtils$.createOffsetsTopic(TestUtils.scala:428)
[2021-06-04T02:25:45.419Z]         at kafka.api.IntegrationTestHarness.doSetup(IntegrationTestHarness.scala:109)
[2021-06-04T02:25:45.419Z]         at kafka.api.IntegrationTestHarness.setUp(IntegrationTestHarness.scala:84)
[2021-06-04T02:25:45.419Z]         at kafka.server.GssapiAuthenticationTest.setUp(GssapiAuthenticationTest.scala:68)
{code}
 

The log can be found [here|https://ci-builds.apache.org/blue/rest/organizations/jenkins/pipelines/Kafka/pipelines/kafka/branches/trunk/runs/195/nodes/14/steps/145/log/?start=0].

After tracing back, I think it could be because we added a test in KAFKA-12866, testChrootExistsAndRootIsLocked, that locks root access in ZooKeeper, but somehow the root is not unlocked after the test. Also, all the tests that failed with InvalidACLException ran shortly after testChrootExistsAndRootIsLocked. For example, below testChrootExistsAndRootIsLocked completed at 02:24:30, and the failed test above is at 02:25:45 (followed by more than 10 tests with the same InvalidACLException).
{code:java}
[2021-06-04T02:24:29.370Z] ZkClientAclTest > testChrootExistsAndRootIsLocked() STARTED
[2021-06-04T02:24:30.321Z] 
[2021-06-04T02:24:30.321Z] ZkClientAclTest > testChrootExistsAndRootIsLocked() PASSED{code}
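
The timing argument above can be checked directly from the Jenkins log. The following is only a minimal sketch, assuming every log line starts with a bracketed ISO-8601 timestamp as in the excerpts in this report; the class name `LogGap` and its helper methods are hypothetical and not part of Kafka or Jenkins. It computes the gap between the PASSED line of testChrootExistsAndRootIsLocked and the first InvalidACLException line quoted earlier.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the Kafka code base.
class LogGap {
    // Matches the leading "[2021-06-04T02:24:30.321Z]" timestamp of a log line.
    private static final Pattern TS =
        Pattern.compile("^\\[(\\d{4}-\\d{2}-\\d{2}T[\\d:.]+Z)\\]");

    // Extracts the leading timestamp of a log line as an Instant.
    static Instant timestampOf(String logLine) {
        Matcher m = TS.matcher(logLine);
        if (!m.find()) {
            throw new IllegalArgumentException("no timestamp in: " + logLine);
        }
        return Instant.parse(m.group(1));
    }

    // Milliseconds elapsed between two log lines (later minus earlier).
    static long gapMillis(String earlier, String later) {
        return Duration.between(timestampOf(earlier), timestampOf(later)).toMillis();
    }

    public static void main(String[] args) {
        // Both lines are copied from the log excerpts in this report.
        String passed = "[2021-06-04T02:24:30.321Z] ZkClientAclTest > "
            + "testChrootExistsAndRootIsLocked() PASSED";
        String failed = "[2021-06-04T02:25:45.419Z]     "
            + "org.apache.zookeeper.KeeperException$InvalidACLException: "
            + "KeeperErrorCode = InvalidACL for /config/topics/__consumer_offsets";
        System.out.println(gapMillis(passed, failed) + " ms"); // prints "75098 ms"
    }
}
```

So the first InvalidACLException shows up about 75 seconds after testChrootExistsAndRootIsLocked passed. This only shows that the two events are close in time; it does not by itself prove that the test is the cause.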
 

!image-2021-06-04-21-05-57-222.png|width=489,height=1111!

We should investigate further to see how to improve the test so that it does not break the build. Until then, we can disable the test. Thanks.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)