Posted to jira@kafka.apache.org by "Divij Vaidya (Jira)" <ji...@apache.org> on 2022/06/10 08:56:00 UTC
[jira] [Comment Edited] (KAFKA-13943) Fix flaky test QuorumControllerTest.testMissingInMemorySnapshot()
[ https://issues.apache.org/jira/browse/KAFKA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17552633#comment-17552633 ]
Divij Vaidya edited comment on KAFKA-13943 at 6/10/22 8:55 AM:
---------------------------------------------------------------
I have fixed the bug that was creating a snapshot with epoch LONG_MAX (9223372036854775807) in [https://github.com/apache/kafka/pull/12224]
Also note that other tests, such as QuorumControllerTest.testSnapshotOnlyAfterConfiguredMinBytes, are failing due to the same bug:
{noformat}
Error Message
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.UnknownServerException: java.lang.RuntimeException: Can't create a new snapshot at epoch 1 because there is already a snapshot with epoch 9223372036854775807
Stacktrace
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.UnknownServerException: java.lang.RuntimeException: Can't create a new snapshot at epoch 1 because there is already a snapshot with epoch 9223372036854775807
at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)
at org.apache.kafka.controller.QuorumControllerTest.testSnapshotOnlyAfterConfiguredMinBytes(QuorumControllerTest.java:691)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725)
at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
at org.junit.jupiter.engine.extension.TimeoutInvocation.proceed(TimeoutInvocation.java:46)
at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149)
at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140)
at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:84)
at org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115)
at org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105)
at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(
{noformat}
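To illustrate why a single snapshot at LONG_MAX breaks every later write, here is a minimal sketch of a registry that only accepts non-decreasing snapshot epochs. It is a simplified, hypothetical model, not the actual org.apache.kafka.timeline.SnapshotRegistry code: the class name MonotonicEpochSketch, the nested Registry class, and the getOrCreate method are invented for illustration, and only the error message text is taken from the stack trace above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified, hypothetical model of a registry that only accepts
// non-decreasing snapshot epochs. Not Kafka's actual implementation.
public class MonotonicEpochSketch {

    static final class Registry {
        // Epochs of the snapshots created so far, oldest first.
        private final Deque<Long> epochs = new ArrayDeque<>();

        // Returns the epoch of the existing or newly created snapshot.
        long getOrCreate(long epoch) {
            Long last = epochs.peekLast();
            if (last != null && last > epoch) {
                // Same message shape as the failure reported in the test.
                throw new RuntimeException("Can't create a new snapshot at epoch " + epoch
                        + " because there is already a snapshot with epoch " + last);
            }
            if (last == null || last < epoch) {
                epochs.addLast(epoch);
            }
            return epoch;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.getOrCreate(-1L);             // mirrors "Creating snapshot -1"
        registry.getOrCreate(Long.MAX_VALUE);  // mirrors "Creating snapshot 9223372036854775807"
        try {
            registry.getOrCreate(1L);          // any realistic epoch is now rejected
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this model, once Long.MAX_VALUE has been registered no smaller epoch can follow it, which matches the pattern in the failing tests: the request at epoch 1 is rejected.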
was (Author: divijvaidya):
I have fixed the bug which was causing a snapshot with LONG_MAX at [https://github.com/apache/kafka/pull/12224]
Also note that
> Fix flaky test QuorumControllerTest.testMissingInMemorySnapshot()
> -----------------------------------------------------------------
>
> Key: KAFKA-13943
> URL: https://issues.apache.org/jira/browse/KAFKA-13943
> Project: Kafka
> Issue Type: Test
> Components: unit tests
> Reporter: Divij Vaidya
> Priority: Major
> Labels: flaky-test
>
> Test failed at [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-12197/3/tests]
> {noformat}
> [2022-05-27 09:34:42,382] INFO [Controller 0] Creating new QuorumController with clusterId wj9LhgPJTV-KYEItgqvtQA, authorizer Optional.empty. (org.apache.kafka.controller.QuorumController:1484)
> [2022-05-27 09:34:42,393] DEBUG [LocalLogManager 0] Node 0: running log check. (org.apache.kafka.metalog.LocalLogManager:479)
> [2022-05-27 09:34:42,394] DEBUG [LocalLogManager 0] initialized local log manager for node 0 (org.apache.kafka.metalog.LocalLogManager:622)
> [2022-05-27 09:34:42,396] INFO [LocalLogManager 0] Node 0: registered MetaLogListener 1774961169 (org.apache.kafka.metalog.LocalLogManager:640)
> [2022-05-27 09:34:42,397] DEBUG [LocalLogManager 0] Node 0: running log check. (org.apache.kafka.metalog.LocalLogManager:479)
> [2022-05-27 09:34:42,397] DEBUG [LocalLogManager 0] Node 0: Executing handleLeaderChange LeaderAndEpoch(leaderId=OptionalInt[0], epoch=1) (org.apache.kafka.metalog.LocalLogManager:520)
> [2022-05-27 09:34:42,398] DEBUG [Controller 0] Executing handleLeaderChange[1]. (org.apache.kafka.controller.QuorumController:438)
> [2022-05-27 09:34:42,398] INFO [Controller 0] Becoming the active controller at epoch 1, committed offset -1, committed epoch -1, and metadata.version 5 (org.apache.kafka.controller.QuorumController:950)
> [2022-05-27 09:34:42,398] DEBUG [Controller 0] Creating snapshot -1 (org.apache.kafka.timeline.SnapshotRegistry:197)
> [2022-05-27 09:34:42,399] DEBUG [Controller 0] Processed handleLeaderChange[1] in 951 us (org.apache.kafka.controller.QuorumController:385)
> [2022-05-27 09:34:42,399] INFO [Controller 0] Initializing metadata.version to 5 (org.apache.kafka.controller.QuorumController:926)
> [2022-05-27 09:34:42,399] INFO [Controller 0] Setting metadata.version to 5 (org.apache.kafka.controller.FeatureControlManager:273)
> [2022-05-27 09:34:42,400] DEBUG [Controller 0] Creating snapshot 9223372036854775807 (org.apache.kafka.timeline.SnapshotRegistry:197)
> [2022-05-27 09:34:42,400] DEBUG [Controller 0] Read-write operation bootstrapMetadata(1863535402) will be completed when the log reaches offset 9223372036854775807. (org.apache.kafka.controller.QuorumController:725)
> [2022-05-27 09:34:42,402] DEBUG append(batch=LocalRecordBatch(leaderEpoch=1, appendTimestamp=10, records=[ApiMessageAndVersion(RegisterBrokerRecord(brokerId=0, incarnationId=kxAT73dKQsitIedpiPtwBw, brokerEpoch=-9223372036854775808, endPoints=[BrokerEndpoint(name='PLAINTEXT', host='localhost', port=9092, securityProtocol=0)], features=[], rack=null, fenced=true) at version 0)]), prevOffset=1) (org.apache.kafka.metalog.LocalLogManager$SharedLogData:247)
> [2022-05-27 09:34:42,402] INFO [Controller 0] Registered new broker: RegisterBrokerRecord(brokerId=0, incarnationId=kxAT73dKQsitIedpiPtwBw, brokerEpoch=-9223372036854775808, endPoints=[BrokerEndpoint(name='PLAINTEXT', host='localhost', port=9092, securityProtocol=0)], features=[], rack=null, fenced=true) (org.apache.kafka.controller.ClusterControlManager:368)
> [2022-05-27 09:34:42,403] WARN [Controller 0] registerBroker: failed with unknown server exception RuntimeException at epoch 1 in 2449 us. Reverting to last committed offset -1. (org.apache.kafka.controller.QuorumController:410)
> java.lang.RuntimeException: Can't create a new snapshot at epoch 1 because there is already a snapshot with epoch 9223372036854775807
> at org.apache.kafka.timeline.SnapshotRegistry.getOrCreateSnapshot(SnapshotRegistry.java:190)
> at org.apache.kafka.controller.QuorumController$ControllerWriteEvent.run(QuorumController.java:723)
> at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:121)
> at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:200)
> at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:173)
> at java.base/java.lang.Thread.run(Thread.java:833)
> {noformat}
> {noformat}
> Full stack trace
> java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.UnknownServerException: java.lang.RuntimeException: Can't create a new snapshot at epoch 1 because there is already a snapshot with epoch 9223372036854775807
> at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
> at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073)
> at org.apache.kafka.controller.QuorumControllerTest.registerBrokers(QuorumControllerTest.java:1014)
> at org.apache.kafka.controller.QuorumControllerTest.testMissingInMemorySnapshot(QuorumControllerTest.java:907)
> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:568)
> at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725)
> at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
> at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
> at org.junit.jupiter.engine.extension.TimeoutInvocation.proceed(TimeoutInvocation.java:46)
> at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149)
> at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140)
> at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:84)
> at org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115)
> at org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105)
> at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
> at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
> at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
> at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
> at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:104)
> at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:98)
> at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:214)
> at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:210)
> at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:135)
> at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:66)
> at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
> at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
> at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
> at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
> at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
> at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
> at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
> at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
> at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
> at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
> at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
> at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
> at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
> at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
> at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
> at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
> at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
> at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
> at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
> at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
> at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
> at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
> at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:35)
> at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57)
> at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:54)
> at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108)
> at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)
> at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54)
> at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67)
> at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52)
> at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96)
> at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75)
> at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.processAllTestClasses(JUnitPlatformTestClassProcessor.java:99)
> at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.access$000(JUnitPlatformTestClassProcessor.java:79)
> at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor.stop(JUnitPlatformTestClassProcessor.java:75)
> at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProcessor.java:61)
> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:568)
> at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
> at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
> at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
> at jdk.proxy1/jdk.proxy1.$Proxy2.stop(Unknown Source)
> at org.gradle.api.internal.tasks.testing.worker.TestWorker$3.run(TestWorker.java:193)
> at org.gradle.api.internal.tasks.testing.worker.TestWorker.executeAndMaintainThreadName(TestWorker.java:129)
> at org.gradle.api.internal.tasks.testing.worker.TestWorker.execute(TestWorker.java:100)
> at org.gradle.api.internal.tasks.testing.worker.TestWorker.execute(TestWorker.java:60)
> at org.gradle.process.internal.worker.child.ActionExecutionWorker.execute(ActionExecutionWorker.java:56)
> at org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:133)
> at org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:71)
> at worker.org.gradle.process.internal.worker.GradleWorkerMain.run(GradleWorkerMain.java:69)
> at worker.org.gradle.process.internal.worker.GradleWorkerMain.main(GradleWorkerMain.java:74)
> Caused by: org.apache.kafka.common.errors.UnknownServerException: java.lang.RuntimeException: Can't create a new snapshot at epoch 1 because there is already a snapshot with epoch 9223372036854775807
> Caused by: java.lang.RuntimeException: Can't create a new snapshot at epoch 1 because there is already a snapshot with epoch 9223372036854775807
> at org.apache.kafka.timeline.SnapshotRegistry.getOrCreateSnapshot(SnapshotRegistry.java:190)
> at org.apache.kafka.controller.QuorumController$ControllerWriteEvent.run(QuorumController.java:723)
> at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:121)
> at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:200)
> at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:173)
> at java.base/java.lang.Thread.run(Thread.java:833)
> {noformat}