You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "A. Sophie Blee-Goldman (Jira)" <ji...@apache.org> on 2021/07/13 19:58:00 UTC

[jira] [Comment Edited] (KAFKA-8529) Flakey test ConsumerBounceTest#testCloseDuringRebalance

    [ https://issues.apache.org/jira/browse/KAFKA-8529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380138#comment-17380138 ] 

A. Sophie Blee-Goldman edited comment on KAFKA-8529 at 7/13/21, 7:57 PM:
-------------------------------------------------------------------------

This has been failing _very_ frequently as of late, for example [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-11009/4/tests].

I took a brief looks at the logs and it appears to be an issue with the connection, or possibly something to do with the incremental fetch (not sure if the *FetchSessionTopicIdException* is a red herring or not):
{code:java}
[2021-07-13 12:31:46,774] WARN [ReplicaFetcher replicaId=1, leaderId=2, fetcherId=0] Error in response for fetch request (type=FetchRequest, replicaId=1, maxWait=500, minBytes=1, maxBytes=10485760, fetchData={closetest-1=PartitionData(fetchOffset=0, logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], lastFetchedEpoch=Optional.empty), closetest-7=PartitionData(fetchOffset=0, logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], lastFetchedEpoch=Optional.empty), closetest-4=PartitionData(fetchOffset=0, logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], lastFetchedEpoch=Optional.empty)}, isolationLevel=READ_UNCOMMITTED, toForget=, metadata=(sessionId=1829936913, epoch=1), rackId=) (kafka.server.ReplicaFetcherThread:72)org.apache.kafka.common.errors.FetchSessionTopicIdException: The fetch session encountered inconsistent topic ID usage[2021-07-13 12:31:47,002] WARN [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error in response for fetch request (type=FetchRequest, replicaId=1, maxWait=500, minBytes=1, maxBytes=10485760, fetchData={}, isolationLevel=READ_UNCOMMITTED, toForget=, metadata=(sessionId=2081224068, epoch=1), rackId=) (kafka.server.ReplicaFetcherThread:72)org.apache.kafka.common.errors.FetchSessionTopicIdException: The fetch session encountered inconsistent topic ID usage[2021-07-13 12:31:48,545] WARN [Consumer clientId=ConsumerTestConsumer, groupId=group1] Close timed out with 3 pending requests to coordinator, terminating client connections (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1024)[2021-07-13 12:31:48,549] WARN [ReplicaFetcher replicaId=0, leaderId=2, fetcherId=0] Error in response for fetch request (type=FetchRequest, replicaId=0, maxWait=500, minBytes=1, maxBytes=10485760, fetchData={closetest-1=PartitionData(fetchOffset=0, logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], lastFetchedEpoch=Optional.empty), closetest-7=PartitionData(fetchOffset=0, logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], lastFetchedEpoch=Optional.empty), topic-1=PartitionData(fetchOffset=0, logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], lastFetchedEpoch=Optional.empty), closetest-4=PartitionData(fetchOffset=0, logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], lastFetchedEpoch=Optional.empty)}, isolationLevel=READ_UNCOMMITTED, toForget=, metadata=(sessionId=1532743228, epoch=INITIAL), rackId=) (kafka.server.ReplicaFetcherThread:72)java.io.IOException: Connection to 2 was disconnected before the response was read 
    at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:100)
    at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:109)
    at kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:219)
    at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:313)
    at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:137)
    at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3$adapted(AbstractFetcherThread.scala:136)
    at scala.Option.foreach(Option.scala:437) at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:136)
    at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:119)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
{code}
 Wondering if we should bump this the priority on this? Can someone more familiar with this test maybe take a few minutes to glance over these logs and chime in on whether this is potentially concerning or not? cc [~hachikuji] [~cmccabe]


was (Author: ableegoldman):
This has been failing _very_ frequently as of late, for example [https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-11009/4/tests].

I took a brief looks at the logs and it appears to be an issue with the connection, or possibly something to do with the incremental fetch (not sure if this warning is a red herring or not):

 
{code:java}
[2021-07-13 12:31:46,774] WARN [ReplicaFetcher replicaId=1, leaderId=2, fetcherId=0] Error in response for fetch request (type=FetchRequest, replicaId=1, maxWait=500, minBytes=1, maxBytes=10485760, fetchData={closetest-1=PartitionData(fetchOffset=0, logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], lastFetchedEpoch=Optional.empty), closetest-7=PartitionData(fetchOffset=0, logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], lastFetchedEpoch=Optional.empty), closetest-4=PartitionData(fetchOffset=0, logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], lastFetchedEpoch=Optional.empty)}, isolationLevel=READ_UNCOMMITTED, toForget=, metadata=(sessionId=1829936913, epoch=1), rackId=) (kafka.server.ReplicaFetcherThread:72)org.apache.kafka.common.errors.FetchSessionTopicIdException: The fetch session encountered inconsistent topic ID usage[2021-07-13 12:31:47,002] WARN [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error in response for fetch request (type=FetchRequest, replicaId=1, maxWait=500, minBytes=1, maxBytes=10485760, fetchData={}, isolationLevel=READ_UNCOMMITTED, toForget=, metadata=(sessionId=2081224068, epoch=1), rackId=) (kafka.server.ReplicaFetcherThread:72)org.apache.kafka.common.errors.FetchSessionTopicIdException: The fetch session encountered inconsistent topic ID usage[2021-07-13 12:31:48,545] WARN [Consumer clientId=ConsumerTestConsumer, groupId=group1] Close timed out with 3 pending requests to coordinator, terminating client connections (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1024)[2021-07-13 12:31:48,549] WARN [ReplicaFetcher replicaId=0, leaderId=2, fetcherId=0] Error in response for fetch request (type=FetchRequest, replicaId=0, maxWait=500, minBytes=1, maxBytes=10485760, fetchData={closetest-1=PartitionData(fetchOffset=0, logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], lastFetchedEpoch=Optional.empty), closetest-7=PartitionData(fetchOffset=0, logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], lastFetchedEpoch=Optional.empty), topic-1=PartitionData(fetchOffset=0, logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], lastFetchedEpoch=Optional.empty), closetest-4=PartitionData(fetchOffset=0, logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], lastFetchedEpoch=Optional.empty)}, isolationLevel=READ_UNCOMMITTED, toForget=, metadata=(sessionId=1532743228, epoch=INITIAL), rackId=) (kafka.server.ReplicaFetcherThread:72)java.io.IOException: Connection to 2 was disconnected before the response was read 
    at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:100)
    at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:109)
    at kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:219)
    at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:313)
    at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:137)
    at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3$adapted(AbstractFetcherThread.scala:136)
    at scala.Option.foreach(Option.scala:437) at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:136)
    at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:119)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
{code}
 

Wondering if we should bump this the priority on this? Can someone more familiar with this test maybe take a few minutes to glance over these logs and chime in on whether this is potentially concerning or not? cc [~hachikuji] [~cmccabe]

> Flakey test ConsumerBounceTest#testCloseDuringRebalance
> -------------------------------------------------------
>
>                 Key: KAFKA-8529
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8529
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>            Reporter: Boyang Chen
>            Priority: Major
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/5450/consoleFull]
>  
> *16:16:10* kafka.api.ConsumerBounceTest > testCloseDuringRebalance STARTED*16:16:22* kafka.api.ConsumerBounceTest.testCloseDuringRebalance failed, log available in /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk11-scala2.12/core/build/reports/testOutput/kafka.api.ConsumerBounceTest.testCloseDuringRebalance.test.stdout*16:16:22* *16:16:22* kafka.api.ConsumerBounceTest > testCloseDuringRebalance FAILED*16:16:22*     java.lang.AssertionError: Rebalance did not complete in time*16:16:22*         at org.junit.Assert.fail(Assert.java:89)*16:16:22*         at org.junit.Assert.assertTrue(Assert.java:42)*16:16:22*         at kafka.api.ConsumerBounceTest.waitForRebalance$1(ConsumerBounceTest.scala:402)*16:16:22*         at kafka.api.ConsumerBounceTest.checkCloseDuringRebalance(ConsumerBounceTest.scala:416)*16:16:22*         at kafka.api.ConsumerBounceTest.testCloseDuringRebalance(ConsumerBounceTest.scala:379)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)