Posted to jira@kafka.apache.org by "Arjun Satish (JIRA)" <ji...@apache.org> on 2019/02/08 09:29:00 UTC

[jira] [Updated] (KAFKA-7909) Coordinator changes cause Connect integration test to fail

     [ https://issues.apache.org/jira/browse/KAFKA-7909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arjun Satish updated KAFKA-7909:
--------------------------------
    Description: 
We recently introduced integration tests in Connect. These tests spin up one or more Connect workers along with a Kafka broker and ZooKeeper in a single process and attempt to move records using a Connector. In the Example Integration Test, we spin up three workers, each hosting a Connector task that consumes records from a Kafka topic. When the connector starts up, it may go through multiple rounds of rebalancing. We have noticed the following two problems over the last few days:
 # After members join a group, there are no pendingMembers remaining, but the join group method does not complete and signal these members that they are now ready to start consuming from their respective partitions.
 # Because of quick rebalances, a consumer might have started a group, and then Connect starts a rebalance, after which we create three new instances of the consumer (one from each worker/task). The group coordinator, however, seems to have 4 members in the group. This causes the JoinGroup to stall indefinitely.

Even though this ticket is described in the context of Connect, it may be applicable to general consumers.

  was:
We recently introduced integration tests in Connect. These tests spin up one or more Connect workers along with a Kafka broker and ZooKeeper in a single process and attempt to move records using a Connector. In the Example Integration Test, we spin up three workers, each hosting a Connector task that consumes records from a Kafka topic. When the connector starts up, it may go through multiple rounds of rebalancing. We have noticed the following two problems over the last few days:
 # After members join a group, there are no pendingMembers remaining, but the join group method does not complete and signal these members that they are now ready to start consuming from their respective partitions.
 # Because of quick rebalances, a consumer might have started a group, and then Connect starts a rebalance, after which we create three new instances of the consumer (one from each worker/task). The group coordinator, however, seems to have 4 members in the group. This causes the JoinGroup to stall indefinitely.


> Coordinator changes cause Connect integration test to fail
> ----------------------------------------------------------
>
>                 Key: KAFKA-7909
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7909
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Arjun Satish
>            Priority: Blocker
>             Fix For: 2.2.0
>
>
> We recently introduced integration tests in Connect. These tests spin up one or more Connect workers along with a Kafka broker and ZooKeeper in a single process and attempt to move records using a Connector. In the Example Integration Test, we spin up three workers, each hosting a Connector task that consumes records from a Kafka topic. When the connector starts up, it may go through multiple rounds of rebalancing. We have noticed the following two problems over the last few days:
>  # After members join a group, there are no pendingMembers remaining, but the join group method does not complete and signal these members that they are now ready to start consuming from their respective partitions.
>  # Because of quick rebalances, a consumer might have started a group, and then Connect starts a rebalance, after which we create three new instances of the consumer (one from each worker/task). The group coordinator, however, seems to have 4 members in the group. This causes the JoinGroup to stall indefinitely.
> Even though this ticket is described in the context of Connect, it may be applicable to general consumers.
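
To make the second problem easier to reason about, here is a toy model of one way a fourth, stale member could stall a join. It is only a sketch and not Kafka's group coordinator: ToyCoordinator, its methods, and the member names are all hypothetical, and it assumes the coordinator finishes a rebalance only once every member it knows about has rejoined. The ticket does not establish that this is the actual cause.

```python
# Toy model only. This is NOT Kafka's GroupCoordinator; every name is hypothetical.

class ToyCoordinator:
    """Finishes a rebalance only when every known member has rejoined (assumption)."""

    def __init__(self):
        self.members = set()   # member ids the coordinator believes are in the group
        self.rejoined = set()  # member ids that have joined in the current round

    def join(self, member_id):
        # A join registers the member and counts it for the current round.
        self.members.add(member_id)
        self.rejoined.add(member_id)

    def start_rebalance(self):
        # Known members are kept, but each one has to join again.
        self.rejoined.clear()

    def waiting_on(self):
        return self.members - self.rejoined

    def join_complete(self):
        return not self.waiting_on()

    def expire(self, member_id):
        # Stands in for the coordinator dropping a member that went away.
        self.members.discard(member_id)
        self.rejoined.discard(member_id)


def simulate():
    coordinator = ToyCoordinator()
    coordinator.join("early-consumer")   # a consumer that started the group
    coordinator.start_rebalance()        # Connect rebalances; that consumer is replaced
    for task in ("task-0", "task-1", "task-2"):
        coordinator.join(task)           # one new consumer per worker/task
    member_count = len(coordinator.members)
    stalled = not coordinator.join_complete()
    stale = sorted(coordinator.waiting_on())
    coordinator.expire("early-consumer")
    return member_count, stalled, stale, coordinator.join_complete()


if __name__ == "__main__":
    print(simulate())
```

In this model the coordinator counts 4 members while only 3 are live, and the join cannot finish until the stale member is dropped. If the real coordinator never drops it, the stall would look indefinite from the test's point of view.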



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)