Posted to jira@kafka.apache.org by "Goltseva Taisiia (Jira)" <ji...@apache.org> on 2020/09/09 11:32:00 UTC

[jira] [Commented] (KAFKA-10253) Kafka Connect gets into an infinite rebalance loop

    [ https://issues.apache.org/jira/browse/KAFKA-10253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192798#comment-17192798 ] 

Goltseva Taisiia commented on KAFKA-10253:
------------------------------------------

Hi, guys!

We had the same problem. In our case the cause was the *group.id* parameter: it was not unique across our Kafka Connect clusters. We found another cluster using the same *group.id*, changed it, and everything returned to normal.
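
In case it helps anyone check for the same thing, here is a rough sketch of how the worker configs of several Connect clusters could be compared for a shared *group.id*. The cluster labels, the sample config text and the helper names are made up for illustration, and it only handles plain key=value lines, so treat it as a starting point rather than a finished tool. A collision only matters for Connect clusters that talk to the same Kafka cluster.

```python
from collections import defaultdict


def parse_properties(text):
    """Parse the simple key=value subset of a Java .properties file."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def find_shared_group_ids(clusters):
    """Return {group.id: [cluster labels]} for ids used by more than one cluster.

    `clusters` maps a cluster label (hypothetical) to the text of one of
    that cluster's worker configuration files.
    """
    users = defaultdict(list)
    for label, text in clusters.items():
        group_id = parse_properties(text).get("group.id")
        if group_id is not None:
            users[group_id].append(label)
    return {gid: sorted(labels) for gid, labels in users.items() if len(labels) > 1}


if __name__ == "__main__":
    # Hypothetical worker configs for three separate Connect clusters.
    configs = {
        "cluster-a": "bootstrap.servers=broker:9092\ngroup.id=kafka-connect\n",
        "cluster-b": "bootstrap.servers=broker:9092\ngroup.id=kafka-connect\n",
        "cluster-c": "bootstrap.servers=broker:9092\ngroup.id=kafka-connect-c\n",
    }
    for gid, labels in find_shared_group_ids(configs).items():
        print(f"group.id {gid!r} is shared by: {', '.join(labels)}")
```

Any *group.id* it reports should be changed on all but one of the listed clusters, which is what fixed the loop for us.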

> Kafka Connect gets into an infinite rebalance loop
> --------------------------------------------------
>
>                 Key: KAFKA-10253
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10253
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions: 2.5.0
>            Reporter: Konstantin Lalafaryan
>            Priority: Blocker
>
> Hello everyone!
>  
> We are running a kafka-connect cluster (3 workers), and very often it gets into an infinite rebalance loop.
>  
> {code:java}
> 2020-07-09 08:51:25,731 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,731 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,733 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,735 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655831 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,735 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation 305655831 with protocol version 2 and got assignment: Assignment{error=1, leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb', leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,735 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,735 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,736 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,739 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655832 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,739 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation 305655832 with protocol version 2 and got assignment: Assignment{error=1, leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb', leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,739 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,739 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,740 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,742 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655833 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,742 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation 305655833 with protocol version 2 and got assignment: Assignment{error=1, leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb', leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,742 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,742 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,744 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,746 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655834 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,746 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation 305655834 with protocol version 2 and got assignment: Assignment{error=1, leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb', leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,746 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,746 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,748 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,750 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655835 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,750 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation 305655835 with protocol version 2 and got assignment: Assignment{error=1, leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb', leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,750 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,750 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,751 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,754 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655836 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,754 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation 305655836 with protocol version 2 and got assignment: Assignment{error=1, leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb', leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,754 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,754 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,755 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,757 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655837 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,757 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation 305655837 with protocol version 2 and got assignment: Assignment{error=1, leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb', leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,757 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,757 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,759 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,761 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655838 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,761 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation 305655838 with protocol version 2 and got assignment: Assignment{error=1, leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb', leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,761 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,761 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,763 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,766 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655839 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,766 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation 305655839 with protocol version 2 and got assignment: Assignment{error=1, leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb', leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,766 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,766 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,768 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,771 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655840 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,771 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation
> {code}
> It is happening on all 3 workers.
>  
> On the broker side we can see the following:
> {code:java}
> 2020-07-09 16:39:46,260 INFO [GroupCoordinator 0]: Preparing to rebalance group kafka-connect in state PreparingRebalance with old generation 311127279 (__consumer_offsets-7) (reason: Updating metadata for member connect-1-bdf1c132-3311-493a-894a-5fffcac41ec7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-0]
> 2020-07-09 16:39:46,261 INFO [GroupCoordinator 0]: Stabilized group kafka-connect generation 311127280 (__consumer_offsets-7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-5]
> 2020-07-09 16:39:46,262 INFO [GroupCoordinator 0]: Assignment received from leader for group kafka-connect for generation 311127280 (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-1]
> 2020-07-09 16:39:46,265 INFO [GroupCoordinator 0]: Preparing to rebalance group kafka-connect in state PreparingRebalance with old generation 311127280 (__consumer_offsets-7) (reason: Updating metadata for member connect-1-bdf1c132-3311-493a-894a-5fffcac41ec7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-1]
> 2020-07-09 16:39:46,266 INFO [GroupCoordinator 0]: Stabilized group kafka-connect generation 311127281 (__consumer_offsets-7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-6]
> 2020-07-09 16:39:46,267 INFO [GroupCoordinator 0]: Assignment received from leader for group kafka-connect for generation 311127281 (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-1]
> 2020-07-09 16:39:46,270 INFO [GroupCoordinator 0]: Preparing to rebalance group kafka-connect in state PreparingRebalance with old generation 311127281 (__consumer_offsets-7) (reason: Updating metadata for member connect-1-bdf1c132-3311-493a-894a-5fffcac41ec7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-7]
> 2020-07-09 16:39:46,271 INFO [GroupCoordinator 0]: Stabilized group kafka-connect generation 311127282 (__consumer_offsets-7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-6]
> 2020-07-09 16:39:46,272 INFO [GroupCoordinator 0]: Assignment received from leader for group kafka-connect for generation 311127282 (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-1]
> 2020-07-09 16:39:46,275 INFO [GroupCoordinator 0]: Preparing to rebalance group kafka-connect in state PreparingRebalance with old generation 311127282 (__consumer_offsets-7) (reason: Updating metadata for member connect-1-bdf1c132-3311-493a-894a-5fffcac41ec7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-3]
> 2020-07-09 16:39:46,276 INFO [GroupCoordinator 0]: Stabilized group kafka-connect generation 311127283 (__consumer_offsets-7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-7]
> 2020-07-09 16:39:46,277 INFO [GroupCoordinator 0]: Assignment received from leader for group kafka-connect for generation 311127283 (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-5]
> 2020-07-09 16:39:46,280 INFO [GroupCoordinator 0]: Preparing to rebalance group kafka-connect in state PreparingRebalance with old generation 311127283 (__consumer_offsets-7) (reason: Updating metadata for member connect-1-bdf1c132-3311-493a-894a-5fffcac41ec7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-5]
> 2020-07-09 16:39:46,281 INFO [GroupCoordinator 0]: Stabilized group kafka-connect generation 311127284 (__consumer_offsets-7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-7]
> 2020-07-09 16:39:46,282 INFO [GroupCoordinator 0]: Assignment received from leader for group kafka-connect for generation 311127284 (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-3]
> 2020-07-09 16:39:46,285 INFO [GroupCoordinator 0]: Preparing to rebalance group kafka-connect in state PreparingRebalance with old generation 311127284 (__consumer_offsets-7) (reason: Updating metadata for member connect-1-bdf1c132-3311-493a-894a-5fffcac41ec7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-1]
> 2020-07-09 16:39:46,286 INFO [GroupCoordinator 0]: Stabilized group kafka-connect generation 311127285 (__consumer_offsets-7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-4]
> 2020-07-09 16:39:46,287 INFO [GroupCoordinator 0]: Assignment received from leader for group kafka-connect for generation 311127285 (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-7]
> {code}
>  
> Any feedback is appreciated!
> Thanks!


