Posted to users@kafka.apache.org by Федор Чернилин <in...@gmail.com> on 2019/03/25 09:38:07 UTC

Source connectors stop working after a broker crashes.

Configuration:

3 brokers

1 ZooKeeper

1 Connect worker (distributed mode)

8 connectors (sink and source)
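
For context, the relevant worker settings look roughly like this (a sketch rather than a copy of our config: apart from kafka-broker-2-int, the group id and the offset topic name, which appear in the logs below, the broker addresses, topic names and replication factors are illustrative; on the Confluent image they are set through the corresponding CONNECT_* environment variables):

  # Connect distributed worker settings relevant to broker failures (illustrative values)
  bootstrap.servers=kafka-broker-0-int:29092,kafka-broker-1-int:29092,kafka-broker-2-int:29092
  group.id=connect-cluster
  offset.storage.topic=connect-offset-storage-topic
  offset.storage.replication.factor=3
  config.storage.topic=connect-config-storage-topic
  config.storage.replication.factor=3
  status.storage.topic=connect-status-storage-topic
  status.storage.replication.factor=3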


Description of connect topics:
[image: Снимок экрана 2019-03-22 в 17.00.56.png (screenshot of the Connect topic descriptions)]
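
In case the screenshot is not visible, the same description can be reproduced with the topic tool from the Confluent image (the ZooKeeper address is a placeholder), e.g. for the offset topic:

  # describe the Connect offset topic: partitions, leaders, replicas, ISR
  kafka-topics --zookeeper zookeeper-int:2181 --describe --topic connect-offset-storage-topic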


Environment:

K8s, gcloud, confluent images


After the first broker crashes there are several messages saying the broker is
not available. While it restarts, the brokers rebalance, which changes the
leaders and replicas of the topic partitions, and after that everything is fine.
The following messages appear while the broker is down:

INFO [Worker clientId=connect-1, groupId=connect-cluster] Group coordinator
kafka-broker-2-int:29092 (id: 2147483645 rack: null) is unavailable or
invalid, will attempt rediscovery
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator)

INFO [Worker clientId=connect-1, groupId=connect-cluster] Attempt to
heartbeat failed since coordinator kafka-broker-2-int:29092 (id: 2147483645
rack: null) is either not started or not valid.
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
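
While this is going on, the coordinator the worker sees can be cross-checked with the consumer group tool (the broker address here is just an example):

  # shows the current coordinator, state and members of the Connect group
  kafka-consumer-groups --bootstrap-server kafka-broker-0-int:29092 --describe --group connect-cluster --state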


The connectors then use only the two remaining brokers and there are no more
warning messages. But when another broker crashes, the warning messages do not
stop, and the Connect offset consumer fails with a timeout error:

INFO [Consumer clientId=consumer-2, groupId=connect-cluster] Error sending
fetch request (sessionId=1683702723, epoch=INITIAL) to node 1:
org.apache.kafka.common.errors.TimeoutException: Failed to send request
after 30000 ms.. (org.apache.kafka.clients.FetchSessionHandler)

and one error:

ERROR Unexpected exception in Thread[KafkaBasedLog Work Thread -
connect-offset-storage-topic,5,main]
(org.apache.kafka.connect.util.KafkaBasedLog)

So even after the broker is restarted, the consumer has already failed and the
connectors are not able to get offsets. It does not seem to be related to a
specific broker instance, because the error can occur with different brokers.
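
For reference, the connector and task state can be inspected over the Connect REST API and a restart attempted like this (assuming the default REST port 8083; <connector-name> is a placeholder):

  # list connectors and check the status of one of them
  curl -s http://localhost:8083/connectors
  curl -s http://localhost:8083/connectors/<connector-name>/status
  # restart the connector (there is also /tasks/<id>/restart for individual tasks)
  curl -s -X POST http://localhost:8083/connectors/<connector-name>/restart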


What can cause this problem?

Thanks.