Posted to jira@kafka.apache.org by "Sagar Rao (Jira)" <ji...@apache.org> on 2022/08/20 17:43:00 UTC

[jira] [Commented] (KAFKA-14000) Kafka-connect standby server shows empty tasks list

    [ https://issues.apache.org/jira/browse/KAFKA-14000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17582346#comment-17582346 ] 

Sagar Rao commented on KAFKA-14000:
-----------------------------------

Hi [~xinzou], thanks for filing the bug. How many partitions does the status topic have in your case? It's generally recommended to use a single partition for the status topic so that the ordering of events is preserved.

> Kafka-connect standby server shows empty tasks list
> ---------------------------------------------------
>
>                 Key: KAFKA-14000
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14000
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions: 2.6.0
>            Reporter: Xinyu Zou
>            Assignee: Sagar Rao
>            Priority: Major
>         Attachments: kafka-connect-trace.log
>
>
> I'm using Kafka-connect in distributed mode with two servers, one active and one standby. The standby server sometimes shows an empty tasks list in the status REST API response.
> curl host:8443/connectors/name1/status
> {code:java}
> {
>     "connector": {
>         "state": "RUNNING",
>         "worker_id": "1.2.3.4:10443"
>     },
>     "name": "name1",
>     "tasks": [],
>     "type": "source"
> } {code}
> I enabled TRACE logging and checked. As required, the connect-status topic is set to cleanup.policy=compact. However, messages in the topic are not compacted immediately; compaction runs at intervals, so there is usually more than one message with the same key. For example, suppose kafka-connect is launched with no connector running, and then we start a new connector. There will then be two messages in the connect-status topic:
> status-task-name1 : state=RUNNING, workerId='10.251.170.166:10443', generation=100
> status-task-name1 : _<empty>_
> Please check the log file [^kafka-connect-trace.log]. It shows that the task status was eventually removed, even though the empty status was not the newest message in the connect-status topic.
>  
> When statuses are read from the connect-status topic, the messages are not sorted by generation.
> [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerRecords.java]
> So I think this could be improved. We could either sort the messages after polling or compare the generation values before choosing the correct status message.
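
The second option proposed in the description (comparing generation values before choosing a status) could be sketched roughly as below. This is only an illustration, not Kafka Connect's actual code: the `StatusPicker` class, the `Status` type, and the `latestByGeneration` method are hypothetical names, and the sketch assumes that an empty message carries no generation and is therefore skipped. How empty messages should really be ordered against generation-bearing ones is left open here.

```java
import java.util.Arrays;
import java.util.List;

public class StatusPicker {
    // Hypothetical stand-in for one message polled from connect-status.
    // A null state models an empty message, which has no usable generation.
    static final class Status {
        final String key;
        final String state;
        final String workerId;
        final int generation;

        Status(String key, String state, String workerId, int generation) {
            this.key = key;
            this.state = state;
            this.workerId = workerId;
            this.generation = generation;
        }
    }

    // Returns the status with the highest generation for the given key,
    // or null if no non-empty status exists for that key.
    // On equal generations the later message in the list wins.
    static Status latestByGeneration(List<Status> polled, String key) {
        Status best = null;
        for (Status s : polled) {
            if (!s.key.equals(key) || s.state == null) {
                continue; // other key, or empty message (assumption: skipped)
            }
            if (best == null || s.generation >= best.generation) {
                best = s;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Status> polled = Arrays.asList(
            new Status("status-task-name1", "RUNNING", "10.251.170.166:10443", 100),
            new Status("status-task-name1", null, null, 0));
        Status chosen = latestByGeneration(polled, "status-task-name1");
        // prints: RUNNING generation=100
        System.out.println(chosen.state + " generation=" + chosen.generation);
    }
}
```

With the two messages from the description as input, the sketch picks the RUNNING status regardless of the order in which the messages were read.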



--
This message was sent by Atlassian Jira
(v8.20.10#820010)