Posted to jira@kafka.apache.org by "zhangzhisheng (Jira)" <ji...@apache.org> on 2020/12/03 08:26:00 UTC

[jira] [Comment Edited] (KAFKA-3042) updateIsr should stop after failed several times due to zkVersion issue

    [ https://issues.apache.org/jira/browse/KAFKA-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241383#comment-17241383 ] 

zhangzhisheng edited comment on KAFKA-3042 at 12/3/20, 8:25 AM:
----------------------------------------------------------------

Using kafka_2.12-2.4.1 and zookeeper-3.5.7.

Cluster of 3 ZooKeeper nodes and 3 brokers; topic replication factor is 2.
Linux (Red Hat), XFS; Kafka logs on a single local disk.

Error info:

 
{code:java}
fka_2.12-2.4.1/logs/controller.log.2020-11-28-01:[2020-11-28 01:51:02,078] ERROR [Controller id=0] Error completing replica leader election (PREFERRED) for partition __consumer_offsets-22 (kafka.controller.KafkaController)
kafka_2.12-2.4.1/logs/controller.log.2020-11-28-01:[2020-11-28 01:51:02,078] ERROR [Controller id=0] Error completing replica leader election (PREFERRED) for partition fcp-FFF-account-201807131719-2 (kafka.controller.KafkaController)
kafka_2.12-2.4.1/logs/controller.log.2020-11-28-01:[2020-11-28 01:51:02,078] ERROR [Controller id=0] Error completing replica leader election (PREFERRED) for partition fcp-PCP-INSTRANSACTIONPOLICY-2018079116-0 (kafka.controller.KafkaController)
kafka_2.12-2.4.1/logs/controller.log.2020-11-28-01:[2020-11-28 01:51:02,078] ERROR [Controller id=0] Error completing replica leader election (PREFERRED) for partition LOAN_FAIL_MANAGE-202011231831270534-1 (kafka.controller.KafkaController)
kafka_2.12-2.4.1/logs/controller.log.2020-11-28-01:[2020-11-28 01:51:02,078] ERROR [Controller id=0] Error completing replica leader election (PREFERRED) for partition fcp-FFF-LOANTXNSUB-201806271129-0 (kafka.controller.KafkaController)
kafka_2.12-2.4.1/logs/controller.log.2020-11-28-01:[2020-11-28 01:51:02,078] ERROR [Controller id=0] Error completing replica leader election (PREFERRED) for partition __consumer_offsets-4 (kafka.controller.KafkaController)
kafka_2.12-2.4.1/logs/controller.log.2020-11-28-01:[2020-11-28 01:51:02,078] ERROR [Controller id=0] Error completing replica leader election (PREFERRED) for partition __consumer_command_request-5 (kafka.controller.KafkaController)
kafka_2.12-2.4.1/logs/controller.log.2020-11-28-01:[2020-11-28 01:51:02,078] ERROR [Controller id=0] Error completing replica leader election (PREFERRED) for partition fcp-creditcore-loan-trans-201809112022-2 (kafka.controller.KafkaController)
kafka_2.12-2.4.1/logs/controller.log.2020-11-28-01:[2020-11-28 01:51:02,079] ERROR [Controller id=0] Error completing replica leader election (PREFERRED) for partition fcp-CREDITCORE-LOAN-TRANS-20180791126-0 (kafka.controller.KafkaController)
kafka_2.12-2.4.1/logs/controller.log.2020-11-28-01:[2020-11-28 01:51:02,079] ERROR [Controller id=0] Error completing replica leader election (PREFERRED) for partition __consumer_offsets-7 (kafka.controller.KafkaController)
{code}


was (Author: zhangzs):
Using kafka_2.12-2.4.1 and zookeeper-3.5.7.

Cluster of 3 ZooKeeper nodes and 3 brokers; topic replication factor is 2.
Linux (Red Hat), XFS; Kafka logs on a single local disk.

Error info:
{code:java}
[2020-12-01 15:38:22,237] INFO [Partition  topic-cs-201907181035-0 broker=2] Cached zkVersion 59 not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2020-12-01 15:38:22,237] INFO [Partition  topic-repay-plan-detail-201809112057-5 broker=2] Shrinking ISR from 2,0 to 2. Leader: (highWatermark: 173252090, endOffset: 173426233). Out of sync replicas: (brokerId: 0, endOffset: 173252090). (kafka.cluster.Partition)
[2020-12-01 15:38:22,238] INFO [Partition  topic-repay-plan-detail-201809112057-5 broker=2] Cached zkVersion 81 not equal to that in zookeeper,skip updating ISR (kafka.cluster.Partition)
[2020-12-01 15:38:22,239] INFO [Partition  topic-pay-flow-201810181631-1 broker=2] Shrinking ISR from 2,0 to 2. Leader: (highWatermark: 334799502, endOffset: 335281045). Out of sync replicas: (brokerId: 0, endOffset: 334799502). (kafka.cluster.Partition)
[2020-12-01 15:38:22,240] INFO [Partition  topic-pay-flow-201810181631-1 broker=2] Cached zkVersion 85 not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2020-12-01 15:38:22,240] INFO [Partition  topic-repay-plan-detail-201809112057-1 broker=2] Shrinking ISR from 2,0 to 2. Leader: (highWatermark: 302761557, endOffset: 302935719). Out of sync replicas: (brokerId: 0, endOffset: 302761557). (kafka.cluster.Partition)
[2020-12-01 15:38:22,242] INFO [Partition  topic-repay-plan-detail-201809112057-1 broker=2] Cached zkVersion 90 not equal to that in zookeeper,skip updating ISR (kafka.cluster.Partition)
{code}

> updateIsr should stop after failed several times due to zkVersion issue
> -----------------------------------------------------------------------
>
>                 Key: KAFKA-3042
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3042
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.10.0.0, 2.4.1
>         Environment: jdk 1.7
> centos 6.4
>            Reporter: Jiahongchao
>            Assignee: Dong Lin
>            Priority: Critical
>              Labels: reliability
>         Attachments: controller.log, server.log.2016-03-23-01, state-change.log
>
>
> Sometimes one broker may repeatedly log
> "Cached zkVersion 54 not equal to that in zookeeper, skip updating ISR"
> I think this is because the broker considers itself the leader when in fact it is a follower.
> So after several failed tries, it needs to find out who the leader is.
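
The behaviour requested in the description can be sketched roughly as follows. This is only an illustrative model, not Kafka's code: FakeZnode, PartitionView, update_isr and max_failures are hypothetical names, and ZooKeeper is replaced by an in-memory object with a version counter. It shows a broker whose cached version is stale giving up after a few failed conditional writes and rereading who the leader is, instead of logging "skip updating ISR" indefinitely.

```python
from dataclasses import dataclass


@dataclass
class FakeZnode:
    """In-memory stand-in for the ZooKeeper node holding partition state (assumption)."""
    leader: int
    isr: list
    version: int = 0

    def conditional_set(self, isr, expected_version):
        """Write the ISR only if the caller's version matches; return True on success."""
        if expected_version != self.version:
            return False
        self.isr = list(isr)
        self.version += 1
        return True


@dataclass
class PartitionView:
    """One broker's local, possibly stale, view of a partition (hypothetical)."""
    broker_id: int
    believes_leader: bool
    cached_version: int
    failures: int = 0

    def update_isr(self, znode, new_isr, max_failures=3):
        # A broker that knows it is a follower never tries to write the ISR.
        if not self.believes_leader:
            return "not-leader"
        if znode.conditional_set(new_isr, self.cached_version):
            self.cached_version = znode.version
            self.failures = 0
            return "updated"
        # Version mismatch: this is the "skip updating ISR" case from the logs.
        self.failures += 1
        if self.failures < max_failures:
            return "skipped"
        # Too many mismatches in a row: reread the stored state and
        # re-check leadership rather than retrying with the stale version.
        self.cached_version = znode.version
        self.believes_leader = (znode.leader == self.broker_id)
        self.failures = 0
        return "refreshed"


# Broker 2 still thinks it leads the partition, but the stored state says
# broker 0 is the leader and the version has moved on from 59 to 60.
znode = FakeZnode(leader=0, isr=[2, 0], version=60)
view = PartitionView(broker_id=2, believes_leader=True, cached_version=59)
results = [view.update_isr(znode, [2]) for _ in range(4)]
# results == ["skipped", "skipped", "refreshed", "not-leader"]
```

In this sketch the refresh step is a plain reread of the stored leader; how a real broker should learn the current leader (from the controller or from ZooKeeper) is left open here.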



--
This message was sent by Atlassian Jira
(v8.3.4#803005)