Posted to dev@kafka.apache.org by "Alexandre Vermeerbergen (JIRA)" <ji...@apache.org> on 2016/12/01 15:23:59 UTC

[jira] [Commented] (KAFKA-4443) Controller should send UpdateMetadataRequest prior to LeaderAndIsrRequest during failover

    [ https://issues.apache.org/jira/browse/KAFKA-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712243#comment-15712243 ] 

Alexandre Vermeerbergen commented on KAFKA-4443:
------------------------------------------------

Hello,

Could this fix be back-ported to Kafka 0.9.0.2 please or, better, provided as a patch for 0.9.0.1?
We have hit this issue repeatedly over the past few weeks with Kafka 0.9.0.1.

Best regards,
Alex


> Controller should send UpdateMetadataRequest prior to LeaderAndIsrRequest during failover
> -----------------------------------------------------------------------------------------
>
>                 Key: KAFKA-4443
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4443
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.10.1.0
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>              Labels: reliability
>             Fix For: 0.10.2.0
>
>
> Currently, in onControllerFailover(), the controller starts up replicaStateMachine and partitionStateMachine before invoking sendUpdateMetadataRequest(controllerContext.liveOrShuttingDownBrokerIds.toSeq). However, if a broker starts right after the controller election, every LeaderAndIsrRequest sent for follower partitions on this broker will be ignored because the broker does not know that the leaders are alive.
> To fix this problem, in onControllerFailover() the controller should send an UpdateMetadataRequest to the brokers after initializeControllerContext() but before it starts replicaStateMachine and partitionStateMachine. This first UpdateMetadataRequest will include the list of live brokers. It will not include partition leader information, but that is OK because we always send an UpdateMetadataRequest again when we send a LeaderAndIsrRequest during replicaStateMachine.startup() and partitionStateMachine.startup().
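
The ordering problem described in the issue can be illustrated with a toy model. This is only a sketch and not Kafka's controller code: ToyBroker, onUpdateMetadata, onLeaderAndIsr and partitionsFollowed are hypothetical names, and the rule "ignore a LeaderAndIsrRequest whose leader is not known to be alive" is a simplified reading of the behavior reported above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of the request ordering during controller failover.
// All names here are hypothetical; none of this is Kafka's actual API.
class FailoverOrderingSketch {

    // A freshly started broker that knows no live brokers yet.
    static class ToyBroker {
        final Set<Integer> knownLiveBrokers = new HashSet<>();
        final List<String> followedPartitions = new ArrayList<>();

        // Stand-in for handling an UpdateMetadataRequest: learn the live brokers.
        void onUpdateMetadata(Set<Integer> liveBrokerIds) {
            knownLiveBrokers.clear();
            knownLiveBrokers.addAll(liveBrokerIds);
        }

        // Stand-in for handling a LeaderAndIsrRequest for a follower partition.
        // Assumption: the request is ignored if the leader is not known to be alive.
        boolean onLeaderAndIsr(String partition, int leaderId) {
            if (!knownLiveBrokers.contains(leaderId)) {
                return false; // ignored
            }
            followedPartitions.add(partition);
            return true;
        }
    }

    // Simulates failover and returns how many partitions the broker started following.
    static int partitionsFollowed(boolean sendMetadataFirst) {
        ToyBroker broker = new ToyBroker();
        Set<Integer> liveBrokers = new HashSet<>(Arrays.asList(1, 2, 3));

        // Hypothetical follower partitions on this broker, mapped to their leader ids.
        Map<String, Integer> leaders = new LinkedHashMap<>();
        leaders.put("topic-0", 1);
        leaders.put("topic-1", 2);

        if (sendMetadataFirst) {
            broker.onUpdateMetadata(liveBrokers); // proposed ordering
        }
        for (Map.Entry<String, Integer> e : leaders.entrySet()) {
            broker.onLeaderAndIsr(e.getKey(), e.getValue());
        }
        if (!sendMetadataFirst) {
            broker.onUpdateMetadata(liveBrokers); // current ordering: arrives too late
        }
        return broker.followedPartitions.size();
    }

    public static void main(String[] args) {
        System.out.println("current ordering:  " + partitionsFollowed(false)); // 0
        System.out.println("proposed ordering: " + partitionsFollowed(true));  // 2
    }
}
```

In this model the broker follows no partitions under the current ordering and both partitions under the proposed one, which is the effect the fix is aiming for.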



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)