Posted to dev@kafka.apache.org by "Dong Lin (JIRA)" <ji...@apache.org> on 2016/11/30 02:55:59 UTC

[jira] [Comment Edited] (KAFKA-4443) Controller should send UpdateMetadataRequest prior to LeaderAndIsrRequest during failover

    [ https://issues.apache.org/jira/browse/KAFKA-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707330#comment-15707330 ] 

Dong Lin edited comment on KAFKA-4443 at 11/30/16 2:55 AM:
-----------------------------------------------------------

[~junrao] Sure. I just updated the description to correct the typo. What I mean is that, if a broker starts right after controller election, the LeaderAndIsrRequest will be ignored because the broker doesn't have the needed information (e.g. port) of live brokers.
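
The failure mode can be illustrated with a toy model of the broker side. This is a simplified sketch, not Kafka's code: `Broker`, `alive_brokers` and the handler names are hypothetical, and the only assumption is that a broker cannot become a follower of a leader whose endpoint it does not know.

```python
class Broker:
    """Toy broker: tracks known live brokers and the leaders it follows."""

    def __init__(self, broker_id):
        self.broker_id = broker_id
        self.alive_brokers = {}  # broker id -> (host, port), filled by UpdateMetadataRequest
        self.following = {}      # partition -> leader id

    def handle_update_metadata(self, live_brokers):
        self.alive_brokers = dict(live_brokers)

    def handle_leader_and_isr(self, partition, leader_id):
        # Without the leader's endpoint the broker cannot start fetching,
        # so in this model the request is effectively ignored.
        if leader_id not in self.alive_brokers:
            return False
        self.following[partition] = leader_id
        return True


fresh = Broker(2)
ignored = fresh.handle_leader_and_isr("t-0", leader_id=1)   # arrives first: dropped
fresh.handle_update_metadata({1: ("host1", 9092)})
accepted = fresh.handle_leader_and_isr("t-0", leader_id=1)  # arrives after metadata: applied
```

In this model the same request succeeds or fails depending only on whether the live-broker information arrived first.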

As for (2), I think this is probably the same issue reported in KAFKA-3042. All phenomena described in KAFKA-3042 can be caused by the bug fixed in this JIRA. In fact, seven months ago you described exactly the fix applied in this JIRA, i.e. "... to fix this particular issue, the simplest approach is to send UpdateMetadataRequest first during controller failover".

Given the current design of the controller, I prefer the solution where the controller sends UpdateMetadataRequest without LeaderAndIsrRequest. The broker will handle UpdateMetadataRequest in the following steps: 1) update the cache with live broker info extracted from UpdateMetadataRequest, 2) reconstruct LeaderAndIsrRequest from UpdateMetadataRequest and process it, and 3) update the cache with partition information extracted from UpdateMetadataRequest. This solution is simple and doesn't require a wire protocol change. It is also strictly better than the current implementation because we no longer have to send UpdateMetadataRequest before LeaderAndIsrRequest.
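
The three steps can be sketched as follows. This is a toy model under stated assumptions, not Kafka's implementation: the request is a plain dict with hypothetical keys `live_brokers` and `partitions`, each partition entry is assumed to carry `leader`, `isr` and `replicas`, and `apply_leader_and_isr` stands in for the broker's existing LeaderAndIsrRequest handling.

```python
def apply_leader_and_isr(state, partition_states):
    """Stand-in for the existing LeaderAndIsrRequest handling (hypothetical)."""
    for partition, info in partition_states.items():
        leader = info["leader"]
        if leader == state["broker_id"]:
            state["roles"][partition] = "leader"
        elif leader in state["alive_brokers"]:
            state["roles"][partition] = "follower"
        # Otherwise the leader's endpoint is unknown and the partition is skipped.


def handle_update_metadata(state, request):
    # 1) Update the cache with live broker info first.
    state["alive_brokers"] = dict(request["live_brokers"])
    # 2) Reconstruct the leader/ISR state for partitions hosted on this broker
    #    and process it as if a LeaderAndIsrRequest had arrived.
    hosted = {
        partition: info
        for partition, info in request["partitions"].items()
        if state["broker_id"] in info["replicas"]
    }
    apply_leader_and_isr(state, hosted)
    # 3) Update the cache with partition information.
    state["partition_cache"] = dict(request["partitions"])


state = {"broker_id": 2, "alive_brokers": {}, "roles": {}, "partition_cache": {}}
request = {
    "live_brokers": {1: ("host1", 9092), 2: ("host2", 9092), 3: ("host3", 9092)},
    "partitions": {
        "t-0": {"leader": 1, "isr": [1, 2], "replicas": [1, 2]},
        "t-1": {"leader": 2, "isr": [2, 1], "replicas": [2, 1]},
        "t-2": {"leader": 1, "isr": [1, 3], "replicas": [1, 3]},
    },
}
handle_update_metadata(state, request)
```

Because step 1 runs before step 2 inside a single handler, the ordering between two separate requests no longer matters in this model.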

But I am not 100% sure this is a long-term solution because it relies on an existing implementation detail: the controller always sends UpdateMetadataRequest after LeaderAndIsrRequest. In theory this may no longer hold if the controller is redesigned. For example, we may want to send UpdateMetadataRequest only after the controller has received a successful LeaderAndIsrResponse. The idea is to expose new external state to users only after the internal state change is completed.

If we don't adopt the solution above, which uses UpdateMetadataRequest as a combination of LeaderAndIsrRequest + UpdateMetadataRequest, then I think we should include the endpoints of all leaders in the LeaderAndIsrRequest so that it provides enough information on its own to switch a broker between leader and follower.
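
A minimal sketch of what such a self-contained request could look like. The class and field names here are hypothetical and do not describe Kafka's wire format; the point is only that the follower can resolve its leader's endpoint from the request itself, without consulting a metadata cache.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Endpoint:
    host: str
    port: int


@dataclass
class SelfContainedLeaderAndIsr:
    """Hypothetical request that carries the endpoint of every leader it names."""
    partition_leaders: dict                               # partition -> leader broker id
    leader_endpoints: dict = field(default_factory=dict)  # leader broker id -> Endpoint

    def missing_endpoints(self):
        """Leader ids named by some partition but lacking an endpoint."""
        return sorted(set(self.partition_leaders.values()) - set(self.leader_endpoints))


def become_follower(request, partition):
    """Resolve the leader's endpoint from the request alone."""
    leader_id = request.partition_leaders[partition]
    return request.leader_endpoints[leader_id]


req = SelfContainedLeaderAndIsr(
    partition_leaders={"t-0": 1, "t-1": 3},
    leader_endpoints={1: Endpoint("host1", 9092), 3: Endpoint("host3", 9092)},
)
target = become_follower(req, "t-0")
```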


> Controller should send UpdateMetadataRequest prior to LeaderAndIsrRequest during failover
> -----------------------------------------------------------------------------------------
>
>                 Key: KAFKA-4443
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4443
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.10.1.0
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>              Labels: reliability
>             Fix For: 0.10.1.1
>
>
> Currently in onControllerFailover(), the controller starts up replicaStateMachine and partitionStateMachine before invoking sendUpdateMetadataRequest(controllerContext.liveOrShuttingDownBrokerIds.toSeq). However, if a broker starts right after controller election, the LeaderAndIsrRequest sent to follower partitions on this broker will all be ignored because the broker doesn't know the leaders are alive.
> To fix this problem, in onControllerFailover(), the controller should send UpdateMetadataRequest to brokers after initializeControllerContext() but before it starts replicaStateMachine and partitionStateMachine. The first UpdateMetadataRequest will include the list of live brokers. Although it will not include partition leader information, that is OK because we always send UpdateMetadataRequest again when we send LeaderAndIsrRequest during replicaStateMachine.startup() and partitionStateMachine.startup().
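
The proposed ordering can be sketched as a toy failover sequence. This is an illustration only, not the controller's code: `on_controller_failover` and the `send` callback are hypothetical, and the state machine startup is reduced to the requests it is described as sending.

```python
def on_controller_failover(live_broker_ids, partitions, send):
    """Toy failover sequence; `send` receives (request_type, broker_id, payload)."""
    # initializeControllerContext() would populate live_broker_ids and partitions here.

    # Proposed fix: announce the live brokers before any state machine starts.
    for broker_id in live_broker_ids:
        send("UpdateMetadataRequest", broker_id, {"live_brokers": list(live_broker_ids)})

    # State machine startup: LeaderAndIsrRequest, followed by an UpdateMetadataRequest
    # that also carries partition leader information.
    for broker_id in live_broker_ids:
        send("LeaderAndIsrRequest", broker_id, {"partitions": dict(partitions)})
    for broker_id in live_broker_ids:
        send(
            "UpdateMetadataRequest",
            broker_id,
            {"live_brokers": list(live_broker_ids), "partitions": dict(partitions)},
        )


sent = []
on_controller_failover(
    [1, 2],
    {"t-0": 1},
    lambda kind, broker_id, payload: sent.append((kind, broker_id)),
)
order_for_broker_2 = [kind for kind, broker_id in sent if broker_id == 2]
```

Every broker in this model sees an UpdateMetadataRequest with the live broker list before its first LeaderAndIsrRequest.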


