Posted to dev@kafka.apache.org by "Dong Lin (JIRA)" <ji...@apache.org> on 2016/11/25 04:03:59 UTC

[jira] [Created] (KAFKA-4442) Controller should grab lock when it is being initialized to avoid race condition

Dong Lin created KAFKA-4442:
-------------------------------

             Summary: Controller should grab lock when it is being initialized to avoid race condition
                 Key: KAFKA-4442
                 URL: https://issues.apache.org/jira/browse/KAFKA-4442
             Project: Kafka
          Issue Type: Bug
            Reporter: Dong Lin
            Assignee: Dong Lin


Currently the controller registers the broker change listener before sending LeaderAndIsrRequests to live replicas. The call path looks like this:

- onControllerFailover()
  - partitionStateMachine.startup()
    - triggerOnlinePartitionStateChange()
      - handleStateChange(partition, OnlinePartition)
        - electLeaderForPartition(partition)
          - determine live replicas for this partition (step a)
          - add partition to controllerContext.partitionLeadershipInfo (step b)
          - send LeaderAndIsrRequest to those live replicas for this partition

However, if a broker registers itself in zookeeper between step (a) and step (b), onBrokerStartup() will not send LeaderAndIsrRequest to this broker for this partition, because the partition is not yet found in controllerContext.partitionLeadershipInfo. onControllerFailover() will not send LeaderAndIsrRequest to this broker for this partition either, because the broker was not considered live in step (a). As a result, the broker never receives LeaderAndIsrRequest for this partition from either code path.

The root cause is that onBrokerStartup() can currently run before the controller has finished onControllerFailover() and initialized its state, whereas it should only be executed afterwards. Therefore the controller should hold the lock controllerContext.controllerLock for the duration of onControllerFailover().
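
The race and the proposed fix can be illustrated with a small, self-contained Java sketch. It is not the actual controller code (which is written in Scala): the class ControllerInitSketch, its fields, the lockDuringFailover flag and the brokerNotified() helper are all hypothetical names chosen for illustration, and the model assumes that every broker hosts a replica of the single partition. The listener thread is started between step (a) and step (b) to force the interleaving described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified model of the controller; not Kafka's real classes.
class ControllerInitSketch {
    private final ReentrantLock controllerLock = new ReentrantLock();
    private final Set<Integer> liveBrokers = ConcurrentHashMap.newKeySet();
    private final Map<String, List<Integer>> partitionLeadershipInfo = new ConcurrentHashMap<>();
    private final List<String> sentRequests = new CopyOnWriteArrayList<>();
    private final boolean lockDuringFailover;

    ControllerInitSketch(boolean lockDuringFailover, int initialBroker) {
        this.lockDuringFailover = lockDuringFailover;
        liveBrokers.add(initialBroker);
    }

    // Broker change listener callback; always runs under controllerLock.
    void onBrokerStartup(int brokerId) {
        controllerLock.lock();
        try {
            liveBrokers.add(brokerId);
            // Only partitions already recorded in partitionLeadershipInfo are sent.
            for (String partition : partitionLeadershipInfo.keySet()) {
                sentRequests.add(partition + "->" + brokerId);
            }
        } finally {
            controllerLock.unlock();
        }
    }

    // Controller initialization; joiningBroker registers between step (a) and step (b).
    void onControllerFailover(String partition, int joiningBroker) throws InterruptedException {
        Thread listener = new Thread(() -> onBrokerStartup(joiningBroker));
        if (lockDuringFailover) controllerLock.lock();
        try {
            List<Integer> live = new ArrayList<>(liveBrokers);   // step (a)
            listener.start();
            if (lockDuringFailover) {
                // Proposed fix: the listener blocks on the lock until failover is done.
                while (!controllerLock.hasQueuedThread(listener)) Thread.yield();
            } else {
                // Current behavior: the listener runs to completion before step (b).
                listener.join();
            }
            partitionLeadershipInfo.put(partition, live);        // step (b)
            for (int broker : live) sentRequests.add(partition + "->" + broker);
        } finally {
            if (lockDuringFailover) controllerLock.unlock();
        }
        listener.join();
    }

    boolean receivedRequest(String partition, int brokerId) {
        return sentRequests.contains(partition + "->" + brokerId);
    }

    // Runs the scenario: broker 1 is live, broker 2 joins during failover.
    static boolean brokerNotified(boolean lockDuringFailover) {
        ControllerInitSketch controller = new ControllerInitSketch(lockDuringFailover, 1);
        try {
            controller.onControllerFailover("topic-0", 2);
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return controller.receivedRequest("topic-0", 2);
    }

    public static void main(String[] args) {
        System.out.println("without lock, broker 2 notified: " + brokerNotified(false));
        System.out.println("with lock, broker 2 notified: " + brokerNotified(true));
    }
}
```

In this model, the late broker is skipped when failover runs without the lock, and it receives its request once failover holds the lock, because the listener can then only run after step (b) has populated partitionLeadershipInfo.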






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)