Posted to dev@kafka.apache.org by "Jun Rao (JIRA)" <ji...@apache.org> on 2012/09/24 18:37:07 UTC

[jira] [Created] (KAFKA-525) newly created partitions are not added to ReplicaStateMachine

Jun Rao created KAFKA-525:
-----------------------------

             Summary: newly created partitions are not added to ReplicaStateMachine
                 Key: KAFKA-525
                 URL: https://issues.apache.org/jira/browse/KAFKA-525
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 0.8
            Reporter: Jun Rao
            Priority: Blocker


Saw the following error in a system test run. It seems that we never add the replicas of a newly created topic to ReplicaStateMachine, so the lookup for replica 2 of [test_1, 0] fails when broker 2 goes down.

[2012-09-23 14:34:46,707] INFO [Controller 1], Broker failure callback for 2 (kafka.controller.KafkaController)
[2012-09-23 14:34:46,724] INFO [Partition state machine on Controller 1]: Invoking state change to OfflinePartition for partitions (test_1,0) (kafka.controller.PartitionStateMachine)
[2012-09-23 14:34:46,725] INFO [Partition state machine on Controller 1]: Partition [test_1, 0] state changed from Online to Offline (kafka.controller.PartitionStateMachine)
[2012-09-23 14:34:46,725] INFO [Partition state machine on Controller 1]: Electing leader for Offline partition [test_1, 0] (kafka.controller.PartitionStateMachine)
[2012-09-23 14:34:46,735] INFO [Partition state machine on Controller 1]: New leader and ISR for partition [test_1, 0] is { "ISR": "3,1","leader": "3","leaderEpoch": "1" } (kafka.controller.PartitionStateMachine)
[2012-09-23 14:34:46,783] INFO Conditional update the zkPath /brokers/topics/test_1/partitions/0/leaderAndISR with expected version 0 succeed and return the new version: 1 (kafka.utils.ZkUtils$)
[2012-09-23 14:34:46,783] INFO [Partition state machine on Controller 1]: Elected leader 3 for Offline partition [test_1, 0] (kafka.controller.PartitionStateMachine)
[2012-09-23 14:34:46,784] INFO [Partition state machine on Controller 1]: Partition [test_1, 0] state changed from OfflinePartition to Online with leader 3 (kafka.controller.PartitionStateMachine)
[2012-09-23 14:34:46,785] INFO The leaderAndIsr request sent to broker 1 is LeaderAndIsrRequest(1,,1000,Map((test_1,0) -> { "ISR": "3,1","leader": "3","leaderEpoch": "1" })) (kafka.controller.ControllerBrokerRequestBatch)
[2012-09-23 14:34:46,785] INFO The leaderAndIsr request sent to broker 3 is LeaderAndIsrRequest(1,,1000,Map((test_1,0) -> { "ISR": "3,1","leader": "3","leaderEpoch": "1" })) (kafka.controller.ControllerBrokerRequestBatch)
[2012-09-23 14:34:46,786] INFO Replica Manager on Broker 1: Handling leader and isr request LeaderAndIsrRequest(1,,1000,Map((test_1,0) -> { "ISR": "3,1","leader": "3","leaderEpoch": "1" })) (kafka.server.ReplicaManager)
[2012-09-23 14:34:46,786] INFO [Replica state machine on Controller 1]: Invoking state change to OfflineReplica for brokers 2 (kafka.controller.ReplicaStateMachine)
[2012-09-23 14:34:46,786] INFO Replica Manager on Broker 1: Starting the follower state transition to follow leader 3 for topic test_1 partition 0 (kafka.server.ReplicaManager)
[2012-09-23 14:34:46,786] INFO Partition [test_1, 0] on broker 1, Starting the follower state transition to follow leader 3 for topic test_1 partition 0 (kafka.cluster.Partition)
[2012-09-23 14:34:46,788] INFO [ReplicaFetcherManager on broker 1, ], removing fetcher on topic test_1, partition 0 (kafka.server.ReplicaFetcherManager)
[2012-09-23 14:34:46,788] INFO [ReplicaFetcherThread-2-0-on-broker-1], Shutting down (kafka.server.ReplicaFetcherThread)
[2012-09-23 14:34:46,789] ERROR [Replica state machine on Controller 1]: Error while changing state of replica 2 for partition [test_1, 0] to OfflineReplica (kafka.controller.ReplicaStateMachine)
java.util.NoSuchElementException: key not found: (test_1,0,2)
        at scala.collection.MapLike$class.default(MapLike.scala:223)
        at scala.collection.mutable.HashMap.default(HashMap.scala:39)
        at scala.collection.MapLike$class.apply(MapLike.scala:134)
        at scala.collection.mutable.HashMap.apply(HashMap.scala:39)
        at kafka.controller.ReplicaStateMachine.assertValidPreviousStates(ReplicaStateMachine.scala:162)
        at kafka.controller.ReplicaStateMachine.kafka$controller$ReplicaStateMachine$$handleStateChange(ReplicaStateMachine.scala:124)
        at kafka.controller.ReplicaStateMachine$$anonfun$handleStateChanges$1$$anonfun$apply$mcVI$sp$1.apply(ReplicaStateMachine.scala:84)
        at kafka.controller.ReplicaStateMachine$$anonfun$handleStateChanges$1$$anonfun$apply$mcVI$sp$1.apply(ReplicaStateMachine.scala:83)
        at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
        at scala.collection.immutable.List.foreach(List.scala:45)
        at kafka.controller.ReplicaStateMachine$$anonfun$handleStateChanges$1.apply$mcVI$sp(ReplicaStateMachine.scala:83)
        at kafka.controller.ReplicaStateMachine$$anonfun$handleStateChanges$1.apply(ReplicaStateMachine.scala:79)
        at kafka.controller.ReplicaStateMachine$$anonfun$handleStateChanges$1.apply(ReplicaStateMachine.scala:79)
        at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
        at scala.collection.immutable.List.foreach(List.scala:45)
        at kafka.controller.ReplicaStateMachine.handleStateChanges(ReplicaStateMachine.scala:79)
        at kafka.controller.KafkaController.onBrokerFailure(KafkaController.scala:124)
        at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.liftedTree1$1(ReplicaStateMachine.scala:217)
        at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply$mcV$sp(ReplicaStateMachine.scala:203)
        at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:199)
        at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:199)
        at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
        at kafka.controller.ReplicaStateMachine$BrokerChangeListener.handleChildChange(ReplicaStateMachine.scala:199)
        at org.I0Itec.zkclient.ZkClient$7.run(ZkClient.java:568)
        at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
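
The top frames of the trace (HashMap.apply called from assertValidPreviousStates) are consistent with a state lookup on a key that was never inserted. Below is a minimal sketch of that failure mode, not the actual Kafka code: ReplicaStateSketch, ReplicaStates, register, markOfflineUnchecked and markOfflineChecked are hypothetical names made up for illustration, and the only assumption carried over from the log is that the state map is keyed by (topic, partition, replicaId).

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for the controller's replica state tracking.
// None of these names are taken from the Kafka source.
object ReplicaStateSketch {
  sealed trait ReplicaState
  case object NewReplica extends ReplicaState
  case object OfflineReplica extends ReplicaState

  // (topic, partition, replicaId), the same shape as the key in the exception message.
  type ReplicaKey = (String, Int, Int)

  class ReplicaStates {
    private val states = mutable.HashMap.empty[ReplicaKey, ReplicaState]

    // The step the report suspects is missing for newly created topics.
    def register(key: ReplicaKey): Unit = states(key) = NewReplica

    // Unchecked lookup: HashMap.apply throws NoSuchElementException
    // when the key was never registered.
    def markOfflineUnchecked(key: ReplicaKey): Unit = {
      states(key) // throws if the replica is unknown
      states(key) = OfflineReplica
    }

    // Defensive variant: reports an unknown replica instead of throwing.
    def markOfflineChecked(key: ReplicaKey): Boolean =
      states.get(key) match {
        case Some(_) =>
          states(key) = OfflineReplica
          true
        case None =>
          false
      }

    def stateOf(key: ReplicaKey): Option[ReplicaState] = states.get(key)
  }
}
```

In this sketch, calling markOfflineUnchecked(("test_1", 0, 2)) on a fresh ReplicaStates throws, while calling register first lets the transition succeed. The checked variant only shows one way to surface the missing registration without an exception; it is not a claim about how the issue was actually fixed.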

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (KAFKA-525) newly created partitions are not added to ReplicaStateMachine

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao resolved KAFKA-525.
---------------------------

       Resolution: Duplicate
    Fix Version/s: 0.8
    
> newly created partitions are not added to ReplicaStateMachine
> -------------------------------------------------------------
>
>                 Key: KAFKA-525
>                 URL: https://issues.apache.org/jira/browse/KAFKA-525
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: Jun Rao
>            Priority: Blocker
>              Labels: bugs
>             Fix For: 0.8
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h


[jira] [Closed] (KAFKA-525) newly created partitions are not added to ReplicaStateMachine

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao closed KAFKA-525.
-------------------------


Fixed in KAFKA-42.
                
