You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by "Neha Narkhede (JIRA)" <ji...@apache.org> on 2013/03/06 18:42:13 UTC

[jira] [Resolved] (KAFKA-513) Add state change log to Kafka brokers

     [ https://issues.apache.org/jira/browse/KAFKA-513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neha Narkhede resolved KAFKA-513.
---------------------------------

    Resolution: Fixed

Thanks for patch v5 Swapnil, the state change log looks great! I checked in that patch after the following minor changes -

1. Partition
Fixed the error trace in makeFollower to include correlationId, controllerId and controllerEpoch

2. PartitionStateMachine
2.1 In initializeLeaderAndIsrForPartition(),
Changed NEW -> New
Changed ONLINE -> Online

2.2 In electLeaderForPartition(),
Removed the duplicate "Controller %d epoch %d"

2.3 Removed the error from getLeaderIsrAndEpochOrThrowException() in state change log since it is already logged in the catch block of electLeaderForPartition()

2.4 Changed trace() to error() wherever required

3. ReplicaStateMachine
Included an error statement in the state change log
                
> Add state change log to Kafka brokers
> -------------------------------------
>
>                 Key: KAFKA-513
>                 URL: https://issues.apache.org/jira/browse/KAFKA-513
>             Project: Kafka
>          Issue Type: Sub-task
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Swapnil Ghike
>            Priority: Blocker
>              Labels: p1, replication, tools
>             Fix For: 0.8
>
>         Attachments: kafka-513-v1.patch, kafka-513-v2.patch, kafka-513-v3.patch, kafka-513-v4.patch, kafka-513-v5-corrected.patch, kafka-513-v5.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Once KAFKA-499 is checked in, every controller to broker communication can be modelled as a state change for one or more partitions. Every state change request will carry the controller epoch. If there is a problem with the state of some partitions, it will be good to have a tool that can create a timeline of requested and completed state changes. This will require each broker to output a state change log that has entries like
> [2012-09-10 10:06:17,280] broker 1 received request LeaderAndIsr() for partition [foo, 0] from controller 2, epoch 1
> [2012-09-10 10:06:17,350] broker 1 completed request LeaderAndIsr() for partition [foo, 0] from controller 2, epoch 1
> On controller, this will look like -
> [2012-09-10 10:06:17,198] controller 2, epoch 1, initiated state change request LeaderAndIsr() for partition [foo, 0]
> We need a tool that can collect the state change log from all brokers and create a per-partition timeline of state changes -
> [foo, 0]
> [2012-09-10 10:06:17,198] controller 2, epoch 1 initiated state change request LeaderAndIsr() 
> [2012-09-10 10:06:17,280] broker 1 received request LeaderAndIsr() from controller 2, epoch 1
> [2012-09-10 10:06:17,350] broker 1 completed request LeaderAndIsr() from controller 2, epoch 1
> This JIRA involves adding the state change log to each broker and adding the tool to create the timeline 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira