You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by "John Fung (JIRA)" <ji...@apache.org> on 2012/08/13 19:28:38 UTC

[jira] [Created] (KAFKA-457) Leader Re-election is broken in rev. 1368092

John Fung created KAFKA-457:
-------------------------------

             Summary: Leader Re-election is broken in rev. 1368092
                 Key: KAFKA-457
                 URL: https://issues.apache.org/jira/browse/KAFKA-457
             Project: Kafka
          Issue Type: Bug
            Reporter: John Fung
            Assignee: Yang Ye
            Priority: Blocker


Steps to reproduce:

1. Check out rev. 1368092 and execute the command: $ ./sbt update package

2. Replace <kafka_home>/system_test/single_host_multi_brokers/bin/run-test.sh with the attached script because the logging message related to "shut down completed" and "leader state transition" has been modified in this revision.

3. Execute the command bin/run-test.sh

4. The test seems to be hung.

5. The reason is that there is no leader re-election happening after the first leader is terminated. By checking the logs, there may be only 1 "completed the leader state transition" log message found in the server logs:

$ grep "completed the leader state transition" *
kafka_server_3.log:[2012-08-13 10:07:58,085] INFO Replica Manager on Broker 3, completed the leader state transition for topic mytest partition 0 (kafka.server.ReplicaManager)

6. The correct behavior as in rev 1367821, re-election log messages are found after leader termination as follows:
$ grep "completed the leader state transition" *
kafka_server_1.log:[2012-08-13 09:40:23,542] INFO Broker 1 completed the leader state transition for topic mytest partition 0 (kafka.server.ReplicaManager)
kafka_server_1.log:[2012-08-13 09:42:17,881] INFO Broker 1 completed the leader state transition for topic mytest partition 0 (kafka.server.ReplicaManager)
kafka_server_2.log:[2012-08-13 09:40:29,082] INFO Broker 2 completed the leader state transition for topic mytest partition 0 (kafka.server.ReplicaManager)
kafka_server_3.log:[2012-08-13 09:44:06,695] INFO Broker 3 completed the leader state transition for topic mytest part



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (KAFKA-457) Leader Re-election is broken in rev. 1368092

Posted by "Yang Ye (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434714#comment-13434714 ] 

Yang Ye commented on KAFKA-457:
-------------------------------

The behaviour of the code is changed after kafka-369 is checked in, and the system tests all work well. So this jira should be marked as resolved.
                
> Leader Re-election is broken in rev. 1368092
> --------------------------------------------
>
>                 Key: KAFKA-457
>                 URL: https://issues.apache.org/jira/browse/KAFKA-457
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: John Fung
>            Assignee: Yang Ye
>            Priority: Blocker
>         Attachments: run-test.sh
>
>
> Steps to reproduce:
> 1. Check out rev. 1368092 and execute the command: $ ./sbt update package
> 2. Replace <kafka_home>/system_test/single_host_multi_brokers/bin/run-test.sh with the attached script because the logging message related to "shut down completed" and "leader state transition" has been modified in this revision.
> 3. Execute the command bin/run-test.sh
> 4. The test seems to be hung.
> 5. The reason is that there is no leader re-election happening after the first leader is terminated. By checking the logs, there may be only 1 "completed the leader state transition" log message found in the server logs:
> $ grep "completed the leader state transition" *
> kafka_server_3.log:[2012-08-13 10:07:58,085] INFO Replica Manager on Broker 3, completed the leader state transition for topic mytest partition 0 (kafka.server.ReplicaManager)
> 6. The correct behavior as in rev 1367821, re-election log messages are found after leader termination as follows:
> $ grep "completed the leader state transition" *
> kafka_server_1.log:[2012-08-13 09:40:23,542] INFO Broker 1 completed the leader state transition for topic mytest partition 0 (kafka.server.ReplicaManager)
> kafka_server_1.log:[2012-08-13 09:42:17,881] INFO Broker 1 completed the leader state transition for topic mytest partition 0 (kafka.server.ReplicaManager)
> kafka_server_2.log:[2012-08-13 09:40:29,082] INFO Broker 2 completed the leader state transition for topic mytest partition 0 (kafka.server.ReplicaManager)
> kafka_server_3.log:[2012-08-13 09:44:06,695] INFO Broker 3 completed the leader state transition for topic mytest part

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (KAFKA-457) Leader Re-election is broken in rev. 1368092

Posted by "John Fung (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung updated KAFKA-457:
----------------------------

    Attachment: run-test.sh

This script "run-test.sh" contains the following changes
1. KAFKA-380-v5.patch
2. Updated log message patterns for looking up leader election in revision 1368092
                
> Leader Re-election is broken in rev. 1368092
> --------------------------------------------
>
>                 Key: KAFKA-457
>                 URL: https://issues.apache.org/jira/browse/KAFKA-457
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: John Fung
>            Assignee: Yang Ye
>            Priority: Blocker
>         Attachments: run-test.sh
>
>
> Steps to reproduce:
> 1. Check out rev. 1368092 and execute the command: $ ./sbt update package
> 2. Replace <kafka_home>/system_test/single_host_multi_brokers/bin/run-test.sh with the attached script because the logging message related to "shut down completed" and "leader state transition" has been modified in this revision.
> 3. Execute the command bin/run-test.sh
> 4. The test seems to be hung.
> 5. The reason is that there is no leader re-election happening after the first leader is terminated. By checking the logs, there may be only 1 "completed the leader state transition" log message found in the server logs:
> $ grep "completed the leader state transition" *
> kafka_server_3.log:[2012-08-13 10:07:58,085] INFO Replica Manager on Broker 3, completed the leader state transition for topic mytest partition 0 (kafka.server.ReplicaManager)
> 6. The correct behavior as in rev 1367821, re-election log messages are found after leader termination as follows:
> $ grep "completed the leader state transition" *
> kafka_server_1.log:[2012-08-13 09:40:23,542] INFO Broker 1 completed the leader state transition for topic mytest partition 0 (kafka.server.ReplicaManager)
> kafka_server_1.log:[2012-08-13 09:42:17,881] INFO Broker 1 completed the leader state transition for topic mytest partition 0 (kafka.server.ReplicaManager)
> kafka_server_2.log:[2012-08-13 09:40:29,082] INFO Broker 2 completed the leader state transition for topic mytest partition 0 (kafka.server.ReplicaManager)
> kafka_server_3.log:[2012-08-13 09:44:06,695] INFO Broker 3 completed the leader state transition for topic mytest part

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (KAFKA-457) Leader Re-election is broken in rev. 1368092

Posted by "John Fung (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung resolved KAFKA-457.
-----------------------------

    Resolution: Fixed

Hi Victor,

Thank you for your fix. I have tested rev. 1373633 for system_test/single_host_multi_brokers/run-test.sh and it is working correctly.

Regards,
John
                
> Leader Re-election is broken in rev. 1368092
> --------------------------------------------
>
>                 Key: KAFKA-457
>                 URL: https://issues.apache.org/jira/browse/KAFKA-457
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: John Fung
>            Assignee: Yang Ye
>            Priority: Blocker
>         Attachments: run-test.sh
>
>
> Steps to reproduce:
> 1. Check out rev. 1368092 and execute the command: $ ./sbt update package
> 2. Replace <kafka_home>/system_test/single_host_multi_brokers/bin/run-test.sh with the attached script because the logging message related to "shut down completed" and "leader state transition" has been modified in this revision.
> 3. Execute the command bin/run-test.sh
> 4. The test seems to be hung.
> 5. The reason is that there is no leader re-election happening after the first leader is terminated. By checking the logs, there may be only 1 "completed the leader state transition" log message found in the server logs:
> $ grep "completed the leader state transition" *
> kafka_server_3.log:[2012-08-13 10:07:58,085] INFO Replica Manager on Broker 3, completed the leader state transition for topic mytest partition 0 (kafka.server.ReplicaManager)
> 6. The correct behavior as in rev 1367821, re-election log messages are found after leader termination as follows:
> $ grep "completed the leader state transition" *
> kafka_server_1.log:[2012-08-13 09:40:23,542] INFO Broker 1 completed the leader state transition for topic mytest partition 0 (kafka.server.ReplicaManager)
> kafka_server_1.log:[2012-08-13 09:42:17,881] INFO Broker 1 completed the leader state transition for topic mytest partition 0 (kafka.server.ReplicaManager)
> kafka_server_2.log:[2012-08-13 09:40:29,082] INFO Broker 2 completed the leader state transition for topic mytest partition 0 (kafka.server.ReplicaManager)
> kafka_server_3.log:[2012-08-13 09:44:06,695] INFO Broker 3 completed the leader state transition for topic mytest part

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira