You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by "John Fung (JIRA)" <ji...@apache.org> on 2012/10/11 19:09:03 UTC

[jira] [Created] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

John Fung created KAFKA-567:
-------------------------------

             Summary: Replication Data Loss in Mirror Maker Bouncing testcase
                 Key: KAFKA-567
                 URL: https://issues.apache.org/jira/browse/KAFKA-567
             Project: Kafka
          Issue Type: Bug
            Reporter: John Fung
         Attachments: kafka-mirror-maker-bouncing-data-loss.patch

* To reproduce this issue, please do the followings:

1. Download the latest 0.8 branch
2. Apply the patch attached to this JIRA
3. Build kafka by running "./sbt update package"
4. Execute the test in directory "system_test" : "python -B system_test_runner.py"

* The test result may look like the following:

_test_case_name  :  testcase_5002
_test_class_name  :  MirrorMakerTest
arg : bounce_leader  :  false
arg : bounce_mirror_maker  :  true
arg : message_producing_free_time_sec  :  15
arg : num_iteration  :  1
arg : num_messages_to_produce_per_producer_call  :  50
arg : num_partition  :  1
arg : replica_factor  :  3
arg : sleep_seconds_between_producer_calls  :  1
validation_status  : 
     Log segment checksum matching across all replicas  :  FAILED
     Unique messages from consumer on [test_1]  :  355
     Unique messages from producer on [test_1]  :  400
     Validate for data matched on topic [test_1]  :  FAILED

* Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.

* There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "John Fung (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung updated KAFKA-567:
----------------------------

    Attachment: mirror_maker_12.log
    
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: John Fung
>         Attachments: kafka-mirror-maker-bouncing-data-loss.patch, mirror_maker_12.log, system_test_1349971807_testcase_5002.tar
>
>
> * Test Description: 
> 1. Start a 3-broker cluster as source
> 2. Start a 3-broker cluster as target
> 3. Start 1 instance of Mirror Maker to replicate data from source to target
> 4. While producer is sending data into source cluster, stop Mirror Maker with "kill -15". Start Mirror Maker again after 1 second.
> 5. Start a consumer to consume data from target cluster.
> 6. Compare the MessageID in the data between producer log and consumer log.
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "John Fung (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung updated KAFKA-567:
----------------------------

    Attachment: system_test_1349971807_testcase_5002.tar
    
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: John Fung
>         Attachments: kafka-mirror-maker-bouncing-data-loss.patch, system_test_1349971807_testcase_5002.tar
>
>
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy updated KAFKA-567:
-----------------------------

    Attachment: KAFKA-567-mirrormakershutdown-v1.patch
    
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: John Fung
>            Assignee: Jun Rao
>            Priority: Blocker
>              Labels: bugs
>         Attachments: KAFKA-567-mirrormakershutdown-v1.patch, kafka-567.patch, kafka-mirror-maker-bouncing-data-loss.patch, mirror_maker_12.log, system_test_1349971807_testcase_5002.tar
>
>
> * Test Description: 
> 1. Start a 3-broker cluster as source
> 2. Start a 3-broker cluster as target
> 3. Start 1 instance of Mirror Maker to replicate data from source to target
> 4. While producer is sending data into source cluster, stop Mirror Maker with "kill -15". Start Mirror Maker again after 1 second.
> 5. Start a consumer to consume data from target cluster.
> 6. Compare the MessageID in the data between producer log and consumer log.
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475126#comment-13475126 ] 

Joel Koshy commented on KAFKA-567:
----------------------------------

+1 for kafka-567.patch

Also, right now mirror maker does not always shutdown cleanly. E.g., if you want to use consumer.timeout.ms (and avoid having to do an external wait for mirroring to finish). John, I'll attach a patch for that as well if it helps the testing framework.
                
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: John Fung
>            Assignee: Jun Rao
>            Priority: Blocker
>              Labels: bugs
>         Attachments: kafka-567.patch, kafka-mirror-maker-bouncing-data-loss.patch, mirror_maker_12.log, system_test_1349971807_testcase_5002.tar
>
>
> * Test Description: 
> 1. Start a 3-broker cluster as source
> 2. Start a 3-broker cluster as target
> 3. Start 1 instance of Mirror Maker to replicate data from source to target
> 4. While producer is sending data into source cluster, stop Mirror Maker with "kill -15". Start Mirror Maker again after 1 second.
> 5. Start a consumer to consume data from target cluster.
> 6. Compare the MessageID in the data between producer log and consumer log.
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "Swapnil Ghike (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475327#comment-13475327 ] 

Swapnil Ghike commented on KAFKA-567:
-------------------------------------

Thanks for kafka-567.patch! I was also hitting this bug with the container change.
                
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: John Fung
>            Assignee: Jun Rao
>            Priority: Blocker
>              Labels: bugs
>             Fix For: 0.8
>
>         Attachments: KAFKA-567-mirrormakershutdown-v1.patch, kafka-567.patch, kafka-mirror-maker-bouncing-data-loss.patch, mirror_maker_12.log, system_test_1349971807_testcase_5002.tar
>
>
> * Test Description: 
> 1. Start a 3-broker cluster as source
> 2. Start a 3-broker cluster as target
> 3. Start 1 instance of Mirror Maker to replicate data from source to target
> 4. While producer is sending data into source cluster, stop Mirror Maker with "kill -15". Start Mirror Maker again after 1 second.
> 5. Start a consumer to consume data from target cluster.
> 6. Compare the MessageID in the data between producer log and consumer log.
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475153#comment-13475153 ] 

Neha Narkhede commented on KAFKA-567:
-------------------------------------

+1 on KAFKA-567.patch, good catch !
                
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: John Fung
>            Assignee: Jun Rao
>            Priority: Blocker
>              Labels: bugs
>         Attachments: KAFKA-567-mirrormakershutdown-v1.patch, kafka-567.patch, kafka-mirror-maker-bouncing-data-loss.patch, mirror_maker_12.log, system_test_1349971807_testcase_5002.tar
>
>
> * Test Description: 
> 1. Start a 3-broker cluster as source
> 2. Start a 3-broker cluster as target
> 3. Start 1 instance of Mirror Maker to replicate data from source to target
> 4. While producer is sending data into source cluster, stop Mirror Maker with "kill -15". Start Mirror Maker again after 1 second.
> 5. Start a consumer to consume data from target cluster.
> 6. Compare the MessageID in the data between producer log and consumer log.
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "Swapnil Ghike (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Swapnil Ghike updated KAFKA-567:
--------------------------------

    Comment: was deleted

(was: Thanks for kafka-567.patch! I was also hitting this bug with the container change.)
    
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: John Fung
>            Assignee: Jun Rao
>            Priority: Blocker
>              Labels: bugs
>             Fix For: 0.8
>
>         Attachments: KAFKA-567-mirrormakershutdown-v1.patch, kafka-567.patch, kafka-mirror-maker-bouncing-data-loss.patch, mirror_maker_12.log, system_test_1349971807_testcase_5002.tar
>
>
> * Test Description: 
> 1. Start a 3-broker cluster as source
> 2. Start a 3-broker cluster as target
> 3. Start 1 instance of Mirror Maker to replicate data from source to target
> 4. While producer is sending data into source cluster, stop Mirror Maker with "kill -15". Start Mirror Maker again after 1 second.
> 5. Start a consumer to consume data from target cluster.
> 6. Compare the MessageID in the data between producer log and consumer log.
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "John Fung (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475094#comment-13475094 ] 

John Fung commented on KAFKA-567:
---------------------------------

Thanks Jun for the patch. The fix is tested with the latest 0.8 branch (r.1397616) by apply both your patch and the patch to reproduce the issue. The following is the test result: (Please ignore the checksum validating failure for the time being)

_test_case_name  :  testcase_5002
_test_class_name  :  MirrorMakerTest
arg : bounce_leader  :  false
arg : bounce_mirror_maker  :  true
arg : message_producing_free_time_sec  :  15
arg : num_iteration  :  1
arg : num_messages_to_produce_per_producer_call  :  50
arg : num_partition  :  1
arg : replica_factor  :  3
arg : sleep_seconds_between_producer_calls  :  1
validation_status  : 
     Log segment checksum matching across all replicas  :  FAILED
     Unique messages from consumer on [test_1]  :  400
     Unique messages from producer on [test_1]  :  400
     Validate for data matched on topic [test_1]  :  PASSED

                
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: John Fung
>            Assignee: Jun Rao
>            Priority: Blocker
>              Labels: bugs
>         Attachments: kafka-567.patch, kafka-mirror-maker-bouncing-data-loss.patch, mirror_maker_12.log, system_test_1349971807_testcase_5002.tar
>
>
> * Test Description: 
> 1. Start a 3-broker cluster as source
> 2. Start a 3-broker cluster as target
> 3. Start 1 instance of Mirror Maker to replicate data from source to target
> 4. While producer is sending data into source cluster, stop Mirror Maker with "kill -15". Start Mirror Maker again after 1 second.
> 5. Start a consumer to consume data from target cluster.
> 6. Compare the MessageID in the data between producer log and consumer log.
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-567:
--------------------------

    Status: Patch Available  (was: Open)
    
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: John Fung
>            Assignee: Jun Rao
>            Priority: Blocker
>              Labels: bugs
>         Attachments: KAFKA-567-mirrormakershutdown-v1.patch, kafka-567.patch, kafka-mirror-maker-bouncing-data-loss.patch, mirror_maker_12.log, system_test_1349971807_testcase_5002.tar
>
>
> * Test Description: 
> 1. Start a 3-broker cluster as source
> 2. Start a 3-broker cluster as target
> 3. Start 1 instance of Mirror Maker to replicate data from source to target
> 4. While producer is sending data into source cluster, stop Mirror Maker with "kill -15". Start Mirror Maker again after 1 second.
> 5. Start a consumer to consume data from target cluster.
> 6. Compare the MessageID in the data between producer log and consumer log.
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-567:
--------------------------

       Resolution: Fixed
    Fix Version/s: 0.8
           Status: Resolved  (was: Patch Available)

Thanks for the reviews. Committed to 0.8.
                
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: John Fung
>            Assignee: Jun Rao
>            Priority: Blocker
>              Labels: bugs
>             Fix For: 0.8
>
>         Attachments: KAFKA-567-mirrormakershutdown-v1.patch, kafka-567.patch, kafka-mirror-maker-bouncing-data-loss.patch, mirror_maker_12.log, system_test_1349971807_testcase_5002.tar
>
>
> * Test Description: 
> 1. Start a 3-broker cluster as source
> 2. Start a 3-broker cluster as target
> 3. Start 1 instance of Mirror Maker to replicate data from source to target
> 4. While producer is sending data into source cluster, stop Mirror Maker with "kill -15". Start Mirror Maker again after 1 second.
> 5. Start a consumer to consume data from target cluster.
> 6. Compare the MessageID in the data between producer log and consumer log.
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "John Fung (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474456#comment-13474456 ] 

John Fung commented on KAFKA-567:
---------------------------------

Uploaded mirror_maker_12.log for a similar test but with the following changes:

11. After Mirror Maker is bounced and wait 15 seconds, Producer (sending data in the source cluster) is stopped. Then wait for 60 seconds more to make sure that Mirror Maker is able to catch up.

12. kafka.consumer.ConsumerIterator is configured to be at TRACE level in log4j.properties

13. kafka.producer.Producer is also configured to be at TRACE level.

14. The attached mirror maker log file shows that there are 400 messages each showing in the TRACE level messages which is matching the Producer data count but Target consumer is still missing some messages.
                
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: John Fung
>         Attachments: kafka-mirror-maker-bouncing-data-loss.patch, mirror_maker_12.log, system_test_1349971807_testcase_5002.tar
>
>
> * Test Description: 
> 1. Start a 3-broker cluster as source
> 2. Start a 3-broker cluster as target
> 3. Start 1 instance of Mirror Maker to replicate data from source to target
> 4. While producer is sending data into source cluster, stop Mirror Maker with "kill -15". Start Mirror Maker again after 1 second.
> 5. Start a consumer to consume data from target cluster.
> 6. Compare the MessageID in the data between producer log and consumer log.
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474435#comment-13474435 ] 

Neha Narkhede commented on KAFKA-567:
-------------------------------------

We should probably increase time outs after knowing why the test is running slower. Increasing timeouts can hide performance regressions.
                
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: John Fung
>         Attachments: kafka-mirror-maker-bouncing-data-loss.patch, system_test_1349971807_testcase_5002.tar
>
>
> * Test Description: 
> 1. Start a 3-broker cluster as source
> 2. Start a 3-broker cluster as target
> 3. Start 1 instance of Mirror Maker to replicate data from source to target
> 4. While producer is sending data into source cluster, stop Mirror Maker with "kill -15". Start Mirror Maker again after 1 second.
> 5. Start a consumer to consume data from target cluster.
> 6. Compare the MessageID in the data between producer log and consumer log.
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Assigned] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao reassigned KAFKA-567:
-----------------------------

    Assignee: Jun Rao
    
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: John Fung
>            Assignee: Jun Rao
>              Labels: bugs
>         Attachments: kafka-567.patch, kafka-mirror-maker-bouncing-data-loss.patch, mirror_maker_12.log, system_test_1349971807_testcase_5002.tar
>
>
> * Test Description: 
> 1. Start a 3-broker cluster as source
> 2. Start a 3-broker cluster as target
> 3. Start 1 instance of Mirror Maker to replicate data from source to target
> 4. While producer is sending data into source cluster, stop Mirror Maker with "kill -15". Start Mirror Maker again after 1 second.
> 5. Start a consumer to consume data from target cluster.
> 6. Compare the MessageID in the data between producer log and consumer log.
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-567:
--------------------------

          Component/s: core
             Priority: Blocker  (was: Major)
    Affects Version/s: 0.8
               Labels: bugs  (was: )
    
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: John Fung
>            Assignee: Jun Rao
>            Priority: Blocker
>              Labels: bugs
>         Attachments: kafka-567.patch, kafka-mirror-maker-bouncing-data-loss.patch, mirror_maker_12.log, system_test_1349971807_testcase_5002.tar
>
>
> * Test Description: 
> 1. Start a 3-broker cluster as source
> 2. Start a 3-broker cluster as target
> 3. Start 1 instance of Mirror Maker to replicate data from source to target
> 4. While producer is sending data into source cluster, stop Mirror Maker with "kill -15". Start Mirror Maker again after 1 second.
> 5. Start a consumer to consume data from target cluster.
> 6. Compare the MessageID in the data between producer log and consumer log.
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "John Fung (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung updated KAFKA-567:
----------------------------

    Description: 
* Test Description: 
1. Start a 3-broker cluster as source
2. Start a 3-broker cluster as target
3. Start 1 instance of Mirror Maker to replicate data from source to target
4. While producer is sending data into source cluster, stop Mirror Maker with "kill -15". Start Mirror Maker again after 1 second.
5. Start a consumer to consume data from target cluster.
6. Compare the MessageID in the data between producer log and consumer log.

* To reproduce this issue, please do the followings:

1. Download the latest 0.8 branch
2. Apply the patch attached to this JIRA
3. Build kafka by running "./sbt update package"
4. Execute the test in directory "system_test" : "python -B system_test_runner.py"

* The test result may look like the following:

_test_case_name  :  testcase_5002
_test_class_name  :  MirrorMakerTest
arg : bounce_leader  :  false
arg : bounce_mirror_maker  :  true
arg : message_producing_free_time_sec  :  15
arg : num_iteration  :  1
arg : num_messages_to_produce_per_producer_call  :  50
arg : num_partition  :  1
arg : replica_factor  :  3
arg : sleep_seconds_between_producer_calls  :  1
validation_status  : 
     Log segment checksum matching across all replicas  :  FAILED
     Unique messages from consumer on [test_1]  :  355
     Unique messages from producer on [test_1]  :  400
     Validate for data matched on topic [test_1]  :  FAILED

* Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.

* There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

  was:
* To reproduce this issue, please do the followings:

1. Download the latest 0.8 branch
2. Apply the patch attached to this JIRA
3. Build kafka by running "./sbt update package"
4. Execute the test in directory "system_test" : "python -B system_test_runner.py"

* The test result may look like the following:

_test_case_name  :  testcase_5002
_test_class_name  :  MirrorMakerTest
arg : bounce_leader  :  false
arg : bounce_mirror_maker  :  true
arg : message_producing_free_time_sec  :  15
arg : num_iteration  :  1
arg : num_messages_to_produce_per_producer_call  :  50
arg : num_partition  :  1
arg : replica_factor  :  3
arg : sleep_seconds_between_producer_calls  :  1
validation_status  : 
     Log segment checksum matching across all replicas  :  FAILED
     Unique messages from consumer on [test_1]  :  355
     Unique messages from producer on [test_1]  :  400
     Validate for data matched on topic [test_1]  :  FAILED

* Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.

* There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

    
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: John Fung
>         Attachments: kafka-mirror-maker-bouncing-data-loss.patch, system_test_1349971807_testcase_5002.tar
>
>
> * Test Description: 
> 1. Start a 3-broker cluster as source
> 2. Start a 3-broker cluster as target
> 3. Start 1 instance of Mirror Maker to replicate data from source to target
> 4. While producer is sending data into source cluster, stop Mirror Maker with "kill -15". Start Mirror Maker again after 1 second.
> 5. Start a consumer to consume data from target cluster.
> 6. Compare the MessageID in the data between producer log and consumer log.
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Closed] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao closed KAFKA-567.
-------------------------

    
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: John Fung
>            Assignee: Jun Rao
>            Priority: Blocker
>              Labels: bugs
>             Fix For: 0.8
>
>         Attachments: KAFKA-567-mirrormakershutdown-v1.patch, kafka-567.patch, kafka-mirror-maker-bouncing-data-loss.patch, mirror_maker_12.log, system_test_1349971807_testcase_5002.tar
>
>
> * Test Description: 
> 1. Start a 3-broker cluster as source
> 2. Start a 3-broker cluster as target
> 3. Start 1 instance of Mirror Maker to replicate data from source to target
> 4. While producer is sending data into source cluster, stop Mirror Maker with "kill -15". Start Mirror Maker again after 1 second.
> 5. Start a consumer to consume data from target cluster.
> 6. Compare the MessageID in the data between producer log and consumer log.
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-567:
--------------------------

    Attachment: kafka-567.patch

Attach a patch. The problem is that async producer didn't drain the last batch of events on close, a bug introduced during producer refactoring. Now, the test has no data loss.
                
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: John Fung
>              Labels: bugs
>         Attachments: kafka-567.patch, kafka-mirror-maker-bouncing-data-loss.patch, mirror_maker_12.log, system_test_1349971807_testcase_5002.tar
>
>
> * Test Description: 
> 1. Start a 3-broker cluster as source
> 2. Start a 3-broker cluster as target
> 3. Start 1 instance of Mirror Maker to replicate data from source to target
> 4. While producer is sending data into source cluster, stop Mirror Maker with "kill -15". Start Mirror Maker again after 1 second.
> 5. Start a consumer to consume data from target cluster.
> 6. Compare the MessageID in the data between producer log and consumer log.
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "John Fung (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung updated KAFKA-567:
----------------------------

    Attachment: kafka-mirror-maker-bouncing-data-loss.patch
    
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: John Fung
>         Attachments: kafka-mirror-maker-bouncing-data-loss.patch
>
>
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (KAFKA-567) Replication Data Loss in Mirror Maker Bouncing testcase

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474428#comment-13474428 ] 

Jun Rao commented on KAFKA-567:
-------------------------------

Took a quick look.

Log in source broker:
system_test_1349971807_testcase_5002 jrao$ ls kafka_server_4_logs/test_1-0/*.log
kafka_server_4_logs/test_1-0/00000000000000000000.log  kafka_server_4_logs/test_1-0/00000000000000000140.log  kafka_server_4_logs/test_1-0/00000000000000000280.log
kafka_server_4_logs/test_1-0/00000000000000000020.log  kafka_server_4_logs/test_1-0/00000000000000000160.log  kafka_server_4_logs/test_1-0/00000000000000000300.log
kafka_server_4_logs/test_1-0/00000000000000000040.log  kafka_server_4_logs/test_1-0/00000000000000000180.log  kafka_server_4_logs/test_1-0/00000000000000000320.log
kafka_server_4_logs/test_1-0/00000000000000000060.log  kafka_server_4_logs/test_1-0/00000000000000000200.log  kafka_server_4_logs/test_1-0/00000000000000000340.log
kafka_server_4_logs/test_1-0/00000000000000000080.log  kafka_server_4_logs/test_1-0/00000000000000000220.log  kafka_server_4_logs/test_1-0/00000000000000000360.log
kafka_server_4_logs/test_1-0/00000000000000000100.log  kafka_server_4_logs/test_1-0/00000000000000000240.log  kafka_server_4_logs/test_1-0/00000000000000000380.log
kafka_server_4_logs/test_1-0/00000000000000000120.log  kafka_server_4_logs/test_1-0/00000000000000000260.log

Log in target broker
system_test_1349971807_testcase_5002 jrao$ ls kafka_server_7_logs/test_1-0/*.log
kafka_server_7_logs/test_1-0/00000000000000000000.log  kafka_server_7_logs/test_1-0/00000000000000000185.log  kafka_server_7_logs/test_1-0/00000000000000000312.log
kafka_server_7_logs/test_1-0/00000000000000000100.log  kafka_server_7_logs/test_1-0/00000000000000000235.log
kafka_server_7_logs/test_1-0/00000000000000000155.log  kafka_server_7_logs/test_1-0/00000000000000000262.log

It seems that the test didn't give mirror maker enough time to catch up data after bouncing. Could you change the test to wait a bit longer after bounce?
                
> Replication Data Loss in Mirror Maker Bouncing testcase
> -------------------------------------------------------
>
>                 Key: KAFKA-567
>                 URL: https://issues.apache.org/jira/browse/KAFKA-567
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: John Fung
>         Attachments: kafka-mirror-maker-bouncing-data-loss.patch, system_test_1349971807_testcase_5002.tar
>
>
> * Test Description: 
> 1. Start a 3-broker cluster as source
> 2. Start a 3-broker cluster as target
> 3. Start 1 instance of Mirror Maker to replicate data from source to target
> 4. While producer is sending data into source cluster, stop Mirror Maker with "kill -15". Start Mirror Maker again after 1 second.
> 5. Start a consumer to consume data from target cluster.
> 6. Compare the MessageID in the data between producer log and consumer log.
> * To reproduce this issue, please do the followings:
> 1. Download the latest 0.8 branch
> 2. Apply the patch attached to this JIRA
> 3. Build kafka by running "./sbt update package"
> 4. Execute the test in directory "system_test" : "python -B system_test_runner.py"
> * The test result may look like the following:
> _test_case_name  :  testcase_5002
> _test_class_name  :  MirrorMakerTest
> arg : bounce_leader  :  false
> arg : bounce_mirror_maker  :  true
> arg : message_producing_free_time_sec  :  15
> arg : num_iteration  :  1
> arg : num_messages_to_produce_per_producer_call  :  50
> arg : num_partition  :  1
> arg : replica_factor  :  3
> arg : sleep_seconds_between_producer_calls  :  1
> validation_status  : 
>      Log segment checksum matching across all replicas  :  FAILED
>      Unique messages from consumer on [test_1]  :  355
>      Unique messages from producer on [test_1]  :  400
>      Validate for data matched on topic [test_1]  :  FAILED
> * Attached a tar file for the system test output log, the brokers' log4j files and data log segment files.
> * There are no unusual Exception / Error found in the logs. However, there are consistently data loss in this Mirror Maker bouncing test case. Not sure if this is related to KAFKA-552.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira