Posted to dev@kafka.apache.org by "John Fung (JIRA)" <ji...@apache.org> on 2012/06/26 17:48:44 UTC

[jira] [Resolved] (KAFKA-372) Consumer doesn't receive all data if there are multiple segment files

     [ https://issues.apache.org/jira/browse/KAFKA-372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung resolved KAFKA-372.
-----------------------------

    Resolution: Fixed

Thanks Jun. It is working correctly after applying kafka-372-v1.patch.
                
> Consumer doesn't receive all data if there are multiple segment files
> ---------------------------------------------------------------------
>
>                 Key: KAFKA-372
>                 URL: https://issues.apache.org/jira/browse/KAFKA-372
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: John Fung
>         Attachments: kafka-372_v1.patch, multi_seg_files_data_loss_debug.patch
>
>
> This issue occurs intermittently but can be reproduced by following the steps below (repeat step 4 a few times to reproduce it):
> 1. Check out 0.8 branch (currently reproducible with rev. 1352634)
> 2. Apply kafka-306-v4.patch
> 3. Please note that the log.file.size is set to 10000000 in system_test/broker_failure/config/server_*.properties (small enough to trigger multi segment files)
> 4. Under the directory <kafka home>/system_test/broker_failure, execute command:
> $ bin/run-test.sh 20 0
> 5. After the test is completed, the result will probably look like the following:
> ========================================================
> no. of messages published            : 14000
> producer unique msg rec'd            : 14000
> source consumer msg rec'd            : 7271
> source consumer unique msg rec'd     : 7271
> mirror consumer msg rec'd            : 6960
> mirror consumer unique msg rec'd     : 6960
> total source/mirror duplicate msg    : 0
> source/mirror uniq msg count diff    : 311
> ========================================================
> 6. Checking the kafka log files shows that the sum of the sizes of the source cluster segment files is equal to that of the target cluster:
> [/tmp] $  find kafka* -name *.kafka -ls
> 18620155 9860 -rw-r--r--   1 jfung    eng      10096535 Jun 21 11:09 kafka-source3-logs/test01-0/00000000000000000000.kafka
> 18620161 9772 -rw-r--r--   1 jfung    eng      10004418 Jun 21 11:11 kafka-source3-logs/test01-0/00000000000020105286.kafka
> 18620160 9776 -rw-r--r--   1 jfung    eng      10008751 Jun 21 11:10 kafka-source3-logs/test01-0/00000000000010096535.kafka
> 18620162 4708 -rw-r--r--   1 jfung    eng       4819067 Jun 21 11:11 kafka-source3-logs/test01-0/00000000000030109704.kafka
> 19406431 9920 -rw-r--r--   1 jfung    eng      10157685 Jun 21 11:10 kafka-target2-logs/test01-0/00000000000010335039.kafka
> 19406429 10096 -rw-r--r--   1 jfung    eng      10335039 Jun 21 11:09 kafka-target2-logs/test01-0/00000000000000000000.kafka
> 19406432 10300 -rw-r--r--   1 jfung    eng      10544850 Jun 21 11:11 kafka-target2-logs/test01-0/00000000000020492724.kafka
> 19406433 3800 -rw-r--r--   1 jfung    eng       3891197 Jun 21 11:12 kafka-target2-logs/test01-0/00000000000031037574.kafka
> 7. If the log.file.size in the target cluster is configured to a very large value such that there is only 1 data file, the result looks like this:
> ========================================================
> no. of messages published            : 14000
> producer unique msg rec'd            : 14000
> source consumer msg rec'd            : 7302
> source consumer unique msg rec'd     : 7302
> mirror consumer msg rec'd            : 13750
> mirror consumer unique msg rec'd     : 13750
> total source/mirror duplicate msg    : 0
> source/mirror uniq msg count diff    : -6448
> ========================================================
> 8. The log files look like this:
> [/tmp] $ find kafka* -name *.kafka -ls
> 18620160 9840 -rw-r--r--   1 jfung    eng      10075058 Jun 21 11:24 kafka-source2-logs/test01-0/00000000000010083679.kafka
> 18620155 9848 -rw-r--r--   1 jfung    eng      10083679 Jun 21 11:23 kafka-source2-logs/test01-0/00000000000000000000.kafka
> 18620162 4484 -rw-r--r--   1 jfung    eng       4589474 Jun 21 11:26 kafka-source2-logs/test01-0/00000000000030269045.kafka
> 18620161 9876 -rw-r--r--   1 jfung    eng      10110308 Jun 21 11:25 kafka-source2-logs/test01-0/00000000000020158737.kafka
> 19406429 34048 -rw-r--r--   1 jfung    eng      34858519 Jun 21 11:26 kafka-target3-logs/test01-0/00000000000000000000.kafka
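
The size comparison in steps 6 and 8 can be checked mechanically. The following is only a minimal sketch in Python, not part of the test suite: the file names and sizes are copied from the two listings above, while the helper names (total_bytes, names_match_offsets) are made up for illustration. It also checks an observation from the listings, namely that each segment file name appears to equal the combined size of the segments before it.

```python
# Segment listings copied from steps 6 and 8: (file name, size in bytes).
SOURCE3 = [
    ("00000000000000000000.kafka", 10096535),
    ("00000000000020105286.kafka", 10004418),
    ("00000000000010096535.kafka", 10008751),
    ("00000000000030109704.kafka", 4819067),
]
TARGET2 = [
    ("00000000000010335039.kafka", 10157685),
    ("00000000000000000000.kafka", 10335039),
    ("00000000000020492724.kafka", 10544850),
    ("00000000000031037574.kafka", 3891197),
]
SOURCE2 = [
    ("00000000000010083679.kafka", 10075058),
    ("00000000000000000000.kafka", 10083679),
    ("00000000000030269045.kafka", 4589474),
    ("00000000000020158737.kafka", 10110308),
]
TARGET3 = [
    ("00000000000000000000.kafka", 34858519),
]


def total_bytes(segments):
    """Sum the sizes of all segment files in one log directory."""
    return sum(size for _, size in segments)


def names_match_offsets(segments):
    """Return True if every segment name equals the total size of the
    segments that precede it (names are zero-padded, so sorting by name
    orders them by offset)."""
    offset = 0
    for name, size in sorted(segments):
        if int(name.split(".")[0]) != offset:
            return False
        offset += size
    return True


if __name__ == "__main__":
    for label, source, target in [
        ("step 6", SOURCE3, TARGET2),
        ("step 8", SOURCE2, TARGET3),
    ]:
        print(label, total_bytes(source), total_bytes(target),
              total_bytes(source) == total_bytes(target))
```

In both runs the byte totals on the source and target sides agree even though the consumer message counts differ, which is consistent with the data being present on disk in both clusters.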
