Posted to dev@kafka.apache.org by "John Fung (JIRA)" <ji...@apache.org> on 2012/10/28 01:33:12 UTC

[jira] [Created] (KAFKA-590) System Test - 4 cases failed due to insufficient no. of retry in ProducerPerformance

John Fung created KAFKA-590:
-------------------------------

             Summary: System Test - 4 cases failed due to insufficient no. of retry in ProducerPerformance
                 Key: KAFKA-590
                 URL: https://issues.apache.org/jira/browse/KAFKA-590
             Project: Kafka
          Issue Type: Bug
            Reporter: John Fung


1. Functional Test Area : Replication with Leader Hard Failure (1 Topic, 3 Partitions)

2. Testcases failed : 

0151 (Sync Producer, Acks = -1, No Compression)
0152 (Async Producer, Acks = -1, No Compression)
0155 (Sync Producer, Acks = -1, Compressed)
0156 (Async Producer, Acks = -1, Compressed)

3. Sample test results :

2012-10-25 18:22:20,206 - INFO - ======================================================
2012-10-25 18:22:20,206 - INFO - validating data matched
2012-10-25 18:22:20,206 - INFO - ======================================================
2012-10-25 18:22:20,206 - DEBUG - request-num-acks [-1] (kafka_system_test_utils)
2012-10-25 18:22:20,228 - INFO - no. of unique messages on topic [test_1] sent from publisher  : 900 (kafka_system_test_utils)
2012-10-25 18:22:20,235 - INFO - no. of unique messages on topic [test_1] at simple_consumer_1.log : 853 (kafka_system_test_utils)
2012-10-25 18:22:20,242 - INFO - no. of unique messages on topic [test_1] at simple_consumer_2.log : 853 (kafka_system_test_utils)
2012-10-25 18:22:20,247 - INFO - no. of unique messages on topic [test_1] at simple_consumer_3.log : 853 (kafka_system_test_utils)
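
The data loss implied by these log lines is 900 unique messages sent minus 853 received, i.e. 47 messages. A minimal sketch of extracting that figure automatically, assuming the log lines have exactly the shape shown above; the helper name extract_count is hypothetical and is not part of kafka_system_test_utils:

```shell
# Two sample lines copied from the test output above.
log='2012-10-25 18:22:20,228 - INFO - no. of unique messages on topic [test_1] sent from publisher  : 900 (kafka_system_test_utils)
2012-10-25 18:22:20,235 - INFO - no. of unique messages on topic [test_1] at simple_consumer_1.log : 853 (kafka_system_test_utils)'

# Print the integer that follows " : " on the first line matching the pattern in $1.
extract_count() {
    printf '%s\n' "$log" | grep "$1" | head -n 1 | sed 's/.* : \([0-9][0-9]*\) .*/\1/'
}

sent=$(extract_count 'sent from publisher')
received=$(extract_count 'at simple_consumer_1.log')
lost=$((sent - received))

echo "sent=$sent received=$received lost=$lost"
```

Only simple_consumer_1.log is compared here because all three consumers report the same count in this run.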

4. Investigations :

a. Merge log segment files per partition:
Under test_1351181987/testcase_0151/logs/broker-1/kafka_server_1_logs:
cat test_1-0/00000000000000000000.log >> merged_test_1_0/00000000000000000000.log
cat test_1-0/00000000000000000197.log >> merged_test_1_0/00000000000000000000.log
. . .

b. Retrieve all CRCs from the merged data log segment:
bin/kafka-run-class.sh kafka.tools.DumpLogSegments merged_test_1_0/00000000000000000000.log | grep crc | sed 's/.* crc: //' | sort -u > test_1_0_crc.log
. . .

c. Merge the CRC files together:
cat test_1_0_crc.log >> all_crc.log
cat test_1_1_crc.log >> all_crc.log
cat test_1_2_crc.log >> all_crc.log

d. Sort the merged CRC file:
cat all_crc.log | sort -u > all_crc_sorted.log

e. Count the unique CRCs of 'failed to send' messages in producer_performance.log (70 in this case):
grep 'failed to send' producer_performance.log | sed 's/.* crc = //' | sed 's/, key = null.*//' | sort -u | wc -l
70

f. Match those 'failed to send' CRCs from producer_performance.log against the broker CRCs to see how many messages were eventually sent successfully on retry:

$ for i in `grep 'failed to send' ../../producer_performance-4/producer_performance.log | sed 's/.* crc = //' | sed 's/, key = null.*//' | sort -u`; do echo -n "$i => "; grep $i all_crc_sorted.log || echo "n/a"; done;
. . .
1302684126 => n/a
1456125554 => 1456125554
15299643 => n/a
1653550869 => 1653550869
1741661084 => n/a
1764395211 => 1764395211
. . .
(23 messages were sent successfully on retry)

g. As a result, 70 messages 'failed to send' in producer_performance.log - 23 messages successfully sent on retry = 47 messages lost, which matches the data loss count in the test result (900 sent - 853 received).

Therefore, if the number of retries is increased, all the messages could be sent successfully.
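
The reconciliation in steps e through g can also be expressed as a set comparison. The sketch below is an illustration, not the commands used in the investigation: it uses comm on sorted files instead of a grep loop, so a short CRC such as 15299643 cannot match as a substring of a longer one. The file names failed_crc.txt and broker_crc.txt and the sample broker contents are hypothetical; the six failed CRC values are the ones listed in step f.

```shell
# Use one fixed collation so sort and comm agree on ordering.
export LC_ALL=C
workdir=$(mktemp -d)

# CRCs of messages logged as 'failed to send' (the six values shown in step f).
printf '%s\n' 1302684126 1456125554 15299643 1653550869 1741661084 1764395211 \
    | sort -u > "$workdir/failed_crc.txt"

# CRCs found in the broker log segments. Hypothetical sample: the three CRCs that
# step f reports as found, plus one made-up value containing 15299643 as a substring.
printf '%s\n' 1456125554 1653550869 1764395211 3152996431 \
    | sort -u > "$workdir/broker_crc.txt"

# comm -12 keeps lines present in both files (recovered on retry);
# comm -23 keeps lines present only in the first file (lost).
recovered=$(comm -12 "$workdir/failed_crc.txt" "$workdir/broker_crc.txt" | wc -l | tr -d ' ')
lost=$(comm -23 "$workdir/failed_crc.txt" "$workdir/broker_crc.txt" | wc -l | tr -d ' ')
failed=$(wc -l < "$workdir/failed_crc.txt" | tr -d ' ')

echo "failed=$failed recovered=$recovered lost=$lost"
rm -rf "$workdir"
```

With the real files from steps d and e in place of the samples, recovered and lost would correspond to the 23 and 47 figures in step g.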


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (KAFKA-590) System Test - 4 cases failed due to insufficient no. of retry in ProducerPerformance

Posted by "John Fung (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung resolved KAFKA-590.
-----------------------------

    Resolution: Fixed

These 4 test cases pass after setting "producer-retry-backoff-ms" to 2500, which is supported by ProducerPerformance (in KAFKA-267).
                


[jira] [Closed] (KAFKA-590) System Test - 4 cases failed due to insufficient no. of retry in ProducerPerformance

Posted by "John Fung (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung closed KAFKA-590.
---------------------------

    
