Posted to dev@kafka.apache.org by "Yohan Sanchez (JIRA)" <ji...@apache.org> on 2018/04/25 15:58:00 UTC

[jira] [Created] (KAFKA-6827) Messages stuck after multiple broker restarts in a row

Yohan Sanchez created KAFKA-6827:
------------------------------------

             Summary: Messages stuck after multiple broker restarts in a row
                 Key: KAFKA-6827
                 URL: https://issues.apache.org/jira/browse/KAFKA-6827
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 1.1.0, 0.10.2.0
            Reporter: Yohan Sanchez
         Attachments: kafka1.log, kafka2.log, producer.prop

Hello :)

Tried with v0.10.2 and 1.1.0.
I start with brand-new brokers and no old data.

I created the topic test:
{code:java}
/usr/share/kafka/bin/kafka-topics.sh --zookeeper $ZOOKEEPER --create --topic test --partitions 1 --replication-factor 2 --config retention.ms=604800000 && /usr/share/kafka/bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test
Created topic "test".
Topic:test      PartitionCount:1        ReplicationFactor:2     Configs:retention.ms=604800000
        Topic: test     Partition: 0    Leader: 2       Replicas: 2,1   Isr: 2,1{code}
 
 Logs from brokers available in attachment

I start producing with a verifiable producer
{code:java}
/usr/share/kafka/bin/kafka-verifiable-producer.sh --topic test --broker-list localhost:9091 --max-messages 1000000 --throughput 10000 --producer.config producer.prop --value-prefix 1 > /tmp/produce_result.txt
{code}
While the producer is running, I stop, start, stop, and start broker 1:
{code:java}
/etc/init.d/kafka1 stop && /etc/init.d/kafka1 start && /etc/init.d/kafka1 stop && /etc/init.d/kafka1 start
{code}
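To watch the leader/ISR movement while broker 1 is bounced, the describe command from above can be repeated in a loop; a minimal sketch, using nothing beyond the tool already shown:
{code:java}
# Repeat the describe call every second to see leader/ISR changes live.
watch -n1 '/usr/share/kafka/bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test'
{code}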
There is no data loss on the producer side:
{code:java}
{"timestamp":1524670799034,"name":"producer_send_success","key":null,"value":"1.999999","topic":"test","partition":0,"offset":999999}
{"timestamp":1524670799040,"name":"shutdown_complete"}
{"timestamp":1524670799042,"name":"tool_data","sent":1000000,"acked":1000000,"target_throughput":10000,"avg_throughput":9988.413440409126}
{code}
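For completeness, the producer's own accounting can be pulled out of its JSON output; a small sketch, assuming jq is installed (the field names come from the tool_data line above):
{code:java}
# Extract sent/acked counts from the verifiable producer's final report.
grep tool_data /tmp/produce_result.txt | jq '{sent, acked}'
{code}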
I consume the messages with the simple consumer shell:
{code:java}
/usr/share/kafka/bin/kafka-simple-consumer-shell.sh --broker-list localhost:9091 --topic test --offset -2 2> /dev/null | grep -v "Reached" > /tmp/kafka_data_back.txt
{code}
I count the values matching "1." in /tmp/kafka_data_back.txt:
{code:java}
Every 0.1s: grep "1\." /tmp/kafka_data_back.txt | wc -l        Wed Apr 25 17:48:46 2018

999937
{code}
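To pin down exactly which messages never came back, the expected value list can be diffed against the consumed one; a sketch, assuming the values have the form "1.<index>" for indices 0..999999, which is what kafka-verifiable-producer.sh emits with --value-prefix 1:
{code:java}
# Build the full expected value list, sort both sides, and print
# the values present in "expected" but missing from "received".
seq 0 999999 | sed 's/^/1./' | sort > /tmp/expected.txt
grep '^1\.' /tmp/kafka_data_back.txt | sort > /tmp/received.txt
comm -23 /tmp/expected.txt /tmp/received.txt | head
{code}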
I got only 999,937 messages instead of 1,000,000.

* I can restart the consumer at any time; it still returns only 999,937 messages.
* Depending on the run, more or fewer messages get stuck.
* I can restart kafka1, wait, then restart kafka2; the messages are still stuck (see the offset check sketched after this list).
* Producing more messages does not unlock the stuck ones until roughly 700 additional messages have been produced.
* Disabling compression did not solve the problem.
* acks=1 and acks=-1 give the same result.
* Every run reproduces the problem, whether starting from a brand-new broker or not.
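One way to narrow this down is to compare the log-end offset each broker reports for the partition against the count the consumer sees; a sketch using kafka.tools.GetOffsetShell, which ships with the distribution (the second broker's port, 9092, is an assumption; only 9091 appears above):
{code:java}
# Ask each broker for the latest offset (--time -1) of the test partition.
for port in 9091 9092; do
  /usr/share/kafka/bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
    --broker-list localhost:$port --topic test --time -1
done
{code}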

Can you help me understand why the messages are stuck?


