Posted to jira@kafka.apache.org by "Manmeet Singh (JIRA)" <ji...@apache.org> on 2018/05/07 11:03:00 UTC

[jira] [Commented] (KAFKA-6827) Messages stuck after broker's multiple restart in a row

    [ https://issues.apache.org/jira/browse/KAFKA-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465788#comment-16465788 ] 

Manmeet Singh commented on KAFKA-6827:
--------------------------------------

I tried reproducing it and got the same 999 937 messages. 

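For me, the cause was output buffering in grep. When grep writes to a file rather than a terminal, it buffers its output in blocks, so the last few lines can sit in the buffer while the consumer is still running. Could you try running grep with --line-buffered in the consumer command and see if it's still the case?

The consumer command would be: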
{code:java}
/usr/share/kafka/bin/kafka-simple-consumer-shell.sh --broker-list localhost:9091 --topic test --offset -2 2> /dev/null | grep --line-buffered -v "Reached" > /tmp/kafka_data_back.txt
{code}
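
One way to check the buffering explanation is to list which values are missing from the dump. The following is only a sketch: missing_values is a hypothetical helper, and it assumes the dump holds one value per line and that the verifiable producer wrote values "1.0" through "1.999999" (inferred from the last acked value in the report, not confirmed).

```shell
# Hypothetical helper: print the expected values that are absent from a dump.
# Usage: missing_values FILE PREFIX COUNT
# Assumes one value per line, of the form "<PREFIX>.<n>" for n in 0..COUNT-1.
missing_values() {
    awk -v prefix="$2" -v count="$3" '
        { seen[$0] = 1 }                      # remember every line in the dump
        END {
            for (n = 0; n < count; n++) {
                v = prefix "." n              # expected value, e.g. "1.42"
                if (!(v in seen)) print v     # report it if it never appeared
            }
        }' "$1"
}
```

For example, missing_values /tmp/kafka_data_back.txt 1 1000000 | head would show the first few gaps. If every missing value is at the very end of the sequence, that would be consistent with lines still held in grep's buffer rather than with messages lost on the broker.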

> Messages stuck after broker's multiple restart in a row
> -------------------------------------------------------
>
>                 Key: KAFKA-6827
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6827
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.10.2.0, 1.1.0
>            Reporter: Yohan Sanchez
>            Priority: Minor
>         Attachments: kafka1.log, kafka2.log, producer.prop
>
>
> Hello :)
> I tried this with v0.10.2 and 1.1.0.
> I start with brand new brokers with no old data.
> I create the topic test:
> {code:java}
> /usr/share/kafka/bin/kafka-topics.sh --zookeeper $ZOOKEEPER --create --topic test --partitions 1 --replication-factor 2 --config retention.ms=604800000 && /usr/share/kafka/bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test
> Created topic "test".
> Topic:test      PartitionCount:1        ReplicationFactor:2     Configs:retention.ms=604800000
>         Topic: test     Partition: 0    Leader: 2       Replicas: 2,1   Isr: 2,1{code}
>  
> Logs from the brokers are available in the attachments.
> I start producing with a verifiable producer:
> {code:java}
> /usr/share/kafka/bin/kafka-verifiable-producer.sh --topic test --broker-list localhost:9091 --max-messages 1000000 --throughput 10000 --producer.config producer.prop --value-prefix 1 > /tmp/produce_result.txt
> {code}
> While producing, I stop, start, stop, and start broker 1:
> {code:java}
> /etc/init.d/kafka1 stop && /etc/init.d/kafka1 start && /etc/init.d/kafka1 stop && /etc/init.d/kafka1 start
> {code}
> There is no data loss on the producer side:
> {code:java}
> {"timestamp":1524670799034,"name":"producer_send_success","key":null,"value":"1.999999","topic":"test","partition":0,"offset":999999}
> {"timestamp":1524670799040,"name":"shutdown_complete"}
> {"timestamp":1524670799042,"name":"tool_data","sent":1000000,"acked":1000000,"target_throughput":10000,"avg_throughput":9988.413440409126}
> {code}
> I consume messages with my simple shell consumer:
> {code:java}
> /usr/share/kafka/bin/kafka-simple-consumer-shell.sh --broker-list localhost:9091 --topic test --offset -2 2> /dev/null | grep -v "Reached" > /tmp/kafka_data_back.txt
> {code}
> I grep for values matching "1." in /tmp/kafka_data_back.txt:
> {code:java}
> Every 0.1s: grep "1\." /tmp/kafka_data_back.txt | wc -l    Wed Apr 25 17:48:46 2018
> 999937
> {code}
> I got only 999 937 messages instead of 1 000 000.
> * I can restart the consumer at any time; I still get 999937.
> * Depending on the run, more or fewer messages get stuck.
> * I can restart kafka1, wait, and restart kafka2; the messages are still stuck.
> * I can produce more messages, but the stuck messages are not released until roughly 700 more messages have been produced.
> * Disabling compression did not solve the problem.
> * Acks of 1 or -1 give the same result.
> * Every run reproduces the problem, whether or not I start from a brand new broker.
> Can you help me understand why the messages are stuck?


