Posted to jira@kafka.apache.org by "Michal Borowiecki (JIRA)" <ji...@apache.org> on 2017/07/01 11:36:00 UTC

[jira] [Commented] (KAFKA-5546) Lost data when the leader is disconnected.

    [ https://issues.apache.org/jira/browse/KAFKA-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071187#comment-16071187 ] 

Michal Borowiecki commented on KAFKA-5546:
------------------------------------------

Does your producer check whether sending was successful, that is, whether the broker acknowledged the message? Data that the broker never acknowledges can be lost; that is not a defect.
As far as I can see, you're just piping data into the console producer, and the logs on the producer's side indicate the messages simply weren't sent. Please correct me if I'm reading it wrong.
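
To illustrate what "checking that sending was successful" can look like in application code: the Java client's KafkaProducer.send() returns a java.util.concurrent.Future, so one simple approach is to wait on it with a timeout. The sketch below uses only the JDK; the class name AckCheck, the helper acknowledged() and the stand-in futures are hypothetical and are not part of Kafka or of the reporter's test setup.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: treats a send as successful only if its Future
// completes normally within the given timeout.
final class AckCheck {

    static boolean acknowledged(Future<?> sendResult, long timeoutMs) {
        try {
            sendResult.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;                        // completed without error
        } catch (ExecutionException | TimeoutException e) {
            return false;                       // send failed or was not confirmed in time
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    public static void main(String[] args) {
        // Stand-ins for the futures a real producer would return (assumption).
        CompletableFuture<String> ok = CompletableFuture.completedFuture("stored");
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new RuntimeException("leader not available"));
        CompletableFuture<String> pending = new CompletableFuture<>(); // never completes

        System.out.println("ok:      " + acknowledged(ok, 100));
        System.out.println("failed:  " + acknowledged(failed, 100));
        System.out.println("pending: " + acknowledged(pending, 100));
    }
}
```

With a real producer, the application would resend or report any record for which such a check returns false, rather than assume it was stored.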

Additionally, as far as I can tell, you are using acks=1 with unclean leader election enabled (the default in 0.10.2.1, changed to disabled as of 0.11.0.0).
This setup allows even acknowledged messages to be lost on leader failure.

If you are trying to set up a resilient Kafka cluster, please disable unclean leader election, set acks to something stronger than 1 ("all" is the reasonable choice) and, of course, check that the messages were sent.
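
As a rough sketch of the producer-side part of that advice, the settings could be assembled like this before being passed to a KafkaProducer. The class name DurableProducerConfig and the address localhost:9092 are placeholders, not values taken from the reporter's docker-compose.yml. Unclean leader election is a broker/topic setting (unclean.leader.election.enable=false) and cannot be set from the producer, so it appears only as a comment.

```java
import java.util.Properties;

// Hypothetical holder for producer settings aimed at durability.
final class DurableProducerConfig {

    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Wait for all in-sync replicas to acknowledge, instead of acks=1
        // (leader only).
        props.setProperty("acks", "all");
        // Serializers are required by KafkaProducer; String is an assumption here.
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Not a producer setting: on the brokers (or per topic), also set
        //   unclean.leader.election.enable=false
        return props;
    }

    public static void main(String[] args) {
        // Placeholder address; substitute the cluster's real bootstrap servers.
        Properties props = build("localhost:9092");
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```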

> Lost data when the leader is disconnected.
> ------------------------------------------
>
>                 Key: KAFKA-5546
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5546
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer 
>    Affects Versions: 0.10.2.1
>            Reporter: Björn Eriksson
>         Attachments: kafka-failure-log.txt
>
>
> We've noticed that if the leader's networking is deconfigured (with {{ifconfig eth0 down}}), the producer doesn't notice this and doesn't immediately connect to the newly elected leader.
> {{docker-compose.yml}} and test runner are at https://github.com/owbear/kafka-network-failure-tests with sample test output at https://github.com/owbear/kafka-network-failure-tests/blob/master/README.md#sample-results
> I was expecting a transparent failover to the new leader.
> The attached log shows that while the producer produced values between {{12:37:33}} and {{12:37:54}}, there's a gap between {{12:37:41}} and {{12:37:50}} where no values were stored in the log after the network was taken down at {{12:37:42}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)