Posted to dev@kafka.apache.org by "Guozhang Wang (JIRA)" <ji...@apache.org> on 2014/09/05 00:02:25 UTC

[jira] [Resolved] (KAFKA-1193) Data loss if broker is killed using kill -9

     [ https://issues.apache.org/jira/browse/KAFKA-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guozhang Wang resolved KAFKA-1193.
----------------------------------
    Resolution: Fixed

> Data loss if broker is killed using kill -9
> -------------------------------------------
>
>                 Key: KAFKA-1193
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1193
>             Project: Kafka
>          Issue Type: Bug
>          Components: replication
>    Affects Versions: 0.8.0, 0.8.1
>         Environment: Centos 6.3
>            Reporter: Hanish Bansal
>             Fix For: 0.8.2
>
>
> We have a Kafka cluster of 2 nodes (using Kafka 0.8.0).
> Replication Factor: 2
> Number of partitions: 2
> Actual Behaviour:
> -------------------------
> If the leader node (one of the two nodes) goes down, data loss occurs.
> Steps to Reproduce:
> ------------------------------
> 1. Create a 2-node Kafka cluster with replication factor 2.
> 2. Start the Kafka cluster.
> 3. Create a topic, for example "test-trunk111".
> 4. Restart any one node.
> 5. Check the topic status using the kafka-list-topic tool.
> The topic ISR status is:
> topic: test-trunk111    partition: 0    leader: 0    replicas: 1,0    isr: 0,1
> topic: test-trunk111    partition: 1    leader: 0    replicas: 0,1    isr: 0,1
> If there is only one broker in the ISR list, wait a while and check the topic's ISR status again. There should be 2 brokers in the ISR list.
> 6. Start producing data.
> 7. Kill the leader node (broker-0 in our case) while data is being produced.
> 8. After all data has been produced, start a consumer.
> 9. Observe the behaviour. There is data loss.
> After the leader goes down, the topic ISR status is:
> topic: test-trunk111    partition: 0    leader: 1    replicas: 1,0    isr: 1
> topic: test-trunk111    partition: 1    leader: 1    replicas: 0,1    isr: 1
> We tried the following to avoid data loss:
> ----------------------------------------------------------------
> 1. Configured "request.required.acks=-1" in the producer configuration because, according to the documentation at http://kafka.apache.org/documentation.html#producerconfigs, setting this value to -1 guarantees that no messages will be lost.
> 2. Increased the "message.send.max.retries" from 3 to 10 in producer configuration.
> 3. Set "controlled.shutdown.enable" to true in broker configuration.
> 4. Tested with Kafka-0.8.1 after applying patch KAFKA-1188.patch available on https://issues.apache.org/jira/browse/KAFKA-1188 
> None of the above helps when the leader node is killed using "kill -9 <pid>".
> Expected Behaviour:
> ----------------------------
> No data should be lost.
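
For reference, the producer settings listed in the report could be collected as below. This is only a sketch using java.util.Properties; the class name, the method name and the broker addresses are hypothetical, and "metadata.broker.list" is assumed as the 0.8 producer's broker setting (the report does not mention it). Only "request.required.acks" and "message.send.max.retries" come from the report itself.

```java
import java.util.Properties;

public class ProducerSettingsSketch {
    // Builds the Kafka 0.8 producer properties named in the report.
    // brokerList is a placeholder such as "host1:9092,host2:9092" (assumption).
    static Properties reportedSettings(String brokerList) {
        Properties props = new Properties();
        props.setProperty("metadata.broker.list", brokerList);
        // -1: the leader waits for all in-sync replicas before acknowledging.
        props.setProperty("request.required.acks", "-1");
        // Raised from the default of 3 to 10, as described in the report.
        props.setProperty("message.send.max.retries", "10");
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical broker addresses, for illustration only.
        Properties props = reportedSettings("broker0:9092,broker1:9092");
        props.list(System.out);
    }
}
```

Note that "controlled.shutdown.enable" is a broker-side setting and so does not belong in the producer properties.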

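Step 5 of the reproduction asks to wait until both brokers are back in the ISR list before producing. A rough way to check that from a status line is sketched below. It assumes the line layout shown in the report ("replicas: ... isr: ..."); the class and method names are hypothetical and not part of any Kafka tool.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IsrCheckSketch {
    // Matches the "replicas: 1,0    isr: 0,1" tail of a status line.
    private static final Pattern STATUS =
            Pattern.compile("replicas:\\s*([\\d,]+)\\s+isr:\\s*([\\d,]+)");

    // Splits a comma-separated broker id list into a set.
    static Set<String> ids(String csv) {
        return new LinkedHashSet<>(Arrays.asList(csv.split(",")));
    }

    // True when every assigned replica also appears in the ISR list.
    static boolean fullyInSync(String statusLine) {
        Matcher m = STATUS.matcher(statusLine);
        if (!m.find()) {
            throw new IllegalArgumentException("unrecognized status line: " + statusLine);
        }
        return ids(m.group(2)).containsAll(ids(m.group(1)));
    }

    public static void main(String[] args) {
        // Status before the leader is killed (from the report).
        System.out.println(fullyInSync(
                "topic: test-trunk111    partition: 0    leader: 0    replicas: 1,0    isr: 0,1"));
        // Status after the leader is killed (from the report).
        System.out.println(fullyInSync(
                "topic: test-trunk111    partition: 0    leader: 1    replicas: 1,0    isr: 1"));
    }
}
```

With the two status lines from the report, the first is fully in sync and the second is not, since broker 0 has dropped out of the ISR.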


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)