Posted to users@kafka.apache.org by Fathima Amara <fa...@wso2.com> on 2017/05/16 10:17:18 UTC

Data loss after a Kafka broker restart scenario.

Hi all,

I am using Kafka 0.10.0.1 (Scala 2.11 build) and Zookeeper 3.4.8.
I have a cluster of 4 servers (A, B, C, D) running one Kafka broker on each of them, and one Zookeeper server on server A. Data is initially produced from server A using a Kafka producer, passes through servers B, C, D where it is processed, and finally reaches server A again, where it is consumed using a Kafka consumer.

Topics created at the end of each processing stage have 2 partitions with a replication factor of 3. Other configurations include,
unclean.leader.election.enable=false
acks=all
retries=0
I let the producer run for a while on server A, then kill one of the Kafka brokers in the cluster (B, C, or D) while data processing takes place, and restart it. When consuming at the end on server A, I notice that a considerable amount of data is lost, and the amount varies on each run; for example, out of an input of 1 million events, 5,930 events were lost.
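For clarity, the settings listed above, written out in properties form (note that unclean.leader.election.enable is a broker/topic-level setting, while acks and retries are producer settings):

```properties
# Broker/topic-level: never elect an out-of-sync replica as leader,
# which prevents one class of data loss on broker failure.
unclean.leader.election.enable=false

# Producer settings used in the test runs described above:
acks=all   # wait for all in-sync replicas to acknowledge each write
retries=0  # no retries: any transient send error loses that message
```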

Is the reason for this that the Kafka producer does not guarantee exactly-once processing, or is it due to something else? What other reasons can cause data loss?

Re: Data loss after a Kafka broker restart scenario.

Posted by Tom Crayford <tc...@heroku.com>.
Fathima,

In 0.11 there will be such a mechanism (see KIP-98), but in current
versions you have to accept the duplicates if you don't want to lose
messages.
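For anyone reading this later: once 0.11 is released, the KIP-98 idempotent producer can be enabled with a configuration along these lines (a sketch; check the 0.11 documentation for the exact defaults and constraints):

```properties
# Kafka 0.11+ only (KIP-98): the broker deduplicates retried produce
# requests per partition, so retries no longer introduce duplicates.
enable.idempotence=true
# Idempotence requires acks=all and a non-zero retries value:
acks=all
retries=2147483647
# Conservative: one in-flight request preserves ordering on retry.
max.in.flight.requests.per.connection=1
```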

On Wed, May 17, 2017 at 5:31 AM, Fathima Amara <fa...@wso2.com> wrote:

> Hi Mathieu,
>
> Thanks for replying. I've already tried setting "retries" to higher
> values (maximum up to 3). Since this introduces duplicates which I do not
> require, I brought the "retries" value back to 0. I would like to know
> whether there is a way to achieve an "exactly-once" guarantee having
> increased the retries value?
>
> Fathima
>

Re: Data loss after a Kafka broker restart scenario.

Posted by Fathima Amara <fa...@wso2.com>.
Hi Mathieu,

Thanks for replying. I've already tried setting "retries" to higher values (maximum up to 3). Since this introduces duplicates which I do not require, I brought the "retries" value back to 0. I would like to know whether there is a way to achieve an "exactly-once" guarantee having increased the retries value?

Fathima

Re: Data loss after a Kafka broker restart scenario.

Posted by Mathieu Fenniak <ma...@replicon.com>.
Hi Fathima,

Setting "retries=0" on the producer means that if an attempt to produce a
message encounters an error, that message is lost.  The producer is likely
to encounter intermittent errors when you kill one broker in the cluster.

I'd suggest trying this test with a higher value for "retries".  Note that
you'll only be guaranteed at-least-once processing, not exactly-once.
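To make at-least-once delivery tolerable downstream, the usual approach is to deduplicate on the consumer side. A minimal sketch, assuming each event carries a unique event_id assigned by the producer (a hypothetical application-level field, not something Kafka provides):

```python
def deduplicate(events, seen=None):
    """Yield each event exactly once, skipping redelivered duplicates."""
    if seen is None:
        seen = set()  # in production this would need to be bounded/persistent
    for event in events:
        event_id = event["event_id"]
        if event_id in seen:
            continue  # a producer retry delivered this event more than once
        seen.add(event_id)
        yield event

# Example: event 2 was redelivered after a broker restart.
stream = [
    {"event_id": 1, "value": "a"},
    {"event_id": 2, "value": "b"},
    {"event_id": 2, "value": "b"},  # duplicate caused by a producer retry
    {"event_id": 3, "value": "c"},
]
unique = list(deduplicate(stream))
```

The trade-off is that the consumer must track IDs it has already seen, typically with a bounded or persistent store rather than the unbounded in-memory set used here.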

Mathieu


On Tue, May 16, 2017 at 4:17 AM, Fathima Amara <fa...@wso2.com> wrote:

>
> Hi all,
>
> I am using Kafka 0.10.0.1 (Scala 2.11 build) and Zookeeper 3.4.8.
> I have a cluster of 4 servers (A, B, C, D) running one Kafka broker on each
> of them, and one Zookeeper server on server A. Data is initially produced
> from server A using a Kafka producer, passes through servers B, C, D where
> it is processed, and finally reaches server A again, where it is consumed
> using a Kafka consumer.
>
> Topics created at the end of each processing stage have 2 partitions with
> a replication factor of 3. Other configurations include,
> unclean.leader.election.enable=false
> acks=all
> retries=0
> I let the producer run for a while on server A, then kill one of the Kafka
> brokers in the cluster (B, C, or D) while data processing takes place, and
> restart it. When consuming at the end on server A, I notice that a
> considerable amount of data is lost, and the amount varies on each run; for
> example, out of an input of 1 million events, 5,930 events were lost.
>
> Is the reason for this that the Kafka producer does not guarantee
> exactly-once processing, or is it due to something else? What other reasons
> can cause data loss?
>