Posted to users@kafka.apache.org by Scott Clasen <sc...@heroku.com> on 2013/08/22 21:50:01 UTC

message loss

So it looks like there is a Jepsen post coming on Kafka 0.8 replication,
based on this that's circulating on Twitter: https://www.refheap.com/17932/raw

Understanding that Kafka isn't designed particularly to be partition
tolerant, the result is not completely surprising.

But my question is, is there something that can be done about the lost
messages?

From my understanding, when broker n1 comes back online, what currently
happens is that the messages that were only on n1 are truncated/tossed
while n1 is rejoining the ISR. Please correct me if this is not accurate.

Would it instead be possible to do something else with them, like sending
them to an internal lost-messages topic, or to a log file where some manual
intervention could be done on them? Or a configuration property like
replay.truncated.messages=true could be set, where the broker would send
the lost messages back onto the topic after rejoining the ISR?
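
As a rough sketch of that manual-intervention path (the lost-messages topic
and the replay property above are hypothetical, not existing Kafka features,
and this uses the newer Java clients rather than the 0.8 ones), a small tool
could drain the salvaged records and re-publish them onto the original topic:

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class ReplayTruncated {
        public static void main(String[] args) {
            Properties consumerProps = new Properties();
            consumerProps.put("bootstrap.servers", "localhost:9092");
            consumerProps.put("group.id", "truncated-replay");
            consumerProps.put("key.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            consumerProps.put("value.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");

            Properties producerProps = new Properties();
            producerProps.put("bootstrap.servers", "localhost:9092");
            producerProps.put("acks", "all"); // wait for the full ISR on the re-publish
            producerProps.put("key.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");
            producerProps.put("value.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");

            // "lost-messages" and "original-topic" are placeholders for this sketch.
            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(consumerProps);
                 KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(producerProps)) {
                consumer.subscribe(List.of("lost-messages"));
                while (true) {
                    ConsumerRecords<byte[], byte[]> salvaged = consumer.poll(Duration.ofSeconds(1));
                    for (ConsumerRecord<byte[], byte[]> rec : salvaged) {
                        // Re-publish each salvaged record onto the topic it was truncated from.
                        producer.send(new ProducerRecord<>("original-topic", rec.key(), rec.value()));
                    }
                    producer.flush();
                    consumer.commitSync(); // only mark salvaged records done once re-published
                }
            }
        }
    }

Once the truncated records are preserved somewhere, replaying them is
ordinary produce/consume work; whether re-publishing duplicates or
reordered records is acceptable would still be an application-level decision.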

Re: message loss

Posted by Neha Narkhede <ne...@gmail.com>.
I agree with you. If we include that knob, applications can choose their
consistency-vs-availability tradeoff according to their respective
requirements. I will file a JIRA for this.

Thanks,
Neha



Re: message loss

Posted by Scott Clasen <sc...@heroku.com>.
+1 for that knob on a per-topic basis; choosing consistency over availability would open Kafka to more use cases, no?

Sent from my iPhone


Re: message loss

Posted by Neha Narkhede <ne...@gmail.com>.
Scott,

Kafka replication aims to guarantee that committed writes are not lost. In
other words, as long as leadership can be transitioned to a broker that was
in the ISR, no data will be lost. For increased availability, if there are
no other brokers in the ISR, we fall back to electing a broker that is not
caught up with the current leader as the new leader. IMO, this is the real
problem that the post is complaining about.
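
A producer only learns whether a write was committed if it asks for
acknowledgement from the full ISR; with weaker ack settings a write can be
acknowledged and still lost on failover. A minimal sketch with the newer
Java producer client (in the 0.8-era producer the equivalent of acks=all is
request.required.acks=-1); broker address and topic are placeholders:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;

    public class CommittedWrite {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
            props.put("acks", "all"); // acknowledge only once the full ISR has the record
            props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Blocking on the future surfaces failures instead of silently dropping the write.
                RecordMetadata md = producer
                    .send(new ProducerRecord<>("some-topic", "key", "value"))
                    .get();
                System.out.printf("committed to %s-%d at offset %d%n",
                    md.topic(), md.partition(), md.offset());
            }
        }
    }

With acks=1 the send can succeed even though only the leader has the
record, which is where the availability-over-consistency tradeoff starts.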

Let me explain his test in more detail:

1. The first part of the test partitions the leader (n1) from the other
brokers (n2-n5). The leader shrinks the ISR to just itself and ends up
taking n writes. This is not a problem by itself. Once the partition is
resolved, n2-n5 would catch up from the leader and no writes would be lost,
since n1 would continue to serve as the leader.
2. The problem starts in the second part of the test, where it partitions
the leader (n1) from ZooKeeper. This causes the unclean leader election
(mentioned above), which causes Kafka to lose data.

We thought about this while designing replication, but never ended up
including the feature that would allow some applications to pick
consistency over availability. Basically, we could let applications pick
some topics for which the controller will never attempt unclean leader
election. The result is that Kafka would reject writes and mark the
partition offline, instead of moving leadership to a broker that is not in
the ISR and losing the writes.
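
A sketch of what flipping that knob off for a single topic could look like,
assuming it is exposed as a per-topic config named
unclean.leader.election.enable (the name later Kafka releases actually use)
and set through the much newer AdminClient API; the topic name here is just
an example:

    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AlterConfigOp;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;

    public class DisableUncleanElection {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder broker

            try (AdminClient admin = AdminClient.create(props)) {
                ConfigResource topic =
                    new ConfigResource(ConfigResource.Type.TOPIC, "payments"); // example topic
                // Prefer unavailability over electing a non-ISR replica for this topic.
                AlterConfigOp disable = new AlterConfigOp(
                    new ConfigEntry("unclean.leader.election.enable", "false"),
                    AlterConfigOp.OpType.SET);
                admin.incrementalAlterConfigs(Map.of(topic, List.of(disable)))
                     .all().get();
            }
        }
    }

With the knob off, a partition whose ISR is empty simply goes offline until
an in-sync replica returns, which is exactly the consistency-over-availability
choice discussed above.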

I think if we included this knob, the tests that aphyr (Jepsen) ran would
make more sense.

Thanks,
Neha

