Posted to users@kafka.apache.org by Christopher Vollick <ch...@shopify.com.INVALID> on 2018/08/28 19:24:46 UTC

Consumer High Water Mark and min.insync.replicas

Hello!

I haven’t had a lot of luck looking around on the internet, and I suspect the answer I want doesn’t exist.

Let’s say I have replication factor 3 and min.insync.replicas=1 (the default).
That means I can get into a situation where two brokers fall behind and we no longer have any real redundancy, because only one of my brokers has all the data.
But all producers and consumers will be happy.
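(For concreteness, here’s roughly how that topic looks if I sketch it with the Java AdminClient; the topic name and bootstrap address are just placeholders.)

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;

    public class CreateTopicSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            try (AdminClient admin = AdminClient.create(props)) {
                // 3 replicas, but only 1 of them has to be in sync for writes to be accepted
                NewTopic topic = new NewTopic("my-topic", 3, (short) 3)
                        .configs(Collections.singletonMap("min.insync.replicas", "1"));
                admin.createTopics(Collections.singletonList(topic)).all().get();
            }
        }
    }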

Then if I lose that one broker and it never comes back, we’re stuck: we have to enable unclean leader election, data is lost, and consumers have to reset their offsets. Bad.

So, let’s say I instead have replication factor 3 and min.insync.replicas=2.
This means that if we lose 2 brokers, any producer with acks=all will fail to send because there wouldn’t be enough in-sync replicas.
That’s great, but it assumes the producer is the only one who cares about that kind of thing.
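(Here’s a sketch of the producer side as I understand it, using the plain Java client; the topic name and bootstrap address are placeholders again. With acks=all and min.insync.replicas=2, the send fails instead of quietly landing on a single replica.)

    import java.util.Properties;
    import java.util.concurrent.ExecutionException;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.errors.NotEnoughReplicasException;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class AcksAllSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            // Wait for all in-sync replicas to acknowledge the write
            props.put(ProducerConfig.ACKS_CONFIG, "all");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                try {
                    producer.send(new ProducerRecord<>("my-topic", "key", "value")).get();
                } catch (ExecutionException e) {
                    if (e.getCause() instanceof NotEnoughReplicasException) {
                        // Fewer than min.insync.replicas replicas are in sync, so the broker refuses the write
                        System.err.println("Not enough in-sync replicas: " + e.getCause().getMessage());
                    } else {
                        throw e;
                    }
                }
            }
        }
    }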

What if my consumer wants to be sure that its offsets won’t be reset before it consumes?
Is there an equivalent consumer option I haven’t seen yet that says “don’t send me things if they’re not replicated”?

Or even better, is there some way I, as a Kafka cluster operator (but not a producer or consumer directly), can harden everything and configure the brokers not to move the High Water Mark unless a certain amount of replication has been reached?

Like, if the producer sets acks=all, then do what would already work, but if they set acks=1 or acks=0 then allow those to “succeed” as they currently do, but _don’t_ move the High Water Mark to allow consumers to consume until after replication has been done.
Yes, those producers might lose messages if they never get replicated, but they chose to.
But I’m the one who has to enable unclean leader election when something goes wrong, and the consumers have to tolerate duplicates or message loss when I do and their offsets reset.
Is there nothing I can do to fix this?
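(For context, these are the only knobs I’ve found so far, sketched here with the AdminClient against a placeholder topic; as far as I can tell they only protect acks=all producers, not the High Water Mark for acks=0/1 writes.)

    import java.util.Arrays;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.Config;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;

    public class HardenTopicSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            try (AdminClient admin = AdminClient.create(props)) {
                ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-topic");
                // Note: alterConfigs replaces the topic's whole override set, so include everything you want kept
                Config hardened = new Config(Arrays.asList(
                        // acks=all writes are refused unless at least 2 replicas are in sync
                        new ConfigEntry("min.insync.replicas", "2"),
                        // never hand leadership to an out-of-sync replica
                        new ConfigEntry("unclean.leader.election.enable", "false")));
                admin.alterConfigs(Collections.singletonMap(topic, hardened)).all().get();
            }
        }
    }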

How do I make Kafka-the-system more willing to sacrifice Availability for Consistency?