Posted to dev@kafka.apache.org by "Joe Stein (JIRA)" <ji...@apache.org> on 2014/07/25 08:12:40 UTC

[jira] [Comment Edited] (KAFKA-1555) provide strong consistency with reasonable availability

    [ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074122#comment-14074122 ] 

Joe Stein edited comment on KAFKA-1555 at 7/25/14 6:11 AM:
-----------------------------------------------------------

The simplification and guarantee provided by acks=-1 work properly if the client implements things in a way that works best for them.

Kafka guarantees durability across R-1 broker failures (with replication factor R) without message loss if you use acks=-1, so that is the foundation we build upon.
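
For concreteness, here is a minimal sketch of that setting with the Java producer; the broker address and topic name are made-up placeholders, and on the old 0.8-era producer the same knob is spelled request.required.acks=-1.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class AcksAllExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // hypothetical broker
            // "all" is the same as -1: the leader waits for the full ISR before acking.
            props.put(ProducerConfig.ACKS_CONFIG, "all");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                      "org.apache.kafka.common.serialization.StringSerializer");
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                      "org.apache.kafka.common.serialization.StringSerializer");
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // get() blocks until the broker has acknowledged per the acks setting.
                producer.send(new ProducerRecord<>("events", "key", "value")).get();
            }
        }
    }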

When we (the client application, a.k.a. the producer) do not want "message loss", set up at least 3 data centers (you need a majority of data centers for the ZooKeeper ensemble to work, which requires at least three). Then write to two topics, each with acks=-1. The broker leaders for the partitions of each of these topics that you are writing to MUST be in different data centers. After synchronously writing to both, "join" their results (I say "join" since your specific implementation may use another function name like subscribe or onComplete). You then get a durable write from the client's perspective, with no data lost once it has had a successful "transaction".
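
A minimal sketch of that dual write, assuming the producer configured above and two hypothetical topics (dc1-events and dc2-events) whose partition leaders have been pinned to different data centers; the "join" here is simply waiting on both send futures before telling the caller the write is durable.

    import java.util.concurrent.Future;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;

    public class DualDcWrite {
        // Returns only when BOTH acks=-1 writes have been acknowledged;
        // throws if either fails, so the caller never sees a half-done write.
        static void durableWrite(KafkaProducer<String, String> producer, String value)
                throws Exception {
            Future<RecordMetadata> dc1 = producer.send(new ProducerRecord<>("dc1-events", value));
            Future<RecordMetadata> dc2 = producer.send(new ProducerRecord<>("dc2-events", value));
            dc1.get();  // "join" on the data-center-1 leader
            dc2.get();  // "join" on the data-center-2 leader
        }
    }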

If that doesn't work for you, then you must accept "data loss" upon the failure of at least one data center (which has nothing to do with Kafka).

This can (and should) be further extended to make sure that at least two racks in each data center get the data.

If you are not concerned with being able to sustain the loss of 1 data center plus the loss of 1 rack in another available data center (at the same time), then this is not the solution for you.

Now, all of what I just said is manual (making sure replicas and partitions land in different data centers and racks) and static, but it works with the Kafka tools out of the box plus some (lots of) software engineering effort (scripting and a couple of late nights burning the midnight oil).
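
As one illustration of the manual/static part, the sketch below pins every replica of a topic to explicitly chosen broker ids (the broker ids and the DC/rack mapping are hypothetical). The AdminClient API shown here is newer than the 0.8.1 tooling this thread is about; at that time you would express the same placement with kafka-reassign-partitions.sh and a reassignment JSON file, but the idea is identical: you, not Kafka, decide which broker (and therefore which data center and rack) holds each replica, and the first broker listed is the preferred leader.

    import java.util.*;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;

    public class PinnedPlacement {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // hypothetical broker

            // Hypothetical layout: brokers 1-3 in DC1, 4-6 in DC2, 7-9 in DC3,
            // each on a different rack. Partition 0 gets one replica per DC,
            // with broker 1 (DC1) as the preferred leader.
            Map<Integer, List<Integer>> assignment = new HashMap<>();
            assignment.put(0, Arrays.asList(1, 4, 7));

            try (AdminClient admin = AdminClient.create(props)) {
                admin.createTopics(Collections.singletonList(
                        new NewTopic("dc1-events", assignment))).all().get();
            }
        }
    }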

As far as producing this to multiple topics goes, there are lots of ways to do it on the client side in parallel without much (if any) latency cost (with a little more software engineering). You can use Akka or anything else that gives you a future after sending off multiple events, then subscribe to them with onComplete before deciding (returning to your caller or fulfilling the promise) that the "message" has been "written".
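
A non-blocking variant of the same dual write, as one hedged sketch of the future/onComplete pattern described above (this uses plain CompletableFuture rather than Akka; the topic names are the same hypothetical ones as before):

    import java.util.concurrent.CompletableFuture;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;

    public class AsyncDualWrite {
        // Bridge the producer callback into a CompletableFuture so both sends
        // can run in parallel and be joined without blocking a thread.
        static CompletableFuture<RecordMetadata> sendAsync(
                KafkaProducer<String, String> producer, String topic, String value) {
            CompletableFuture<RecordMetadata> promise = new CompletableFuture<>();
            producer.send(new ProducerRecord<>(topic, value), (metadata, exception) -> {
                if (exception != null) promise.completeExceptionally(exception);
                else promise.complete(metadata);
            });
            return promise;
        }

        // The returned future (the "promise" to the caller) completes only after
        // BOTH acks=-1 writes succeed; if either fails, it fails as a whole.
        static CompletableFuture<Void> durableWrite(
                KafkaProducer<String, String> producer, String value) {
            return CompletableFuture.allOf(
                    sendAsync(producer, "dc1-events", value),
                    sendAsync(producer, "dc2-events", value));
        }
    }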

Hopefully this makes sense. I appreciate that not all (or even most) use cases need this multi-data-center plus multi-rack level of durability, but it works with Kafka if you go by what Kafka guarantees, without trying to change it unnecessarily.

If there are defects we should fix them, but going up and down this thread I am getting a bit lost about what (if anything) we should be doing to the current code now.




> provide strong consistency with reasonable availability
> -------------------------------------------------------
>
>                 Key: KAFKA-1555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1555
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller
>    Affects Versions: 0.8.1.1
>            Reporter: Jiang Wu
>            Assignee: Neha Narkhede
>
> In a mission-critical application, we expect that a Kafka cluster with 3 brokers can satisfy two requirements:
> 1. When 1 broker is down, no message loss or service blocking happens.
> 2. In worse cases, such as when two brokers are down, service can be blocked, but no message loss happens.
> We found that the current Kafka version (0.8.1.1) cannot achieve these requirements due to three of its behaviors:
> 1. When choosing a new leader from 2 followers in the ISR, the one with fewer messages may be chosen as the leader.
> 2. Even when replica.lag.max.messages=0, a follower can stay in the ISR when it has fewer messages than the leader.
> 3. The ISR can contain only 1 broker, so acknowledged messages may be stored on only 1 broker.
> The following is an analytical proof. 
> We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at the beginning, all 3 replicas, leader A, followers B and C, are in sync, i.e., they have the same messages and are all in ISR.
> According to the value of request.required.acks (acks for short), there are the following cases.
> 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirements.
> 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this time, although C hasn't received m, C is still in ISR. If A is killed, C can be elected as the new leader, and consumers will miss m.
> 3. acks=-1. B and C restart and are removed from ISR. Producer sends a message m to A, and receives an acknowledgement. Disk failure happens in A before B and C replicate m. Message m is lost.
> In summary, any existing configuration cannot satisfy the requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)