Posted to dev@kafka.apache.org by Jay Kreps <ja...@gmail.com> on 2013/04/10 07:34:05 UTC

interesting paper on log replication

Very similar in design to kafka replication
https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf

-Jay

Re: interesting paper on log replication

Posted by Jun Rao <ju...@gmail.com>.

On the last point, in general, Kafka logs are identical among replicas. The
only case in which they may not be identical is when an unclean leader
election happens (i.e., a leader has to be elected from a replica that is not
in the in-sync replica set). Unclean leader election should be rare, since it
requires multiple broker failures around the same time.
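
As a rough illustration of the clean versus unclean distinction, the sketch
below picks a leader for a single partition. It is not Kafka's controller
code: the function name `elect_leader`, its parameters, and the
`allow_unclean` flag are hypothetical names chosen for this example, and
broker ids are plain integers.

```python
from typing import Optional, Sequence, Set


def elect_leader(assigned: Sequence[int], isr: Set[int], live: Set[int],
                 allow_unclean: bool = False) -> Optional[int]:
    """Pick a leader for one partition (illustrative only).

    Clean election: the first assigned replica that is both live and in the
    in-sync replica set. Unclean election: if no such replica exists and
    allow_unclean is True, fall back to the first live replica, whose log
    may be missing committed entries.
    """
    for broker in assigned:
        if broker in isr and broker in live:
            return broker
    if allow_unclean:
        for broker in assigned:
            if broker in live:
                return broker
    return None  # partition stays offline until an in-sync replica returns
```

With replicas [1, 2, 3] and an in-sync set of {1}, losing broker 1 leaves no
clean candidate; only the unclean fallback would elect broker 2, which is the
case where logs may end up differing.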

Thanks,

Jun


On Tue, Apr 16, 2013 at 11:14 AM, Neha Narkhede <ne...@gmail.com> wrote:

> More notable differences from Kafka as far as the log replication protocol
> is concerned:
>
> - Raft considers log entries committed as soon as they are
> acknowledged by a majority of the servers in a cluster. Compare this
> to Kafka, where we have the notion of "in-sync followers" that are
> required to ack every batch of log entries in order for the leader to
> commit them.
>
> - Raft uses the election voting mechanism to select a new leader
> whose log is as “up-to-date” as possible. Compare this to Kafka, where
> we can pick ANY of the "in-sync followers" as the next leader; we
> typically pick the first one in the list. For simplicity and fewer RPCs,
> we do not try to pick the "in-sync follower" with the largest log.
>
> - In Raft, when the follower's log diverges from the leader's (in the
> presence of multiple failures), the leader-follower RPC truncates the
> follower's log back to the divergence point and then replicates the rest
> of the leader's log. This ensures that the follower's log is identical to
> the leader's in such situations. Compare this to Kafka, where
> we allow the logs to diverge and don't reconcile them perfectly.
>
> Thanks,
> Neha
>
> On Sun, Apr 14, 2013 at 9:42 PM, Jun Rao <ju...@gmail.com> wrote:
> > Thanks for the link. This paper provides an alternative implementation
> > that is similar to the one in Zookeeper. The key difference seems to be
> > that the paper's approach supports membership reconfiguration.
> >
> > Kafka replication is simpler because it separates the leader election
> > part from log replication. Such separation has a few benefits: (1) the
> > leader election part is easier to implement by leveraging a consensus
> > system (e.g. Zookeeper); (2) the log format is simpler since the log
> > itself is not used for leader election; (3) the replication factor for
> > the log is decoupled from the number of parties required for leader
> > election (e.g., in Kafka we can choose a replication factor of 2 for the
> > log while using an ensemble of 5 for a Zookeeper cluster).
> >
> > Both Raft and Zookeeper are solving a harder problem than Kafka
> > replication: since they are themselves implementing a consensus service,
> > they have no consensus service to rely upon for their own leader election.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Tue, Apr 9, 2013 at 10:34 PM, Jay Kreps <ja...@gmail.com> wrote:
> >
> >> Very similar in design to kafka replication
> >>
> >> https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf
> >>
> >> -Jay
> >>
>

Re: interesting paper on log replication

Posted by Neha Narkhede <ne...@gmail.com>.

More notable differences from Kafka as far as the log replication protocol
is concerned:

- Raft considers log entries committed as soon as they are
acknowledged by a majority of the servers in a cluster. Compare this
to Kafka, where we have the notion of "in-sync followers" that are
required to ack every batch of log entries in order for the leader to
commit them.

- Raft uses the election voting mechanism to select a new leader
whose log is as “up-to-date” as possible. Compare this to Kafka, where
we can pick ANY of the "in-sync followers" as the next leader; we
typically pick the first one in the list. For simplicity and fewer RPCs,
we do not try to pick the "in-sync follower" with the largest log.

- In Raft, when the follower's log diverges from the leader's (in the
presence of multiple failures), the leader-follower RPC truncates the
follower's log back to the divergence point and then replicates the rest
of the leader's log. This ensures that the follower's log is identical to
the leader's in such situations. Compare this to Kafka, where
we allow the logs to diverge and don't reconcile them perfectly.
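
The commit rules and the Raft-style log repair described above can be
sketched in a few lines of Python. This is a simplified illustration, not
code from either system: the function names, the representation of a log as
a list of `(term, value)` tuples, and the ack sets are all assumptions made
for the example, and Raft's additional term-based commit rules are ignored.

```python
from typing import List, Set, Tuple

Entry = Tuple[int, str]  # hypothetical log entry: (term, value)


def raft_committed(ack_count: int, cluster_size: int) -> bool:
    """Majority rule: committed once more than half of the servers hold
    the entry. ack_count includes the leader itself."""
    return ack_count > cluster_size // 2


def kafka_committed(acked: Set[int], isr: Set[int]) -> bool:
    """ISR rule: committed once every replica in the in-sync set has acked.
    Both sets are assumed to include the leader."""
    return isr <= acked


def reconcile_follower(leader_log: List[Entry],
                       follower_log: List[Entry]) -> List[Entry]:
    """Raft-style repair: cut the follower back to the divergence point,
    then append the remainder of the leader's log."""
    common = 0
    for leader_entry, follower_entry in zip(leader_log, follower_log):
        if leader_entry != follower_entry:
            break
        common += 1
    return follower_log[:common] + leader_log[common:]
```

For instance, with five servers a Raft entry counts as committed after three
acknowledgements, whereas a Kafka batch with an in-sync set of {1, 2} is
committed only once both 1 and 2 have it, however many replicas are assigned.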

Thanks,
Neha

On Sun, Apr 14, 2013 at 9:42 PM, Jun Rao <ju...@gmail.com> wrote:
> Thanks for the link. This paper provides an alternative implementation
> that is similar to the one in Zookeeper. The key difference seems to be
> that the paper's approach supports membership reconfiguration.
>
> Kafka replication is simpler because it separates the leader election part
> from log replication. Such separation has a few benefits: (1) the leader
> election part is easier to implement by leveraging a consensus system (e.g.
> Zookeeper); (2) the log format is simpler since the log itself is not used
> for leader election; (3) the replication factor for the log is decoupled
> from the number of parties required for leader election (e.g., in Kafka we
> can choose a replication factor of 2 for the log while using an ensemble of
> 5 for a Zookeeper cluster).
>
> Both Raft and Zookeeper are solving a harder problem than Kafka
> replication: since they are themselves implementing a consensus service,
> they have no consensus service to rely upon for their own leader election.
>
> Thanks,
>
> Jun
>
>
> On Tue, Apr 9, 2013 at 10:34 PM, Jay Kreps <ja...@gmail.com> wrote:
>
>> Very similar in design to kafka replication
>> https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf
>>
>> -Jay
>>

Re: interesting paper on log replication

Posted by Jun Rao <ju...@gmail.com>.

Thanks for the link. This paper provides an alternative implementation
that is similar to the one in Zookeeper. The key difference seems to be
that the paper's approach supports membership reconfiguration.

Kafka replication is simpler because it separates the leader election part
from log replication. Such separation has a few benefits: (1) the leader
election part is easier to implement by leveraging a consensus system (e.g.
Zookeeper); (2) the log format is simpler since the log itself is not used
for leader election; (3) the replication factor for the log is decoupled
from the number of parties required for leader election (e.g., in Kafka we
can choose a replication factor of 2 for the log while using an ensemble of
5 for a Zookeeper cluster).
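
Point (3) can be made concrete with some quick arithmetic. The helper names
below are hypothetical; the sketch only assumes the usual rules that a
majority quorum of n members survives floor((n - 1) / 2) failures, while a
log with replication factor r keeps its committed data as long as one
in-sync replica survives, i.e. up to r - 1 failures.

```python
def majority_quorum_tolerance(members: int) -> int:
    """Failures a majority-vote group of the given size can survive."""
    return (members - 1) // 2


def isr_tolerance(replication_factor: int) -> int:
    """Failures a log can survive without losing committed data, assuming
    at least one in-sync replica remains."""
    return replication_factor - 1


print(majority_quorum_tolerance(5))  # 2: a 5-node ensemble survives 2 failures
print(isr_tolerance(2))              # 1: a log with 2 replicas survives 1 failure
print(majority_quorum_tolerance(2))  # 0: a 2-member majority quorum survives none
```

The last line shows why the decoupling matters: if the log itself had to form
the quorum, a replication factor of 2 would tolerate no failures at all.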

Both Raft and Zookeeper are solving a harder problem than Kafka
replication: since they are themselves implementing a consensus service,
they have no consensus service to rely upon for their own leader election.

Thanks,

Jun

On Tue, Apr 9, 2013 at 10:34 PM, Jay Kreps <ja...@gmail.com> wrote:

> Very similar in design to kafka replication
> https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf
>
> -Jay
>