Posted to users@kafka.apache.org by Jun Rao <ju...@gmail.com> on 2013/02/05 20:21:18 UTC

kafka replication blog

I just posted the following blog on Kafka replication. This may answer some
of the questions that a few people have asked in the mailing list before.

http://engineering.linkedin.com/kafka/intra-cluster-replication-apache-kafka

Thanks,

Jun

Re: kafka replication blog

Posted by Michal Haris <mi...@visualdna.com>.
Thanks, makes sense.
On Feb 8, 2013 4:00 PM, "Jun Rao" <ju...@gmail.com> wrote:

> That's right. If you are partitioning by key, that means you insist a
> message has to go to a certain partition, whether it's available or not.
> So, if a partition is not available, we will drop the message for the
> partition in the async mode and consistently throw an exception to the
> caller in the sync mode.
>
> Thanks,
>
> Jun
>
> On Fri, Feb 8, 2013 at 1:41 AM, Michal Haris <michal.haris@visualdna.com
> >wrote:
>
> > So if the producers are partitioning by key, we have to have replication if
> > we don't want messages to get lost when a partition goes down, right?
> > Thanks
> > On Feb 8, 2013 5:12 AM, "Jun Rao" <ju...@gmail.com> wrote:
> >
> > > We have fixed this issue in 0.8. With replication factor 1, if the producer
> > > doesn't care about partitioning by key, messages will be sent to partitions
> > > that are currently available.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Thu, Feb 7, 2013 at 3:11 PM, Michal Haris <
> michal.haris@visualdna.com
> > > >wrote:
> > >
> > > > Same here, a summary was needed as we have a fairly large ecosystem of
> > > > multiple 0.7.2 clusters and I am planning to test an upgrade to 0.8.
> > > > However, one thing creeping at the back of my mind regarding 0.8 is
> > > > something I spotted in a thread a few weeks ago, namely that the
> > > > rebalance behaviour of producers is not as robust as in 0.7.x without
> > > > replication, and I remember there was no designed solution at the time.
> > > > Any news here? Basically our use case doesn't require replication, but
> > > > logical offsets and some other things introduced would solve some problems.
> > > > On Feb 7, 2013 7:11 PM, "Vaibhav Puranik" <vp...@gmail.com>
> wrote:
> > > >
> > > > > Same here. Thanks a lot Jun.
> > > > >
> > > > > Regards,
> > > > > Vaibhav
> > > > >
> > > > > On Thu, Feb 7, 2013 at 10:38 AM, Felix GV <fe...@mate1inc.com>
> > wrote:
> > > > >
> > > > > > Thanks Jun!
> > > > > >
> > > > > > I hadn't been following the discussions regarding 0.8 and
> > replication
> > > > > for a
> > > > > > little while and this was a great post to refresh my memory and
> get
> > > up
> > > > to
> > > > > > speed on the current replication architecture's design.
> > > > > >
> > > > > > --
> > > > > > Felix
> > > > > >
> > > > > >
> > > > > > On Tue, Feb 5, 2013 at 2:21 PM, Jun Rao <ju...@gmail.com>
> wrote:
> > > > > >
> > > > > > > I just posted the following blog on Kafka replication. This may
> > > > answer
> > > > > > some
> > > > > > > of the questions that a few people have asked in the mailing
> list
> > > > > before.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://engineering.linkedin.com/kafka/intra-cluster-replication-apache-kafka
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: kafka replication blog

Posted by Michal Haris <mi...@visualdna.com>.
Thanks Jun, makes sense.

Re: kafka replication blog

Posted by Jun Rao <ju...@gmail.com>.
That's right. If you are partitioning by key, that means you insist a
message has to go to a certain partition, whether it's available or not.
So, if a partition is not available, we will drop the message for the
partition in the async mode and consistently throw an exception to the
caller in the sync mode.
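
As a rough illustration of the behaviour described above (a sketch only, not
the actual producer code): the names send_keyed, PartitionUnavailableError,
available and dropped are made up for this example, and any retry logic a real
producer may apply is left out.

```python
class PartitionUnavailableError(Exception):
    """Raised to a sync caller when the target partition is unavailable."""


def send_keyed(message, partition, available, sync, dropped):
    """Hypothetical send path for a keyed message; returns True if sent.

    partition -- the partition the key maps to (already decided by the key)
    available -- set of partition ids that are currently available
    sync      -- True for sync mode, False for async mode
    dropped   -- list collecting messages that async mode gave up on
    """
    if partition in available:
        return True  # stand-in for the actual network send
    if sync:
        # Sync mode: the caller always sees the failure as an exception.
        raise PartitionUnavailableError(f"partition {partition} is unavailable")
    # Async mode: the message for the unavailable partition is dropped.
    dropped.append(message)
    return False
```

In sync mode the caller can catch the exception and decide what to do with the
message; in async mode the only trace in this sketch is the dropped list.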

Thanks,

Jun

On Fri, Feb 8, 2013 at 1:41 AM, Michal Haris <mi...@visualdna.com> wrote:

> So if the producers are partitioning by key, we have to have replication if
> we don't want messages to get lost when a partition goes down, right?
> Thanks
> On Feb 8, 2013 5:12 AM, "Jun Rao" <ju...@gmail.com> wrote:
>
> > We have fixed this issue in 0.8. With replication factor 1, if the producer
> > doesn't care about partitioning by key, messages will be sent to partitions
> > that are currently available.
> >
> > Thanks,
> >
> > Jun
> >
> > On Thu, Feb 7, 2013 at 3:11 PM, Michal Haris <michal.haris@visualdna.com
> > >wrote:
> >
> > > Same here, a summary was needed as we have a fairly large ecosystem of
> > > multiple 0.7.2 clusters and I am planning to test an upgrade to 0.8.
> > > However, one thing creeping at the back of my mind regarding 0.8 is
> > > something I spotted in a thread a few weeks ago, namely that the
> > > rebalance behaviour of producers is not as robust as in 0.7.x without
> > > replication, and I remember there was no designed solution at the time.
> > > Any news here? Basically our use case doesn't require replication, but
> > > logical offsets and some other things introduced would solve some problems.
> > > On Feb 7, 2013 7:11 PM, "Vaibhav Puranik" <vp...@gmail.com> wrote:
> > >
> > > > Same here. Thanks a lot Jun.
> > > >
> > > > Regards,
> > > > Vaibhav
> > > >
> > > > On Thu, Feb 7, 2013 at 10:38 AM, Felix GV <fe...@mate1inc.com>
> wrote:
> > > >
> > > > > Thanks Jun!
> > > > >
> > > > > I hadn't been following the discussions regarding 0.8 and
> replication
> > > > for a
> > > > > little while and this was a great post to refresh my memory and get
> > up
> > > to
> > > > > speed on the current replication architecture's design.
> > > > >
> > > > > --
> > > > > Felix
> > > > >
> > > > >
> > > > > On Tue, Feb 5, 2013 at 2:21 PM, Jun Rao <ju...@gmail.com> wrote:
> > > > >
> > > > > > I just posted the following blog on Kafka replication. This may
> > > answer
> > > > > some
> > > > > > of the questions that a few people have asked in the mailing list
> > > > before.
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://engineering.linkedin.com/kafka/intra-cluster-replication-apache-kafka
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: kafka replication blog

Posted by Michal Haris <mi...@visualdna.com>.
So if the producers are partitioning by key, we have to have replication if
we don't want messages to get lost when a partition goes down, right?
Thanks
On Feb 8, 2013 5:12 AM, "Jun Rao" <ju...@gmail.com> wrote:

> We have fixed this issue in 0.8. With replication factor 1, if the producer
> doesn't care about partitioning by key, messages will be sent to partitions
> that are currently available.
>
> Thanks,
>
> Jun
>
> On Thu, Feb 7, 2013 at 3:11 PM, Michal Haris <michal.haris@visualdna.com
> >wrote:
>
> > Same here, a summary was needed as we have a fairly large ecosystem of
> > multiple 0.7.2 clusters and I am planning to test an upgrade to 0.8.
> > However, one thing creeping at the back of my mind regarding 0.8 is
> > something I spotted in a thread a few weeks ago, namely that the
> > rebalance behaviour of producers is not as robust as in 0.7.x without
> > replication, and I remember there was no designed solution at the time.
> > Any news here? Basically our use case doesn't require replication, but
> > logical offsets and some other things introduced would solve some problems.
> > On Feb 7, 2013 7:11 PM, "Vaibhav Puranik" <vp...@gmail.com> wrote:
> >
> > > Same here. Thanks a lot Jun.
> > >
> > > Regards,
> > > Vaibhav
> > >
> > > On Thu, Feb 7, 2013 at 10:38 AM, Felix GV <fe...@mate1inc.com> wrote:
> > >
> > > > Thanks Jun!
> > > >
> > > > I hadn't been following the discussions regarding 0.8 and replication
> > > for a
> > > > little while and this was a great post to refresh my memory and get
> up
> > to
> > > > speed on the current replication architecture's design.
> > > >
> > > > --
> > > > Felix
> > > >
> > > >
> > > > On Tue, Feb 5, 2013 at 2:21 PM, Jun Rao <ju...@gmail.com> wrote:
> > > >
> > > > > I just posted the following blog on Kafka replication. This may
> > answer
> > > > some
> > > > > of the questions that a few people have asked in the mailing list
> > > before.
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> http://engineering.linkedin.com/kafka/intra-cluster-replication-apache-kafka
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > >
> > >
> >
>

Re: kafka replication blog

Posted by Jun Rao <ju...@gmail.com>.
We have fixed this issue in 0.8. With replication factor 1, if the producer
doesn't care about partitioning by key, messages will be sent to partitions
that are currently available.
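
To make the distinction concrete, here is a minimal sketch of how partition
selection could look. It is not the 0.8 producer's code: choose_partition, the
CRC32 hash and the available set are all assumptions made for the example.

```python
import random
import zlib


def choose_partition(key, num_partitions, available):
    """Pick a partition for one message (hypothetical helper, not Kafka's API).

    key            -- partitioning key as a str, or None if the producer
                      does not care which partition is used
    num_partitions -- total number of partitions in the topic
    available      -- set of partition ids that are currently available
    """
    if key is not None:
        # Keyed message: the key fixes the partition, whether or not that
        # partition is currently available. CRC32 is an assumed hash.
        return zlib.crc32(key.encode("utf-8")) % num_partitions
    if not available:
        raise RuntimeError("no partition is currently available")
    # Unkeyed message: any currently available partition is acceptable.
    return random.choice(sorted(available))
```

With key=None the result always comes from the available set, so a single
unavailable partition does not block the producer; with a key the result is
the same no matter which partitions are up.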

Thanks,

Jun

On Thu, Feb 7, 2013 at 3:11 PM, Michal Haris <mi...@visualdna.com> wrote:

> Same here, a summary was needed as we have a fairly large ecosystem of
> multiple 0.7.2 clusters and I am planning to test an upgrade to 0.8.
> However, one thing creeping at the back of my mind regarding 0.8 is
> something I spotted in a thread a few weeks ago, namely that the
> rebalance behaviour of producers is not as robust as in 0.7.x without
> replication, and I remember there was no designed solution at the time.
> Any news here? Basically our use case doesn't require replication, but
> logical offsets and some other things introduced would solve some problems.
> On Feb 7, 2013 7:11 PM, "Vaibhav Puranik" <vp...@gmail.com> wrote:
>
> > Same here. Thanks a lot Jun.
> >
> > Regards,
> > Vaibhav
> >
> > On Thu, Feb 7, 2013 at 10:38 AM, Felix GV <fe...@mate1inc.com> wrote:
> >
> > > Thanks Jun!
> > >
> > > I hadn't been following the discussions regarding 0.8 and replication
> > for a
> > > little while and this was a great post to refresh my memory and get up
> to
> > > speed on the current replication architecture's design.
> > >
> > > --
> > > Felix
> > >
> > >
> > > On Tue, Feb 5, 2013 at 2:21 PM, Jun Rao <ju...@gmail.com> wrote:
> > >
> > > > I just posted the following blog on Kafka replication. This may
> answer
> > > some
> > > > of the questions that a few people have asked in the mailing list
> > before.
> > > >
> > > >
> > > >
> > >
> >
> http://engineering.linkedin.com/kafka/intra-cluster-replication-apache-kafka
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > >
> >
>

Re: kafka replication blog

Posted by Michal Haris <mi...@visualdna.com>.
Same here, a summary was needed as we have a fairly large ecosystem of multiple
0.7.2 clusters and I am planning to test an upgrade to 0.8.
However, one thing creeping at the back of my mind regarding 0.8 is
something I spotted in a thread a few weeks ago, namely that the
rebalance behaviour of producers is not as robust as in 0.7.x without
replication, and I remember there was no designed solution at the time. Any
news here? Basically our use case doesn't require replication, but logical
offsets and some other things introduced would solve some problems.
On Feb 7, 2013 7:11 PM, "Vaibhav Puranik" <vp...@gmail.com> wrote:

> Same here. Thanks a lot Jun.
>
> Regards,
> Vaibhav
>
> On Thu, Feb 7, 2013 at 10:38 AM, Felix GV <fe...@mate1inc.com> wrote:
>
> > Thanks Jun!
> >
> > I hadn't been following the discussions regarding 0.8 and replication
> for a
> > little while and this was a great post to refresh my memory and get up to
> > speed on the current replication architecture's design.
> >
> > --
> > Felix
> >
> >
> > On Tue, Feb 5, 2013 at 2:21 PM, Jun Rao <ju...@gmail.com> wrote:
> >
> > > I just posted the following blog on Kafka replication. This may answer
> > some
> > > of the questions that a few people have asked in the mailing list
> before.
> > >
> > >
> > >
> >
> http://engineering.linkedin.com/kafka/intra-cluster-replication-apache-kafka
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> >
>

Re: kafka replication blog

Posted by Vaibhav Puranik <vp...@gmail.com>.
Same here. Thanks a lot Jun.

Regards,
Vaibhav

On Thu, Feb 7, 2013 at 10:38 AM, Felix GV <fe...@mate1inc.com> wrote:

> Thanks Jun!
>
> I hadn't been following the discussions regarding 0.8 and replication for a
> little while and this was a great post to refresh my memory and get up to
> speed on the current replication architecture's design.
>
> --
> Felix
>
>
> On Tue, Feb 5, 2013 at 2:21 PM, Jun Rao <ju...@gmail.com> wrote:
>
> > I just posted the following blog on Kafka replication. This may answer
> some
> > of the questions that a few people have asked in the mailing list before.
> >
> >
> >
> http://engineering.linkedin.com/kafka/intra-cluster-replication-apache-kafka
> >
> > Thanks,
> >
> > Jun
> >
>

Re: kafka replication blog

Posted by Felix GV <fe...@mate1inc.com>.
Thanks Jun!

I hadn't been following the discussions regarding 0.8 and replication for a
little while and this was a great post to refresh my memory and get up to
speed on the current replication architecture's design.

--
Felix


On Tue, Feb 5, 2013 at 2:21 PM, Jun Rao <ju...@gmail.com> wrote:

> I just posted the following blog on Kafka replication. This may answer some
> of the questions that a few people have asked in the mailing list before.
>
>
> http://engineering.linkedin.com/kafka/intra-cluster-replication-apache-kafka
>
> Thanks,
>
> Jun
>
