Posted to users@kafka.apache.org by Rob Withers <re...@gmail.com> on 2013/05/17 03:53:38 UTC

are commitOffsets botched to zookeeper?

We are calling commitOffsets after every message consumption.  It looks to be ~60% slower, with 29 partitions.  If a single KafkaStream thread is from a connector, and there are 29 partitions, then commitOffsets sends 29 offset updates, correct?  Are these offset updates batched in one send to zookeeper?

thanks,
rob
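
A rough way to see why committing after every message is costly is to count offset writes. The sketch below is a back-of-the-envelope model, not the Kafka consumer API: `offset_writes` and its parameters are hypothetical names, and it assumes each commit issues one write per partition, which is what the question above supposes.

```python
def offset_writes(num_messages, num_partitions, commit_every):
    """Estimate offset writes, assuming one write per partition per commit.

    This is an assumption for illustration; the thread does not confirm
    whether unchanged partitions are skipped.
    """
    commits = -(-num_messages // commit_every)  # ceiling division
    return commits * num_partitions


per_message = offset_writes(10_000, 29, commit_every=1)
periodic = offset_writes(10_000, 29, commit_every=1_000)
print(per_message, periodic)  # 290000 290
```

Under these assumptions, committing every 1,000 messages instead of every message cuts the write count by three orders of magnitude, at the cost of re-reading up to 1,000 messages after a crash.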

RE: are commitOffsets botched to zookeeper?

Posted by Neha Narkhede <ne...@gmail.com>.
Zookeeper 3.4.x is API compatible. However to get full benefits, we will
have to change kafka code to use the batch API that zookeeper 3.4.x
provides. Also, we use zkclient library to interface with zookeeper. We
might have to patch that to use zookeeper 3.4.x APIs.

Thanks,
Neha
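
To illustrate what a batch API would buy, here is a minimal in-memory sketch that counts client requests. `FakeCoordinator`, its `set_data` and `multi` methods, and the path layout are stand-ins invented for this example; they are not the ZooKeeper or zkclient API.

```python
class FakeCoordinator:
    """In-memory stand-in for a ZooKeeper ensemble; counts client requests."""

    def __init__(self):
        self.data = {}
        self.requests = 0

    def set_data(self, path, value):
        # One request per node update.
        self.requests += 1
        self.data[path] = value

    def multi(self, ops):
        # Apply every (path, value) pair as a single request.
        self.requests += 1
        staged = dict(self.data)
        for path, value in ops:
            staged[path] = value
        self.data = staged


def offset_path(group, topic, partition):
    # Illustrative layout only.
    return "/consumers/{}/offsets/{}/{}".format(group, topic, partition)


def commit_individually(store, group, offsets):
    for (topic, partition), offset in sorted(offsets.items()):
        store.set_data(offset_path(group, topic, partition), str(offset))


def commit_batched(store, group, offsets):
    ops = [(offset_path(group, t, p), str(o))
           for (t, p), o in sorted(offsets.items())]
    store.multi(ops)


offsets = {("events", p): 1000 + p for p in range(29)}
one_by_one, batched = FakeCoordinator(), FakeCoordinator()
commit_individually(one_by_one, "group-a", offsets)
commit_batched(batched, "group-a", offsets)
print(one_by_one.requests, batched.requests)  # 29 1
```

Both stores end up with the same data; only the number of round trips differs, which is the saving a batch write would offer for a 29-partition commit.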
On May 20, 2013 9:36 AM, "Seshadri, Balaji" <Ba...@dish.com>
wrote:

> Hi Neha,
>
> Is moving to zookeeper 3.4.x a big change?
>
> Can you please explain which parts it affects, the consumer API for example?
>
> Thanks,
>
> Balaji
> -----Original Message-----
> From: Neha Narkhede [mailto:neha.narkhede@gmail.com]
> Sent: Friday, May 17, 2013 7:35 AM
> To: users@kafka.apache.org
> Subject: RE: are commitOffsets botched to zookeeper?
>
> Upgrading to a new zookeeper version is not an easy change. Also zookeeper
> 3.3.4 is much more stable compared to 3.4.x. We think it is better not to
> club 2 big changes together. So most likely this will be a post 08 item for
> stability purposes.
>
> Thanks,
> Neha
> On May 17, 2013 6:31 AM, "Withers, Robert" <Ro...@dish.com>
> wrote:
>
> > Awesome!  Thanks for the clarification.  I would like to offer my strong
> > vote that this get tackled before a beta, to get it firmly into 0.8.
> > Stabilize everything else to the existing use, but make offset updates
> > batched.
> >
> > thanks,
> > rob
> > ________________________________________
> > From: Neha Narkhede [neha.narkhede@gmail.com]
> > Sent: Friday, May 17, 2013 7:17 AM
> > To: users@kafka.apache.org
> > Subject: RE: are commitOffsets botched to zookeeper?
> >
> > Sorry I wasn't clear. Zookeeper 3.4.x has this feature. As soon as 08 is
> > stable and released it will be worth looking into when we can use
> > zookeeper 3.4.x.
> >
> > Thanks,
> > Neha
> > On May 16, 2013 10:32 PM, "Rob Withers" <re...@gmail.com> wrote:
> >
> > > Can a request be made to zookeeper for this feature?
> > >
> > > Thanks,
> > > rob
> > >
> > > > -----Original Message-----
> > > > From: Neha Narkhede [mailto:neha.narkhede@gmail.com]
> > > > Sent: Thursday, May 16, 2013 9:53 PM
> > > > To: users@kafka.apache.org
> > > > Subject: Re: are commitOffsets botched to zookeeper?
> > > >
> > > > Currently Kafka depends on zookeeper 3.3.4 that doesn't have a batch
> > > > write api. So if you commit after every message at a high rate, it will
> > > > be slow and inefficient. Besides it will cause zookeeper performance to
> > > > degrade.
> > > >
> > > > Thanks,
> > > > Neha
> > > > On May 16, 2013 6:54 PM, "Rob Withers" <re...@gmail.com> wrote:
> > > >
> > > > > We are calling commitOffsets after every message consumption.  It
> > > > > looks to be ~60% slower, with 29 partitions.  If a single KafkaStream
> > > > > thread is from a connector, and there are 29 partitions, then
> > > > > commitOffsets sends 29 offset updates, correct?  Are these offset
> > > > > updates batched in one send to zookeeper?
> > > > >
> > > > > thanks,
> > > > > rob

RE: are commitOffsets botched to zookeeper?

Posted by "Seshadri, Balaji" <Ba...@dish.com>.
Hi Neha,

Is moving to zookeeper 3.4.x a big change?

Can you please explain which parts it affects, the consumer API for example?

Thanks,

Balaji

Re: Update: RE: are commitOffsets botched to zookeeper?

Posted by Scott Clasen <sc...@heroku.com>.
to clarify, I meant that Robert could/should store the offsets in a faster
store not that kafka should default to that :)

Thanks Neha


On Fri, May 17, 2013 at 2:22 PM, Neha Narkhede <ne...@gmail.com> wrote:

> There is no particular need for storing the offsets in zookeeper. In fact
> with Kafka 0.8, since partitions will be highly available, offsets could be
> stored in Kafka topics. However, we haven't ironed out the design for this
> yet.
>
> Thanks,
> Neha
>
>
> On Fri, May 17, 2013 at 2:19 PM, Scott Clasen <sc...@heroku.com> wrote:
>
> > afaik you dont 'have' to store the consumed offsets in zk right, this is
> > only automatic with some of the clients?
> >
> > why not store them in a data store that can write at the rate that you
> > require?
> >
> >
> > On Fri, May 17, 2013 at 2:15 PM, Withers, Robert <Robert.Withers@dish.com> wrote:
> >
> > > Update from our OPS team, regarding zookeeper 3.4.x.  Given stability,
> > > adoption of offset batching would be the only remaining bit of work to do.
> > > Still, I totally understand the restraint for 0.8...
> > >
> > >
> > > "As an exercise in upgradability of zookeeper, I did an "out-of-the-box"
> > > upgrade on Zookeeper. I downloaded a generic distribution of Apache
> > > Zookeeper and used it for the upgrade.
> > >
> > > Version included with Kafka: Zookeeper 3.3.3.
> > > Out-of-the-box version: Apache Zookeeper 3.4.5 (which I upgraded to).
> > >
> > > Running, working great. I did *not* have to wipe out the zookeeper
> > > databases. All data stayed intact.
> > >
> > > I got a new feature, which allows auto-purging of logs. This keeps OPS
> > > maintenance to a minimum."
> > >
> > >
> > > thanks,
> > > rob
> > >
> > >
> > > -----Original Message-----
> > > From: Withers, Robert [mailto:Robert.Withers@dish.com]
> > > Sent: Friday, May 17, 2013 7:38 AM
> > > To: users@kafka.apache.org
> > > Subject: RE: are commitOffsets botched to zookeeper?
> > >
> > > Fair enough, this is something to look forward to.  I appreciate the
> > > restraint you show to stay out of troubled waters.  :)
> > >
> > > thanks,
> > > rob
> > >
> > > ________________________________________
> > > From: Neha Narkhede [neha.narkhede@gmail.com]
> > > Sent: Friday, May 17, 2013 7:35 AM
> > > To: users@kafka.apache.org
> > > Subject: RE: are commitOffsets botched to zookeeper?
> > >
> > > Upgrading to a new zookeeper version is not an easy change. Also
> > zookeeper
> > > 3.3.4 is much more stable compared to 3.4.x. We think it is better not
> to
> > > club 2 big changes together. So most likely this will be a post 08 item
> > for
> > > stability purposes.
> > >
> > > Thanks,
> > > Neha
> > > On May 17, 2013 6:31 AM, "Withers, Robert" <Ro...@dish.com>
> > > wrote:
> > >
> > > > Awesome!  Thanks for the clarification.  I would like to offer my
> > > > strong vote that this get tackled before a beta, to get it firmly
> into
> > > 0.8.
> > > > Stabilize everything else to the existing use, but make offset
> updates
> > > > batched.
> > > >
> > > > thanks,
> > > > rob
> > > > ________________________________________
> > > > From: Neha Narkhede [neha.narkhede@gmail.com]
> > > > Sent: Friday, May 17, 2013 7:17 AM
> > > > To: users@kafka.apache.org
> > > > Subject: RE: are commitOffsets botched to zookeeper?
> > > >
> > > > Sorry I wasn't clear. Zookeeper 3.4.x has this feature. As soon as 08
> > > > is stable and released it will be worth looking into when we can use
> > > > zookeeper 3.4.x.
> > > >
> > > > Thanks,
> > > > Neha
> > > > On May 16, 2013 10:32 PM, "Rob Withers" <re...@gmail.com> wrote:
> > > >
> > > > > Can a request be made to zookeeper for this feature?
> > > > >
> > > > > Thanks,
> > > > > rob
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Neha Narkhede [mailto:neha.narkhede@gmail.com]
> > > > > > Sent: Thursday, May 16, 2013 9:53 PM
> > > > > > To: users@kafka.apache.org
> > > > > > Subject: Re: are commitOffsets botched to zookeeper?
> > > > > >
> > > > > > Currently Kafka depends on zookeeper 3.3.4 that doesn't have a
> > > > > > batch
> > > > > write
> > > > > > api. So if you commit after every message at a high rate, it will
> > > > > > be
> > > > slow
> > > > > and
> > > > > > inefficient. Besides it will cause zookeeper performance to
> > degrade.
> > > > > >
> > > > > > Thanks,
> > > > > > Neha
> > > > > > On May 16, 2013 6:54 PM, "Rob Withers" <re...@gmail.com>
> > wrote:
> > > > > >
> > > > > > > We are calling commitOffsets after every message consumption.
> > > > > > > It looks to be ~60% slower, with 29 partitions.  If a single
> > > > > > > KafkaStream thread is from a connector, and there are 29
> > > > > > > partitions, then commitOffsets sends 29 offset updates,
> correct?
> > > > > > > Are these offset updates batched in one send to zookeeper?
> > > > > > >
> > > > > > > thanks,
> > > > > > > rob
> > > > >
> > > > >
> > > >
> > >
> >
>

RE: Update: RE: are commitOffsets botched to zookeeper?

Posted by Rob Withers <re...@gmail.com>.
Yes, it looks spot on.

Thanks,
rob

> -----Original Message-----
> From: Alex Zuzin [mailto:carnatus@gmail.com]
> Sent: Monday, May 20, 2013 11:37 AM
> To: users@kafka.apache.org
> Subject: Re: Update: RE: are commitOffsets botched to zookeeper?
> 
> Did so. The proposal looks perfectly sensible on first reading.
> 
> I understand that the patches in
> https://issues.apache.org/jira/browse/KAFKA-657 are already in the trunk
> and scheduled for 0.8.1? Are they going out with 0.8? If not, what's ETA for
> 0.8.1?
> 
> Either way, I'm going to try my hand at backing this with MySQL and report
> the results here shortly.
> 
> --
> "If you can't conceal it well, expose it with all your might"
> Alex Zuzin
> 
> 
> On Monday, May 20, 2013 at 10:24 AM, Neha Narkhede wrote:
> 
> > No problem. You can take a look at some of the thoughts we had on improving
> > the offset storage here -
> > https://cwiki.apache.org/confluence/display/KAFKA/Offset+Management.
> > Suggestions are welcome.
> >
> > Thanks,
> > Neha
> >
> >
> > On Fri, May 17, 2013 at 2:40 PM, Alex Zuzin <carnatus@gmail.com> wrote:
> >
> > > Neha,
> > >
> > > apologies, I just re-read what I sent and realized my "you" wasn't
> > > specific enough - it meant the Kafka team ;).
> > >
> > > --
> > > "If you can't conceal it well, expose it with all your might"
> > > Alex Zuzin
> > >
> > >
> > > On Friday, May 17, 2013 at 2:25 PM, Alex Zuzin wrote:
> > >
> > > > Have you considered abstracting offset storage away so people could
> > > > implement their own?
> > > > Would you take a patch if I'd stabbed at it, and if yes, what's the
> > > > process (pardon the n00b)?
> > > >
> > > > KCBO,
> > > > --
> > > > "If you can't conceal it well, expose it with all your might"
> > > > Alex Zuzin
> > > >
> > > >
> > > > On Friday, May 17, 2013 at 2:22 PM, Neha Narkhede wrote:
> > > >
> > > > > There is no particular need for storing the offsets in zookeeper. In
> > > fact
> > > > > with Kafka 0.8, since partitions will be highly available, offsets
> > > >
> > >
> > > could be
> > > > > stored in Kafka topics. However, we haven't ironed out the design for
> > > >
> > >
> > > this
> > > > > yet.
> > > > >
> > > > > Thanks,
> > > > > Neha
> > > > >
> > > > >
> > > > > On Fri, May 17, 2013 at 2:19 PM, Scott Clasen <scott@heroku.com
> (mailto:scott@heroku.com)(mailto:
> > > scott@heroku.com (mailto:scott@heroku.com))> wrote:
> > > > >
> > > > > > afaik you dont 'have' to store the consumed offsets in zk right,
> > > this is
> > > > > > only automatic with some of the clients?
> > > > > >
> > > > > > why not store them in a data store that can write at the rate that
> > > you
> > > > > > require?
> > > > > >
> > > > > >
> > > > > > On Fri, May 17, 2013 at 2:15 PM, Withers, Robert <
> > > Robert.Withers@dish.com (mailto:Robert.Withers@dish.com)
> > > > > > > wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > > > Update from our OPS team, regarding zookeeper 3.4.x. Given
> > > stability,
> > > > > > > adoption of offset batching would be the only remaining bit of
> > > > > >
> > > > >
> > > >
> > >
> > > work to
> > > > > > >
> > > > > >
> > > > > >
> > > > > > do.
> > > > > > > Still, I totally understand the restraint for 0.8...
> > > > > > >
> > > > > > >
> > > > > > > "As exercise in upgradability of zookeeper, I did a
> > > "out-of-the"box"
> > > > > > > upgrade on Zookeeper. I downloaded a generic distribution of
> Apache
> > > > > > > Zookeeper and used it for the upgrade.
> > > > > > >
> > > > > > > Kafka included version of Zookeeper 3.3.3.
> > > > > > > Out of the box Apache Zookeeper 3.4.5 (which I upgraded to)
> > > > > > >
> > > > > > > Running, working great. I did *not* have to wipe out the
> zookeeper
> > > > > > > databases. All data stayed intact.
> > > > > > >
> > > > > > > I got a new feature, which allows auto-purging of logs. This keeps
> > > OPS
> > > > > > > maintenance to a minimum."
> > > > > > >
> > > > > > >
> > > > > > > thanks,
> > > > > > > rob
> > > > > > >
> > > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Withers, Robert [mailto:Robert.Withers@dish.com]
> > > > > > > Sent: Friday, May 17, 2013 7:38 AM
> > > > > > > To: users@kafka.apache.org (mailto:users@kafka.apache.org)
> > > > > > > Subject: RE: are commitOffsets botched to zookeeper?
> > > > > > >
> > > > > > > Fair enough, this is something to look forward to. I appreciate the
> > > > > > > restraint you show to stay out of troubled waters. :)
> > > > > > >
> > > > > > > thanks,
> > > > > > > rob
> > > > > > >
> > > > > > > ________________________________________
> > > > > > > From: Neha Narkhede [neha.narkhede@gmail.com
> (mailto:neha.narkhede@gmail.com) (mailto:
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > > neha.narkhede@gmail.com (mailto:neha.narkhede@gmail.com))]
> > > > > > > Sent: Friday, May 17, 2013 7:35 AM
> > > > > > > To: users@kafka.apache.org (mailto:users@kafka.apache.org)
> > > > > > > Subject: RE: are commitOffsets botched to zookeeper?
> > > > > > >
> > > > > > > Upgrading to a new zookeeper version is not an easy change. Also
> > > > > > zookeeper
> > > > > > > 3.3.4 is much more stable compared to 3.4.x. We think it is better
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > > not to
> > > > > > > club 2 big changes together. So most likely this will be a post 08
> > > > > >
> > > > >
> > > >
> > >
> > > item
> > > > > > >
> > > > > >
> > > > > >
> > > > > > for
> > > > > > > stability purposes.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Neha
> > > > > > > On May 17, 2013 6:31 AM, "Withers, Robert" <
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > > Robert.Withers@dish.com (mailto:Robert.Withers@dish.com)>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Awesome! Thanks for the clarification. I would like to offer my
> > > > > > > > strong vote that this get tackled before a beta, to get it
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > > firmly into
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > 0.8.
> > > > > > > > Stabilize everything else to the existing use, but make offset
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > > updates
> > > > > > > > batched.
> > > > > > > >
> > > > > > > > thanks,
> > > > > > > > rob
> > > > > > > > ________________________________________
> > > > > > > > From: Neha Narkhede [neha.narkhede@gmail.com
> (mailto:neha.narkhede@gmail.com) (mailto:
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > > neha.narkhede@gmail.com (mailto:neha.narkhede@gmail.com))]
> > > > > > > > Sent: Friday, May 17, 2013 7:17 AM
> > > > > > > > To: users@kafka.apache.org (mailto:users@kafka.apache.org)
> > > > > > > > Subject: RE: are commitOffsets botched to zookeeper?
> > > > > > > >
> > > > > > > > Sorry I wasn't clear. Zookeeper 3.4.x has this feature. As soon
> > > as 08
> > > > > > > > is stable and released it will be worth looking into when we can
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > > use
> > > > > > > > zookeeper 3.4.x.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Neha
> > > > > > > > On May 16, 2013 10:32 PM, "Rob Withers"
> <reefedjib@gmail.com (mailto:reefedjib@gmail.com)(mailto:
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > > reefedjib@gmail.com (mailto:reefedjib@gmail.com))> wrote:
> > > > > > > >
> > > > > > > > > Can a request be made to zookeeper for this feature?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > rob
> > > > > > > > >
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > From: Neha Narkhede [mailto:neha.narkhede@gmail.com]
> > > > > > > > > > Sent: Thursday, May 16, 2013 9:53 PM
> > > > > > > > > > To: users@kafka.apache.org
> (mailto:users@kafka.apache.org)
> > > > > > > > > > Subject: Re: are commitOffsets botched to zookeeper?
> > > > > > > > > >
> > > > > > > > > > Currently Kafka depends on zookeeper 3.3.4 that doesn't
> have
> > > a
> > > > > > > > > > batch
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > write
> > > > > > > > > > api. So if you commit after every message at a high rate, it
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > > will
> > > > > > > > > > be
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > slow
> > > > > > > > > and
> > > > > > > > > > inefficient. Besides it will cause zookeeper performance to
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > degrade.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Neha
> > > > > > > > > > On May 16, 2013 6:54 PM, "Rob Withers"
> <reefedjib@gmail.com (mailto:reefedjib@gmail.com)(mailto:
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > > reefedjib@gmail.com (mailto:reefedjib@gmail.com))>
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > We are calling commitOffsets after every message
> > > consumption.
> > > > > > > > > > > It looks to be ~60% slower, with 29 partitions. If a single
> > > > > > > > > > > KafkaStream thread is from a connector, and there are 29
> > > > > > > > > > > partitions, then commitOffsets sends 29 offset updates,
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > > correct?
> > > > > > > > > > > Are these offset updates batched in one send to
> zookeeper?
> > > > > > > > > > >
> > > > > > > > > > > thanks,
> > > > > > > > > > > rob
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> >
> >
> >
> 



Re: Update: RE: are commitOffsets botched to zookeeper?

Posted by Alex Zuzin <ca...@gmail.com>.
Did so. The proposal looks perfectly sensible on first reading.

I understand that the patches in https://issues.apache.org/jira/browse/KAFKA-657 are already in the trunk and scheduled for 0.8.1? Are they going out with 0.8? If not, what's ETA for 0.8.1?

Either way, I'm going to try my hand at backing this with MySQL and report the results here shortly.

-- 
"If you can't conceal it well, expose it with all your might"
Alex Zuzin
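
For readers wondering what a pluggable, SQL-backed offset store might look like, here is a minimal sketch. The `OffsetStore` interface, the `SqlOffsetStore` class, and the table schema are all hypothetical and are not part of Kafka 0.8 or the KAFKA-657 patches; SQLite from the Python standard library stands in for MySQL so the example is self-contained.

```python
import sqlite3
from abc import ABC, abstractmethod


class OffsetStore(ABC):
    """Hypothetical storage interface; Kafka does not define this."""

    @abstractmethod
    def commit(self, group, offsets):
        """Persist a {(topic, partition): offset} mapping for a group."""

    @abstractmethod
    def fetch(self, group, topic, partition):
        """Return the last committed offset, or None if there is none."""


class SqlOffsetStore(OffsetStore):
    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS consumer_offsets ("
            " grp TEXT, topic TEXT, part INTEGER, committed INTEGER,"
            " PRIMARY KEY (grp, topic, part))"
        )

    def commit(self, group, offsets):
        rows = [(group, t, p, o) for (t, p), o in offsets.items()]
        with self.conn:  # one transaction covers every partition
            self.conn.executemany(
                "INSERT OR REPLACE INTO consumer_offsets VALUES (?, ?, ?, ?)",
                rows,
            )

    def fetch(self, group, topic, partition):
        row = self.conn.execute(
            "SELECT committed FROM consumer_offsets"
            " WHERE grp = ? AND topic = ? AND part = ?",
            (group, topic, partition),
        ).fetchone()
        return row[0] if row else None


store = SqlOffsetStore(sqlite3.connect(":memory:"))
store.commit("group-a", {("events", p): 100 + p for p in range(29)})
print(store.fetch("group-a", "events", 28))  # 128
```

Writing all partitions in one transaction is the design choice that matters here: a 29-partition commit becomes a single durable write rather than 29 separate ones.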





Re: Update: RE: are commitOffsets botched to zookeeper?

Posted by Neha Narkhede <ne...@gmail.com>.
No problem. You can take a look at some of the thoughts we had on improving
the offset storage here -
https://cwiki.apache.org/confluence/display/KAFKA/Offset+Management.
Suggestions are welcome.

Thanks,
Neha
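
One idea raised in this thread is that offsets could be stored in Kafka topics, though the design had not been worked out at the time. Purely as an illustration of the general idea, and not of any actual Kafka design, the sketch below treats commits as records appended to a log and recovers the current offsets by replaying it, with later records overriding earlier ones. The record shape and the `latest_offsets` name are assumptions.

```python
def latest_offsets(log):
    """Replay an append-only list of (group, topic, partition, offset)
    records; the last record seen for each key wins."""
    state = {}
    for group, topic, partition, offset in log:
        state[(group, topic, partition)] = offset
    return state


commit_log = [
    ("group-a", "events", 0, 10),
    ("group-a", "events", 1, 7),
    ("group-a", "events", 0, 25),  # supersedes the first record
]
print(latest_offsets(commit_log)[("group-a", "events", 0)])  # 25
```

Appending is cheap and sequential, which is why a log is attractive for high-rate commits; the trade-off is that a reader must replay (or otherwise index) the log to find the latest value.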


On Fri, May 17, 2013 at 2:40 PM, Alex Zuzin <ca...@gmail.com> wrote:

> Neha,
>
> apologies, I just re-read what I sent and realized my "you" wasn't
> specific enough - it meant the Kafka team ;).
>
> --
> "If you can't conceal it well, expose it with all your might"
> Alex Zuzin
>
>
> On Friday, May 17, 2013 at 2:25 PM, Alex Zuzin wrote:
>
> > Have you considered abstracting offset storage away so people could
> implement their own?
> > Would you take a patch if I'd stabbed at it, and if yes, what's the
> process (pardon the n00b)?
> >
> > KCBO,
> > --
> > "If you can't conceal it well, expose it with all your might"
> > Alex Zuzin
> >
> >
> > On Friday, May 17, 2013 at 2:22 PM, Neha Narkhede wrote:
> >
> > > There is no particular need for storing the offsets in zookeeper. In
> fact
> > > with Kafka 0.8, since partitions will be highly available, offsets
> could be
> > > stored in Kafka topics. However, we haven't ironed out the design for
> this
> > > yet.
> > >
> > > Thanks,
> > > Neha
> > >
> > >
> > > On Fri, May 17, 2013 at 2:19 PM, Scott Clasen <scott@heroku.com> wrote:
> > >
> > > > afaik you dont 'have' to store the consumed offsets in zk right, this is
> > > > only automatic with some of the clients?
> > > >
> > > > why not store them in a data store that can write at the rate that you
> > > > require?
> > > >
> > > >
> > > > On Fri, May 17, 2013 at 2:15 PM, Withers, Robert <Robert.Withers@dish.com>
> > > > wrote:
> > > >
> > > > > Update from our OPS team, regarding zookeeper 3.4.x. Given stability,
> > > > > adoption of offset batching would be the only remaining bit of work to do.
> > > > > Still, I totally understand the restraint for 0.8...
> > > > >
> > > > >
> > > > > "As exercise in upgradability of zookeeper, I did a "out-of-the"box"
> > > > > upgrade on Zookeeper. I downloaded a generic distribution of Apache
> > > > > Zookeeper and used it for the upgrade.
> > > > >
> > > > > Kafka included version of Zookeeper 3.3.3.
> > > > > Out of the box Apache Zookeeper 3.4.5 (which I upgraded to)
> > > > >
> > > > > Running, working great. I did *not* have to wipe out the zookeeper
> > > > > databases. All data stayed intact.
> > > > >
> > > > > I got a new feature, which allows auto-purging of logs. This keeps OPS
> > > > > maintenance to a minimum."
> > > > >
> > > > >
> > > > > thanks,
> > > > > rob
> > > > >
> > > > >
> > > > > -----Original Message-----
> > > > > From: Withers, Robert [mailto:Robert.Withers@dish.com]
> > > > > Sent: Friday, May 17, 2013 7:38 AM
> > > > > To: users@kafka.apache.org
> > > > > Subject: RE: are commitOffsets botched to zookeeper?
> > > > >
> > > > > Fair enough, this is something to look forward to. I appreciate the
> > > > > restraint you show to stay out of troubled waters. :)
> > > > >
> > > > > thanks,
> > > > > rob
> > > > >
> > > > > ________________________________________
> > > > > From: Neha Narkhede [neha.narkhede@gmail.com]
> > > > > Sent: Friday, May 17, 2013 7:35 AM
> > > > > To: users@kafka.apache.org
> > > > > Subject: RE: are commitOffsets botched to zookeeper?
> > > > >
> > > > > Upgrading to a new zookeeper version is not an easy change. Also zookeeper
> > > > > 3.3.4 is much more stable compared to 3.4.x. We think it is better not to
> > > > > club 2 big changes together. So most likely this will be a post 08 item for
> > > > > stability purposes.
> > > > >
> > > > > Thanks,
> > > > > Neha
> > > > > On May 17, 2013 6:31 AM, "Withers, Robert" <Robert.Withers@dish.com> wrote:
> > > > >
> > > > > > Awesome! Thanks for the clarification. I would like to offer my
> > > > > > strong vote that this get tackled before a beta, to get it firmly into 0.8.
> > > > > > Stabilize everything else to the existing use, but make offset updates
> > > > > > batched.
> > > > > >
> > > > > > thanks,
> > > > > > rob
> > > > > > ________________________________________
> > > > > > From: Neha Narkhede [neha.narkhede@gmail.com]
> > > > > > Sent: Friday, May 17, 2013 7:17 AM
> > > > > > To: users@kafka.apache.org
> > > > > > Subject: RE: are commitOffsets botched to zookeeper?
> > > > > >
> > > > > > Sorry I wasn't clear. Zookeeper 3.4.x has this feature. As soon as 08
> > > > > > is stable and released it will be worth looking into when we can use
> > > > > > zookeeper 3.4.x.
> > > > > >
> > > > > > Thanks,
> > > > > > Neha
> > > > > > On May 16, 2013 10:32 PM, "Rob Withers" <reefedjib@gmail.com> wrote:
> > > > > >
> > > > > > > Can a request be made to zookeeper for this feature?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > rob
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Neha Narkhede [mailto:neha.narkhede@gmail.com]
> > > > > > > > Sent: Thursday, May 16, 2013 9:53 PM
> > > > > > > > To: users@kafka.apache.org
> > > > > > > > Subject: Re: are commitOffsets botched to zookeeper?
> > > > > > > >
> > > > > > > > Currently Kafka depends on zookeeper 3.3.4 that doesn't have a batch write
> > > > > > > > api. So if you commit after every message at a high rate, it will be slow and
> > > > > > > > inefficient. Besides it will cause zookeeper performance to degrade.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Neha
> > > > > > > > On May 16, 2013 6:54 PM, "Rob Withers" <reefedjib@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > We are calling commitOffsets after every message consumption.
> > > > > > > > > It looks to be ~60% slower, with 29 partitions. If a single
> > > > > > > > > KafkaStream thread is from a connector, and there are 29
> > > > > > > > > partitions, then commitOffsets sends 29 offset updates, correct?
> > > > > > > > > Are these offset updates batched in one send to zookeeper?
> > > > > > > > >
> > > > > > > > > thanks,
> > > > > > > > > rob

Re: Update: RE: are commitOffsets botched to zookeeper?

Posted by Alex Zuzin <ca...@gmail.com>.
Neha,

apologies, I just re-read what I sent and realized my "you" wasn't specific enough - it meant the Kafka team ;). 

-- 
"If you can't conceal it well, expose it with all your might"
Alex Zuzin


On Friday, May 17, 2013 at 2:25 PM, Alex Zuzin wrote:

> Have you considered abstracting offset storage away so people could implement their own? 
> Would you take a patch if I'd stabbed at it, and if yes, what's the process (pardon the n00b)?
> 
> KCBO,
> -- 
> "If you can't conceal it well, expose it with all your might"
> Alex Zuzin
> 
> 
> On Friday, May 17, 2013 at 2:22 PM, Neha Narkhede wrote:
> 
> > There is no particular need for storing the offsets in zookeeper. In fact
> > with Kafka 0.8, since partitions will be highly available, offsets could be
> > stored in Kafka topics. However, we haven't ironed out the design for this
> > yet.
> > 
> > Thanks,
> > Neha
> > 
> > 
> > On Fri, May 17, 2013 at 2:19 PM, Scott Clasen <scott@heroku.com (mailto:scott@heroku.com)> wrote:
> > 
> > > afaik you dont 'have' to store the consumed offsets in zk right, this is
> > > only automatic with some of the clients?
> > > 
> > > why not store them in a data store that can write at the rate that you
> > > require?
> > > 
> > > 
> > > On Fri, May 17, 2013 at 2:15 PM, Withers, Robert <Robert.Withers@dish.com>
> > > wrote:
> > >
> > > > Update from our OPS team, regarding zookeeper 3.4.x. Given stability,
> > > > adoption of offset batching would be the only remaining bit of work to do.
> > > > Still, I totally understand the restraint for 0.8...
> > > > 
> > > > 
> > > > "As exercise in upgradability of zookeeper, I did a "out-of-the"box"
> > > > upgrade on Zookeeper. I downloaded a generic distribution of Apache
> > > > Zookeeper and used it for the upgrade.
> > > > 
> > > > Kafka included version of Zookeeper 3.3.3.
> > > > Out of the box Apache Zookeeper 3.4.5 (which I upgraded to)
> > > > 
> > > > Running, working great. I did *not* have to wipe out the zookeeper
> > > > databases. All data stayed intact.
> > > > 
> > > > I got a new feature, which allows auto-purging of logs. This keeps OPS
> > > > maintenance to a minimum."
> > > > 
> > > > 
> > > > thanks,
> > > > rob
> > > > 
> > > > 
> > > > -----Original Message-----
> > > > From: Withers, Robert [mailto:Robert.Withers@dish.com]
> > > > Sent: Friday, May 17, 2013 7:38 AM
> > > > To: users@kafka.apache.org (mailto:users@kafka.apache.org)
> > > > Subject: RE: are commitOffsets botched to zookeeper?
> > > > 
> > > > Fair enough, this is something to look forward to. I appreciate the
> > > > restraint you show to stay out of troubled waters. :)
> > > > 
> > > > thanks,
> > > > rob
> > > > 
> > > > ________________________________________
> > > > From: Neha Narkhede [neha.narkhede@gmail.com (mailto:neha.narkhede@gmail.com)]
> > > > Sent: Friday, May 17, 2013 7:35 AM
> > > > To: users@kafka.apache.org (mailto:users@kafka.apache.org)
> > > > Subject: RE: are commitOffsets botched to zookeeper?
> > > > 
> > > > Upgrading to a new zookeeper version is not an easy change. Also zookeeper
> > > > 3.3.4 is much more stable compared to 3.4.x. We think it is better not to
> > > > club 2 big changes together. So most likely this will be a post 08 item for
> > > > stability purposes.
> > > > 
> > > > Thanks,
> > > > Neha
> > > > On May 17, 2013 6:31 AM, "Withers, Robert" <Robert.Withers@dish.com (mailto:Robert.Withers@dish.com)>
> > > > wrote:
> > > > 
> > > > > Awesome! Thanks for the clarification. I would like to offer my
> > > > > strong vote that this get tackled before a beta, to get it firmly into 0.8.
> > > > > Stabilize everything else to the existing use, but make offset updates
> > > > > batched.
> > > > > 
> > > > > thanks,
> > > > > rob
> > > > > ________________________________________
> > > > > From: Neha Narkhede [neha.narkhede@gmail.com (mailto:neha.narkhede@gmail.com)]
> > > > > Sent: Friday, May 17, 2013 7:17 AM
> > > > > To: users@kafka.apache.org (mailto:users@kafka.apache.org)
> > > > > Subject: RE: are commitOffsets botched to zookeeper?
> > > > > 
> > > > > Sorry I wasn't clear. Zookeeper 3.4.x has this feature. As soon as 08
> > > > > is stable and released it will be worth looking into when we can use
> > > > > zookeeper 3.4.x.
> > > > > 
> > > > > Thanks,
> > > > > Neha
> > > > > On May 16, 2013 10:32 PM, "Rob Withers" <reefedjib@gmail.com (mailto:reefedjib@gmail.com)> wrote:
> > > > > 
> > > > > > Can a request be made to zookeeper for this feature?
> > > > > > 
> > > > > > Thanks,
> > > > > > rob
> > > > > > 
> > > > > > > -----Original Message-----
> > > > > > > From: Neha Narkhede [mailto:neha.narkhede@gmail.com]
> > > > > > > Sent: Thursday, May 16, 2013 9:53 PM
> > > > > > > To: users@kafka.apache.org (mailto:users@kafka.apache.org)
> > > > > > > Subject: Re: are commitOffsets botched to zookeeper?
> > > > > > > 
> > > > > > > Currently Kafka depends on zookeeper 3.3.4 that doesn't have a batch write
> > > > > > > api. So if you commit after every message at a high rate, it will be slow and
> > > > > > > inefficient. Besides it will cause zookeeper performance to degrade.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Neha
> > > > > > > On May 16, 2013 6:54 PM, "Rob Withers" <reefedjib@gmail.com> wrote:
> > > > > > > 
> > > > > > > > We are calling commitOffsets after every message consumption.
> > > > > > > > It looks to be ~60% slower, with 29 partitions. If a single
> > > > > > > > KafkaStream thread is from a connector, and there are 29
> > > > > > > > partitions, then commitOffsets sends 29 offset updates, correct?
> > > > > > > > Are these offset updates batched in one send to zookeeper?
> > > > > > > > 
> > > > > > > > thanks,
> > > > > > > > rob
> > > > > > > > 


Re: Update: RE: are commitOffsets botched to zookeeper?

Posted by Alex Zuzin <ca...@gmail.com>.
Have you considered abstracting offset storage away so people could implement their own?
Would you take a patch if I took a stab at it, and if so, what's the process (pardon the n00b question)?
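
To make the question concrete, here is a rough sketch of what a pluggable offset store might look like. It is only an illustration in Python: the OffsetStore interface, the InMemoryOffsetStore backend and the consume() helper are hypothetical names, not part of any Kafka API. The consumer loop only talks to commit() and fetch(), and it commits every N messages rather than after every message to keep the number of writes down.

```python
from abc import ABC, abstractmethod


class OffsetStore(ABC):
    """Hypothetical storage interface; not part of any Kafka release."""

    @abstractmethod
    def commit(self, group, offsets):
        """Persist {(topic, partition): offset} for a consumer group."""

    @abstractmethod
    def fetch(self, group, topic, partition):
        """Return the last committed offset, or None if there is none."""


class InMemoryOffsetStore(OffsetStore):
    """Toy backend that keeps offsets in a dict and counts commit calls."""

    def __init__(self):
        self._offsets = {}
        self.commit_calls = 0

    def commit(self, group, offsets):
        self.commit_calls += 1
        for (topic, partition), offset in offsets.items():
            self._offsets[(group, topic, partition)] = offset

    def fetch(self, group, topic, partition):
        return self._offsets.get((group, topic, partition))


def consume(messages, store, group, commit_every=100):
    """Process (topic, partition, offset) tuples, committing every N messages."""
    pending = {}
    for count, (topic, partition, offset) in enumerate(messages, start=1):
        # Application-specific processing of the message would go here.
        pending[(topic, partition)] = offset + 1  # next offset to read
        if count % commit_every == 0:
            store.commit(group, pending)
            pending = {}
    if pending:  # flush whatever is left at the end
        store.commit(group, pending)
```

Any backend (ZooKeeper, a database, a Kafka topic) would then just be another OffsetStore subclass; the commit_every value is a trade-off between write load and how many messages get re-read after a crash.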

KCBO,
-- 
"If you can't conceal it well, expose it with all your might"
Alex Zuzin


On Friday, May 17, 2013 at 2:22 PM, Neha Narkhede wrote:

> There is no particular need for storing the offsets in zookeeper. In fact
> with Kafka 0.8, since partitions will be highly available, offsets could be
> stored in Kafka topics. However, we haven't ironed out the design for this
> yet.
> 
> Thanks,
> Neha
> 
> 
> On Fri, May 17, 2013 at 2:19 PM, Scott Clasen <scott@heroku.com (mailto:scott@heroku.com)> wrote:
> 
> > afaik you dont 'have' to store the consumed offsets in zk right, this is
> > only automatic with some of the clients?
> > 
> > why not store them in a data store that can write at the rate that you
> > require?
> > 
> > 
> > On Fri, May 17, 2013 at 2:15 PM, Withers, Robert <Robert.Withers@dish.com>
> > wrote:
> >
> > > Update from our OPS team, regarding zookeeper 3.4.x. Given stability,
> > > adoption of offset batching would be the only remaining bit of work to do.
> > > Still, I totally understand the restraint for 0.8...
> > > 
> > > 
> > > "As exercise in upgradability of zookeeper, I did a "out-of-the"box"
> > > upgrade on Zookeeper. I downloaded a generic distribution of Apache
> > > Zookeeper and used it for the upgrade.
> > > 
> > > Kafka included version of Zookeeper 3.3.3.
> > > Out of the box Apache Zookeeper 3.4.5 (which I upgraded to)
> > > 
> > > Running, working great. I did *not* have to wipe out the zookeeper
> > > databases. All data stayed intact.
> > > 
> > > I got a new feature, which allows auto-purging of logs. This keeps OPS
> > > maintenance to a minimum."
> > > 
> > > 
> > > thanks,
> > > rob
> > > 
> > > 
> > > -----Original Message-----
> > > From: Withers, Robert [mailto:Robert.Withers@dish.com]
> > > Sent: Friday, May 17, 2013 7:38 AM
> > > To: users@kafka.apache.org (mailto:users@kafka.apache.org)
> > > Subject: RE: are commitOffsets botched to zookeeper?
> > > 
> > > Fair enough, this is something to look forward to. I appreciate the
> > > restraint you show to stay out of troubled waters. :)
> > > 
> > > thanks,
> > > rob
> > > 
> > > ________________________________________
> > > From: Neha Narkhede [neha.narkhede@gmail.com (mailto:neha.narkhede@gmail.com)]
> > > Sent: Friday, May 17, 2013 7:35 AM
> > > To: users@kafka.apache.org (mailto:users@kafka.apache.org)
> > > Subject: RE: are commitOffsets botched to zookeeper?
> > > 
> > > Upgrading to a new zookeeper version is not an easy change. Also zookeeper
> > > 3.3.4 is much more stable compared to 3.4.x. We think it is better not to
> > > club 2 big changes together. So most likely this will be a post 08 item for
> > > stability purposes.
> > > 
> > > Thanks,
> > > Neha
> > > On May 17, 2013 6:31 AM, "Withers, Robert" <Robert.Withers@dish.com (mailto:Robert.Withers@dish.com)>
> > > wrote:
> > > 
> > > > Awesome! Thanks for the clarification. I would like to offer my
> > > > strong vote that this get tackled before a beta, to get it firmly into 0.8.
> > > > Stabilize everything else to the existing use, but make offset updates
> > > > batched.
> > > > 
> > > > thanks,
> > > > rob
> > > > ________________________________________
> > > > From: Neha Narkhede [neha.narkhede@gmail.com (mailto:neha.narkhede@gmail.com)]
> > > > Sent: Friday, May 17, 2013 7:17 AM
> > > > To: users@kafka.apache.org (mailto:users@kafka.apache.org)
> > > > Subject: RE: are commitOffsets botched to zookeeper?
> > > > 
> > > > Sorry I wasn't clear. Zookeeper 3.4.x has this feature. As soon as 08
> > > > is stable and released it will be worth looking into when we can use
> > > > zookeeper 3.4.x.
> > > > 
> > > > Thanks,
> > > > Neha
> > > > On May 16, 2013 10:32 PM, "Rob Withers" <reefedjib@gmail.com (mailto:reefedjib@gmail.com)> wrote:
> > > > 
> > > > > Can a request be made to zookeeper for this feature?
> > > > > 
> > > > > Thanks,
> > > > > rob
> > > > > 
> > > > > > -----Original Message-----
> > > > > > From: Neha Narkhede [mailto:neha.narkhede@gmail.com]
> > > > > > Sent: Thursday, May 16, 2013 9:53 PM
> > > > > > To: users@kafka.apache.org (mailto:users@kafka.apache.org)
> > > > > > Subject: Re: are commitOffsets botched to zookeeper?
> > > > > > 
> > > > > > Currently Kafka depends on zookeeper 3.3.4 that doesn't have a batch write
> > > > > > api. So if you commit after every message at a high rate, it will be slow and
> > > > > > inefficient. Besides it will cause zookeeper performance to degrade.
> > > > > >
> > > > > > Thanks,
> > > > > > Neha
> > > > > > On May 16, 2013 6:54 PM, "Rob Withers" <reefedjib@gmail.com> wrote:
> > > > > > 
> > > > > > > We are calling commitOffsets after every message consumption.
> > > > > > > It looks to be ~60% slower, with 29 partitions. If a single
> > > > > > > KafkaStream thread is from a connector, and there are 29
> > > > > > > partitions, then commitOffsets sends 29 offset updates, correct?
> > > > > > > Are these offset updates batched in one send to zookeeper?
> > > > > > > 
> > > > > > > thanks,
> > > > > > > rob
> > > > > > > 



Re: Update: RE: are commitOffsets botched to zookeeper?

Posted by Neha Narkhede <ne...@gmail.com>.
There is no particular need for storing the offsets in zookeeper. In fact
with Kafka 0.8, since partitions will be highly available, offsets could be
stored in Kafka topics. However, we haven't ironed out the design for this
yet.
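
One way to picture offsets living in a topic, purely as a sketch since the design is not settled: treat each commit as a record keyed by (group, topic, partition) appended to a log, and recover the current position by replaying the log and keeping the latest record per key. The Python below uses a plain list in place of a topic; append_commit, rebuild_offsets and compact are hypothetical helper names, not Kafka APIs.

```python
def append_commit(log, group, topic, partition, offset):
    """Append one commit record to the log (a list standing in for a topic)."""
    log.append(((group, topic, partition), offset))


def rebuild_offsets(log):
    """Replay the log from the start; the latest record for each key wins."""
    latest = {}
    for key, offset in log:
        latest[key] = offset
    return latest


def compact(log):
    """Keep only the newest record per key, preserving relative order."""
    last_index = {key: i for i, (key, _) in enumerate(log)}
    return [rec for i, rec in enumerate(log) if last_index[rec[0]] == i]
```

Appends are cheap sequential writes, which is the attraction compared with one zookeeper update per partition; the cost is that a reader has to replay (or periodically compact) the log to find the current offsets.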

Thanks,
Neha


On Fri, May 17, 2013 at 2:19 PM, Scott Clasen <sc...@heroku.com> wrote:

> afaik you dont 'have' to store the consumed offsets in zk right, this is
> only automatic with some of the clients?
>
> why not store them in a data store that can write at the rate that you
> require?
>
>
> On Fri, May 17, 2013 at 2:15 PM, Withers, Robert <Robert.Withers@dish.com>
> wrote:
>
> > Update from our OPS team, regarding zookeeper 3.4.x.  Given stability,
> > adoption of offset batching would be the only remaining bit of work to do.
> > Still, I totally understand the restraint for 0.8...
> >
> >
> > "As exercise in upgradability of zookeeper, I did a "out-of-the"box"
> > upgrade on Zookeeper. I downloaded a generic distribution of Apache
> > Zookeeper and used it for the upgrade.
> >
> > Kafka included version of Zookeeper 3.3.3.
> > Out of the box Apache Zookeeper 3.4.5 (which I upgraded to)
> >
> > Running, working great. I did *not* have to wipe out the zookeeper
> > databases. All data stayed intact.
> >
> > I got a new feature, which allows auto-purging of logs. This keeps OPS
> > maintenance to a minimum."
> >
> >
> > thanks,
> > rob
> >
> >
> > -----Original Message-----
> > From: Withers, Robert [mailto:Robert.Withers@dish.com]
> > Sent: Friday, May 17, 2013 7:38 AM
> > To: users@kafka.apache.org
> > Subject: RE: are commitOffsets botched to zookeeper?
> >
> > Fair enough, this is something to look forward to.  I appreciate the
> > restraint you show to stay out of troubled waters.  :)
> >
> > thanks,
> > rob
> >
> > ________________________________________
> > From: Neha Narkhede [neha.narkhede@gmail.com]
> > Sent: Friday, May 17, 2013 7:35 AM
> > To: users@kafka.apache.org
> > Subject: RE: are commitOffsets botched to zookeeper?
> >
> > Upgrading to a new zookeeper version is not an easy change. Also zookeeper
> > 3.3.4 is much more stable compared to 3.4.x. We think it is better not to
> > club 2 big changes together. So most likely this will be a post 08 item for
> > stability purposes.
> >
> > Thanks,
> > Neha
> > On May 17, 2013 6:31 AM, "Withers, Robert" <Ro...@dish.com>
> > wrote:
> >
> > > Awesome!  Thanks for the clarification.  I would like to offer my
> > > strong vote that this get tackled before a beta, to get it firmly into 0.8.
> > > Stabilize everything else to the existing use, but make offset updates
> > > batched.
> > >
> > > thanks,
> > > rob
> > > ________________________________________
> > > From: Neha Narkhede [neha.narkhede@gmail.com]
> > > Sent: Friday, May 17, 2013 7:17 AM
> > > To: users@kafka.apache.org
> > > Subject: RE: are commitOffsets botched to zookeeper?
> > >
> > > Sorry I wasn't clear. Zookeeper 3.4.x has this feature. As soon as 08
> > > is stable and released it will be worth looking into when we can use
> > > zookeeper 3.4.x.
> > >
> > > Thanks,
> > > Neha
> > > On May 16, 2013 10:32 PM, "Rob Withers" <re...@gmail.com> wrote:
> > >
> > > > Can a request be made to zookeeper for this feature?
> > > >
> > > > Thanks,
> > > > rob
> > > >
> > > > > -----Original Message-----
> > > > > From: Neha Narkhede [mailto:neha.narkhede@gmail.com]
> > > > > Sent: Thursday, May 16, 2013 9:53 PM
> > > > > To: users@kafka.apache.org
> > > > > Subject: Re: are commitOffsets botched to zookeeper?
> > > > >
> > > > > Currently Kafka depends on zookeeper 3.3.4 that doesn't have a batch write
> > > > > api. So if you commit after every message at a high rate, it will be slow and
> > > > > inefficient. Besides it will cause zookeeper performance to degrade.
> > > > >
> > > > > Thanks,
> > > > > Neha
> > > > > On May 16, 2013 6:54 PM, "Rob Withers" <re...@gmail.com> wrote:
> > > > >
> > > > > > We are calling commitOffsets after every message consumption.
> > > > > > It looks to be ~60% slower, with 29 partitions.  If a single
> > > > > > KafkaStream thread is from a connector, and there are 29
> > > > > > partitions, then commitOffsets sends 29 offset updates, correct?
> > > > > > Are these offset updates batched in one send to zookeeper?
> > > > > >
> > > > > > thanks,
> > > > > > rob
> > > >
> > > >
> > >
> >
>

Re: Update: RE: are commitOffsets botched to zookeeper?

Posted by Scott Clasen <sc...@heroku.com>.
AFAIK you don't 'have' to store the consumed offsets in zk, right? This is
only automatic with some of the clients?

why not store them in a data store that can write at the rate that you
require?
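
As a rough sketch of that idea, assuming a relational store is acceptable: the Python below keeps offsets in SQLite (standard library sqlite3) and writes all partitions of a commit in one transaction instead of one write per partition. The table layout and the open_store / commit_offsets / fetch_offset helpers are made up for illustration.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create the (hypothetical) offsets table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS offsets ("
        " consumer_group TEXT, topic TEXT, partition_id INTEGER,"
        " next_offset INTEGER,"
        " PRIMARY KEY (consumer_group, topic, partition_id))"
    )
    return conn


def commit_offsets(conn, group, offsets):
    """Write every partition's offset in a single transaction."""
    rows = [(group, t, p, o) for (t, p), o in offsets.items()]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO offsets"
            " (consumer_group, topic, partition_id, next_offset)"
            " VALUES (?, ?, ?, ?)",
            rows,
        )


def fetch_offset(conn, group, topic, partition):
    """Return the stored offset, or None if nothing was committed."""
    row = conn.execute(
        "SELECT next_offset FROM offsets"
        " WHERE consumer_group = ? AND topic = ? AND partition_id = ?",
        (group, topic, partition),
    ).fetchone()
    return row[0] if row else None
```

With 29 partitions, commit_offsets(conn, "my-group", {...}) issues one transaction for all 29 rows. Whether that is fast enough at per-message commit rates would still need measuring against the real store.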


On Fri, May 17, 2013 at 2:15 PM, Withers, Robert <Ro...@dish.com>wrote:

> Update from our OPS team, regarding zookeeper 3.4.x.  Given stability,
> adoption of offset batching would be the only remaining bit of work to do.
>  Still, I totally understand the restraint for 0.8...
>
>
> "As exercise in upgradability of zookeeper, I did a "out-of-the"box"
> upgrade on Zookeeper. I downloaded a generic distribution of Apache
> Zookeeper and used it for the upgrade.
>
> Kafka included version of Zookeeper 3.3.3.
> Out of the box Apache Zookeeper 3.4.5 (which I upgraded to)
>
> Running, working great. I did *not* have to wipe out the zookeeper
> databases. All data stayed intact.
>
> I got a new feature, which allows auto-purging of logs. This keeps OPS
> maintenance to a minimum."
>
>
> thanks,
> rob
>
>
> -----Original Message-----
> From: Withers, Robert [mailto:Robert.Withers@dish.com]
> Sent: Friday, May 17, 2013 7:38 AM
> To: users@kafka.apache.org
> Subject: RE: are commitOffsets botched to zookeeper?
>
> Fair enough, this is something to look forward to.  I appreciate the
> restraint you show to stay out of troubled waters.  :)
>
> thanks,
> rob
>
> ________________________________________
> From: Neha Narkhede [neha.narkhede@gmail.com]
> Sent: Friday, May 17, 2013 7:35 AM
> To: users@kafka.apache.org
> Subject: RE: are commitOffsets botched to zookeeper?
>
> Upgrading to a new zookeeper version is not an easy change. Also zookeeper
> 3.3.4 is much more stable compared to 3.4.x. We think it is better not to
> club 2 big changes together. So most likely this will be a post 08 item for
> stability purposes.
>
> Thanks,
> Neha
> On May 17, 2013 6:31 AM, "Withers, Robert" <Ro...@dish.com>
> wrote:
>
> > Awesome!  Thanks for the clarification.  I would like to offer my
> > strong vote that this get tackled before a beta, to get it firmly into 0.8.
> > Stabilize everything else to the existing use, but make offset updates
> > batched.
> >
> > thanks,
> > rob
> > ________________________________________
> > From: Neha Narkhede [neha.narkhede@gmail.com]
> > Sent: Friday, May 17, 2013 7:17 AM
> > To: users@kafka.apache.org
> > Subject: RE: are commitOffsets botched to zookeeper?
> >
> > Sorry I wasn't clear. Zookeeper 3.4.x has this feature. As soon as 08
> > is stable and released it will be worth looking into when we can use
> > zookeeper 3.4.x.
> >
> > Thanks,
> > Neha
> > On May 16, 2013 10:32 PM, "Rob Withers" <re...@gmail.com> wrote:
> >
> > > Can a request be made to zookeeper for this feature?
> > >
> > > Thanks,
> > > rob
> > >
> > > > -----Original Message-----
> > > > From: Neha Narkhede [mailto:neha.narkhede@gmail.com]
> > > > Sent: Thursday, May 16, 2013 9:53 PM
> > > > To: users@kafka.apache.org
> > > > Subject: Re: are commitOffsets botched to zookeeper?
> > > >
> > > > Currently Kafka depends on zookeeper 3.3.4 that doesn't have a batch write
> > > > api. So if you commit after every message at a high rate, it will be slow and
> > > > inefficient. Besides it will cause zookeeper performance to degrade.
> > > >
> > > > Thanks,
> > > > Neha
> > > > On May 16, 2013 6:54 PM, "Rob Withers" <re...@gmail.com> wrote:
> > > >
> > > > > We are calling commitOffsets after every message consumption.
> > > > > It looks to be ~60% slower, with 29 partitions.  If a single
> > > > > KafkaStream thread is from a connector, and there are 29
> > > > > partitions, then commitOffsets sends 29 offset updates, correct?
> > > > > Are these offset updates batched in one send to zookeeper?
> > > > >
> > > > > thanks,
> > > > > rob
> > >
> > >
> >
>

Update: RE: are commitOffsets botched to zookeeper?

Posted by "Withers, Robert" <Ro...@dish.com>.
Update from our OPS team, regarding zookeeper 3.4.x.  Given stability, adoption of offset batching would be the only remaining bit of work to do.  Still, I totally understand the restraint for 0.8...


"As an exercise in upgradability of zookeeper, I did an "out-of-the-box" upgrade on Zookeeper. I downloaded a generic distribution of Apache Zookeeper and used it for the upgrade.

Kafka-included version of Zookeeper: 3.3.3.
Out-of-the-box Apache Zookeeper: 3.4.5 (which I upgraded to).

Running, working great. I did *not* have to wipe out the zookeeper databases. All data stayed intact.

I got a new feature, which allows auto-purging of logs. This keeps OPS maintenance to a minimum."


thanks,
rob


-----Original Message-----
From: Withers, Robert [mailto:Robert.Withers@dish.com] 
Sent: Friday, May 17, 2013 7:38 AM
To: users@kafka.apache.org
Subject: RE: are commitOffsets botched to zookeeper?

Fair enough, this is something to look forward to.  I appreciate the restraint you show to stay out of troubled waters.  :)

thanks,
rob

________________________________________
From: Neha Narkhede [neha.narkhede@gmail.com]
Sent: Friday, May 17, 2013 7:35 AM
To: users@kafka.apache.org
Subject: RE: are commitOffsets botched to zookeeper?

Upgrading to a new zookeeper version is not an easy change. Also zookeeper
3.3.4 is much more stable compared to 3.4.x. We think it is better not to club 2 big changes together. So most likely this will be a post 08 item for stability purposes.

Thanks,
Neha
On May 17, 2013 6:31 AM, "Withers, Robert" <Ro...@dish.com> wrote:

> Awesome!  Thanks for the clarification.  I would like to offer my 
> strong vote that this get tackled before a beta, to get it firmly into 0.8.
> Stabilize everything else to the existing use, but make offset updates 
> batched.
>
> thanks,
> rob
> ________________________________________
> From: Neha Narkhede [neha.narkhede@gmail.com]
> Sent: Friday, May 17, 2013 7:17 AM
> To: users@kafka.apache.org
> Subject: RE: are commitOffsets botched to zookeeper?
>
> Sorry I wasn't clear. Zookeeper 3.4.x has this feature. As soon as 08 
> is stable and released it will be worth looking into when we can use 
> zookeeper 3.4.x.
>
> Thanks,
> Neha
> On May 16, 2013 10:32 PM, "Rob Withers" <re...@gmail.com> wrote:
>
> > Can a request be made to zookeeper for this feature?
> >
> > Thanks,
> > rob
> >
> > > -----Original Message-----
> > > From: Neha Narkhede [mailto:neha.narkhede@gmail.com]
> > > Sent: Thursday, May 16, 2013 9:53 PM
> > > To: users@kafka.apache.org
> > > Subject: Re: are commitOffsets botched to zookeeper?
> > >
> > > Currently Kafka depends on zookeeper 3.3.4 that doesn't have a batch write
> > > api. So if you commit after every message at a high rate, it will be slow and
> > > inefficient. Besides it will cause zookeeper performance to degrade.
> > >
> > > Thanks,
> > > Neha
> > > On May 16, 2013 6:54 PM, "Rob Withers" <re...@gmail.com> wrote:
> > >
> > > > We are calling commitOffsets after every message consumption.  
> > > > It looks to be ~60% slower, with 29 partitions.  If a single 
> > > > KafkaStream thread is from a connector, and there are 29 
> > > > partitions, then commitOffsets sends 29 offset updates, correct?  
> > > > Are these offset updates batched in one send to zookeeper?
> > > >
> > > > thanks,
> > > > rob
> >
> >
>

RE: are commitOffsets botched to zookeeper?

Posted by "Withers, Robert" <Ro...@dish.com>.
Fair enough, this is something to look forward to.  I appreciate the restraint you show to stay out of troubled waters.  :)

thanks,
rob

________________________________________
From: Neha Narkhede [neha.narkhede@gmail.com]
Sent: Friday, May 17, 2013 7:35 AM
To: users@kafka.apache.org
Subject: RE: are commitOffsets botched to zookeeper?

Upgrading to a new zookeeper version is not an easy change. Also zookeeper
3.3.4 is much more stable compared to 3.4.x. We think it is better not to
club 2 big changes together. So most likely this will be a post 08 item for
stability purposes.

Thanks,
Neha
On May 17, 2013 6:31 AM, "Withers, Robert" <Ro...@dish.com> wrote:

> Awesome!  Thanks for the clarification.  I would like to offer my strong
> vote that this get tackled before a beta, to get it firmly into 0.8.
> Stabilize everything else to the existing use, but make offset updates
> batched.
>
> thanks,
> rob
> ________________________________________
> From: Neha Narkhede [neha.narkhede@gmail.com]
> Sent: Friday, May 17, 2013 7:17 AM
> To: users@kafka.apache.org
> Subject: RE: are commitOffsets botched to zookeeper?
>
> Sorry I wasn't clear. Zookeeper 3.4.x has this feature. As soon as 08 is
> stable and released it will be worth looking into when we can use zookeeper
> 3.4.x.
>
> Thanks,
> Neha
> On May 16, 2013 10:32 PM, "Rob Withers" <re...@gmail.com> wrote:
>
> > Can a request be made to zookeeper for this feature?
> >
> > Thanks,
> > rob
> >
> > > -----Original Message-----
> > > From: Neha Narkhede [mailto:neha.narkhede@gmail.com]
> > > Sent: Thursday, May 16, 2013 9:53 PM
> > > To: users@kafka.apache.org
> > > Subject: Re: are commitOffsets botched to zookeeper?
> > >
> > > Currently Kafka depends on zookeeper 3.3.4 that doesn't have a batch write
> > > api. So if you commit after every message at a high rate, it will be slow and
> > > inefficient. Besides it will cause zookeeper performance to degrade.
> > >
> > > Thanks,
> > > Neha
> > > On May 16, 2013 6:54 PM, "Rob Withers" <re...@gmail.com> wrote:
> > >
> > > > We are calling commitOffsets after every message consumption.  It
> > > > looks to be ~60% slower, with 29 partitions.  If a single KafkaStream
> > > > thread is from a connector, and there are 29 partitions, then
> > > > commitOffsets sends 29 offset updates, correct?  Are these offset
> > > > updates batched in one send to zookeeper?
> > > >
> > > > thanks,
> > > > rob
> >
> >
>

RE: are commitOffsets botched to zookeeper?

Posted by Neha Narkhede <ne...@gmail.com>.
Upgrading to a new ZooKeeper version is not an easy change. Also, ZooKeeper
3.3.4 is much more stable than 3.4.x. We think it is better not to combine
two big changes, so this will most likely be a post-0.8 item, for stability
purposes.

Thanks,
Neha
On May 17, 2013 6:31 AM, "Withers, Robert" <Ro...@dish.com> wrote:

> Awesome!  Thanks for the clarification.  I would like to offer my strong
> vote that this get tackled before a beta, to get it firmly into 0.8.
> Stabilize everything else to the existing use, but make offset updates
> batched.
>
> thanks,
> rob
> ________________________________________
> From: Neha Narkhede [neha.narkhede@gmail.com]
> Sent: Friday, May 17, 2013 7:17 AM
> To: users@kafka.apache.org
> Subject: RE: are commitOffsets botched to zookeeper?
>
> Sorry I wasn't clear. Zookeeper 3.4.x has this feature. As soon as 08 is
> stable and released it will be worth looking into when we can use zookeeper
> 3.4.x.
>
> Thanks,
> Neha
> On May 16, 2013 10:32 PM, "Rob Withers" <re...@gmail.com> wrote:
>
> > Can a request be made to zookeeper for this feature?
> >
> > Thanks,
> > rob
> >
> > > -----Original Message-----
> > > From: Neha Narkhede [mailto:neha.narkhede@gmail.com]
> > > Sent: Thursday, May 16, 2013 9:53 PM
> > > To: users@kafka.apache.org
> > > Subject: Re: are commitOffsets botched to zookeeper?
> > >
> > > > Currently Kafka depends on zookeeper 3.3.4 that doesn't have a batch
> > > > write api. So if you commit after every message at a high rate, it will
> > > > be slow and inefficient. Besides it will cause zookeeper performance to
> > > > degrade.
> > >
> > > Thanks,
> > > Neha
> > > On May 16, 2013 6:54 PM, "Rob Withers" <re...@gmail.com> wrote:
> > >
> > > > We are calling commitOffsets after every message consumption.  It
> > > > looks to be ~60% slower, with 29 partitions.  If a single KafkaStream
> > > > thread is from a connector, and there are 29 partitions, then
> > > > commitOffsets sends 29 offset updates, correct?  Are these offset
> > > > updates batched in one send to zookeeper?
> > > >
> > > > thanks,
> > > > rob
> >
> >
>

RE: are commitOffsets botched to zookeeper?

Posted by "Withers, Robert" <Ro...@dish.com>.
Awesome! Thanks for the clarification. I would like to offer my strong vote that this gets tackled before a beta, to get it firmly into 0.8. Stabilize everything else as it is used today, but make offset updates batched.

thanks,
rob
________________________________________
From: Neha Narkhede [neha.narkhede@gmail.com]
Sent: Friday, May 17, 2013 7:17 AM
To: users@kafka.apache.org
Subject: RE: are commitOffsets botched to zookeeper?

Sorry I wasn't clear. Zookeeper 3.4.x has this feature. As soon as 08 is
stable and released it will be worth looking into when we can use zookeeper
3.4.x.

Thanks,
Neha
On May 16, 2013 10:32 PM, "Rob Withers" <re...@gmail.com> wrote:

> Can a request be made to zookeeper for this feature?
>
> Thanks,
> rob
>
> > -----Original Message-----
> > From: Neha Narkhede [mailto:neha.narkhede@gmail.com]
> > Sent: Thursday, May 16, 2013 9:53 PM
> > To: users@kafka.apache.org
> > Subject: Re: are commitOffsets botched to zookeeper?
> >
> > Currently Kafka depends on zookeeper 3.3.4 that doesn't have a batch
> > write api. So if you commit after every message at a high rate, it will be
> > slow and inefficient. Besides it will cause zookeeper performance to degrade.
> >
> > Thanks,
> > Neha
> > On May 16, 2013 6:54 PM, "Rob Withers" <re...@gmail.com> wrote:
> >
> > > We are calling commitOffsets after every message consumption.  It
> > > looks to be ~60% slower, with 29 partitions.  If a single KafkaStream
> > > thread is from a connector, and there are 29 partitions, then
> > > commitOffsets sends 29 offset updates, correct?  Are these offset
> > > updates batched in one send to zookeeper?
> > >
> > > thanks,
> > > rob
>
>

RE: are commitOffsets botched to zookeeper?

Posted by Neha Narkhede <ne...@gmail.com>.
Sorry, I wasn't clear. ZooKeeper 3.4.x already has this feature. Once 0.8 is
stable and released, it will be worth looking into when we can move to
ZooKeeper 3.4.x.

Thanks,
Neha
On May 16, 2013 10:32 PM, "Rob Withers" <re...@gmail.com> wrote:

> Can a request be made to zookeeper for this feature?
>
> Thanks,
> rob
>
> > -----Original Message-----
> > From: Neha Narkhede [mailto:neha.narkhede@gmail.com]
> > Sent: Thursday, May 16, 2013 9:53 PM
> > To: users@kafka.apache.org
> > Subject: Re: are commitOffsets botched to zookeeper?
> >
> > Currently Kafka depends on zookeeper 3.3.4 that doesn't have a batch
> > write api. So if you commit after every message at a high rate, it will be
> > slow and inefficient. Besides it will cause zookeeper performance to degrade.
> >
> > Thanks,
> > Neha
> > On May 16, 2013 6:54 PM, "Rob Withers" <re...@gmail.com> wrote:
> >
> > > We are calling commitOffsets after every message consumption.  It
> > > looks to be ~60% slower, with 29 partitions.  If a single KafkaStream
> > > thread is from a connector, and there are 29 partitions, then
> > > commitOffsets sends 29 offset updates, correct?  Are these offset
> > > updates batched in one send to zookeeper?
> > >
> > > thanks,
> > > rob
>
>

RE: are commitOffsets botched to zookeeper?

Posted by Rob Withers <re...@gmail.com>.
Can a request be made to zookeeper for this feature?

Thanks,
rob

> -----Original Message-----
> From: Neha Narkhede [mailto:neha.narkhede@gmail.com]
> Sent: Thursday, May 16, 2013 9:53 PM
> To: users@kafka.apache.org
> Subject: Re: are commitOffsets botched to zookeeper?
> 
> Currently Kafka depends on zookeeper 3.3.4 that doesn't have a batch write
> api. So if you commit after every message at a high rate, it will be slow
> and inefficient. Besides it will cause zookeeper performance to degrade.
> 
> Thanks,
> Neha
> On May 16, 2013 6:54 PM, "Rob Withers" <re...@gmail.com> wrote:
> 
> > We are calling commitOffsets after every message consumption.  It
> > looks to be ~60% slower, with 29 partitions.  If a single KafkaStream
> > thread is from a connector, and there are 29 partitions, then
> > commitOffsets sends 29 offset updates, correct?  Are these offset
> > updates batched in one send to zookeeper?
> >
> > thanks,
> > rob


Re: are commitOffsets botched to zookeeper?

Posted by Neha Narkhede <ne...@gmail.com>.
Currently Kafka depends on ZooKeeper 3.3.4, which doesn't have a batch write
API. So if you commit after every message at a high rate, it will be slow
and inefficient. It will also cause ZooKeeper performance to degrade.
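
One way to act on this, sketched here in Python rather than taken from Kafka itself, is to commit once per batch of messages or per time interval instead of after every message. Everything named below is hypothetical: `BatchedCommitter`, `every_n`, and `max_interval_s` are illustrative, and `commit_fn` stands in for whatever call performs the real commit (for example the connector's commitOffsets).

```python
import time


class BatchedCommitter:
    """Invoke commit_fn once per `every_n` messages or `max_interval_s` seconds.

    Hypothetical helper for illustration; it is not part of Kafka.
    """

    def __init__(self, commit_fn, every_n=1000, max_interval_s=5.0,
                 clock=time.monotonic):
        self._commit_fn = commit_fn          # stand-in for the real commit call
        self._every_n = every_n              # commit after this many messages
        self._max_interval_s = max_interval_s  # or after this much time
        self._clock = clock                  # injectable so it can be tested
        self._pending = 0
        self._last_commit = clock()

    def message_processed(self):
        """Call once after each message has been fully processed."""
        self._pending += 1
        overdue = self._clock() - self._last_commit >= self._max_interval_s
        if self._pending >= self._every_n or overdue:
            self.flush()

    def flush(self):
        """Commit now if anything is pending; also call this on shutdown."""
        if self._pending:
            self._commit_fn()
            self._pending = 0
            self._last_commit = self._clock()
```

The trade-off is that after a crash, any messages processed since the last commit are delivered again, so processing has to tolerate duplicates.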

Thanks,
Neha
On May 16, 2013 6:54 PM, "Rob Withers" <re...@gmail.com> wrote:

> We are calling commitOffsets after every message consumption.  It looks to
> be ~60% slower, with 29 partitions.  If a single KafkaStream thread is from
> a connector, and there are 29 partitions, then commitOffsets sends 29
> offset updates, correct?  Are these offset updates batched in one send to
> zookeeper?
>
> thanks,
> rob
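
For the question quoted above, a rough count of ZooKeeper writes shows why committing per message is costly. This is a back-of-the-envelope sketch that assumes each commit issues one write per partition (as the reply implies when it says there is no batch write API); the message count and commit interval are made-up numbers, and `zk_offset_writes` is a hypothetical helper.

```python
def zk_offset_writes(messages, partitions, commit_every):
    """Estimated ZooKeeper writes, assuming one write per partition per commit."""
    commits = -(-messages // commit_every)  # ceiling division
    return commits * partitions


# Illustrative figures only: 1,000,000 messages consumed across 29 partitions.
per_message = zk_offset_writes(1_000_000, 29, commit_every=1)
batched = zk_offset_writes(1_000_000, 29, commit_every=1000)

print(per_message)  # 29000000
print(batched)      # 29000
```

Under that assumption, committing every 1000 messages cuts the number of writes by a factor of 1000, even without a batch API on the ZooKeeper side.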