Posted to dev@kafka.apache.org by Cliff Rhyne <cr...@signal.co> on 2016/05/04 20:43:25 UTC

list of challenges encountered using 0.9.0.1

While at the Kafka Summit I was asked to write up a list of challenges and
confusions my team encountered using Kafka.  We are using 0.9.0.1 and use
the new Java KafkaConsumer.


   1. The new Java KafkaConsumer doesn’t have a method to return the high
   watermark (the last offset in the topic/partition's log).
   2. Can’t connect using the Java client to just check status on topics
   (committed offset for different consumer groups, high watermark, etc)
   3. kafka-consumer-groups.sh requires a member of the consumer group to
   be connected and consuming or offset values won't be displayed (artificial
   prerequisite)
   4. Default config for tracking committed offsets is poor (commits should
   be effectively permanent and shouldn’t age out after 24 hours).
   5. It should not be possible to set the offset retention time
   (offsets.retention.minutes) lower than the log retention time.
   6. Consumer group rebalances affect all consumers across all topics
   within the consumer group including topics without a new subscriber.
   7. Changing the broker config requires a one-at-a-time rolling restart of
   the whole cluster; a "service kafka reload" would be nice.
   8. Console consumer still uses “old” consumer-style configuration
   options (--zookeeper). This is a bit strange for anyone who has started
   using Kafka with version 0.9 or later, since the CLI options don’t
   correspond to what you expect the consumer to need.
   9. Heartbeat only on poll() causes problems when we have gaps in
   consuming before committing (such as when we publish files and don’t want
   to commit until the publish is complete).  Supposedly position() will
   also perform a heartbeat in addition to poll() (I haven’t verified this,
   but heard it at the Kafka Summit), though relying on that adds extra
   complexity to the application.
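
For context on items 4 and 5, here is a sketch of the relevant
server.properties settings as I understand the 0.9 defaults (worth
double-checking against your broker's docs; the values shown for
offsets.retention.minutes are illustrative, not a recommendation):

```properties
# How long committed offsets are retained after the group goes idle.
# The 0.9 default is 1440 minutes (24 hours), which is what bites us.
# Raising it (here: 7 days) keeps commits around at least as long as the log.
offsets.retention.minutes=10080

# How long log segments are retained (default 168 hours = 7 days).
# Nothing stops you from setting offset retention below this, hence item 5.
log.retention.hours=168
```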


Thanks for listening,
Cliff Rhyne

-- 
Cliff Rhyne
Software Engineering Manager
e: crhyne@signal.co
signal.co
________________________

Cut Through the Noise

This e-mail and any files transmitted with it are for the sole use of the
intended recipient(s) and may contain confidential and privileged
information. Any unauthorized use of this email is strictly prohibited.
©2016 Signal. All rights reserved.

Re: list of challenges encountered using 0.9.0.1

Posted by Cliff Rhyne <cr...@signal.co>.
Thanks for the updates, Jason.  Let me know if you have questions about the
use cases that bring up any of these scenarios.

On Thu, May 5, 2016 at 7:34 PM, Jason Gustafson <ja...@confluent.io> wrote:





Re: list of challenges encountered using 0.9.0.1

Posted by Jason Gustafson <ja...@confluent.io>.
Hey Cliff,

Thanks for the feedback. A few select comments:

> 1. The new Java KafkaConsumer doesn’t have a method to return the high
>    watermark (the last offset in the topic/partition's log).


This is currently exposed in fetch responses, so we could add it to the
ConsumerRecords object. In general we've so far avoided exposing offset
APIs only because we haven't had time to think them through. My feeling is
that the rest of the API is becoming stable, so this will likely become a
major focus in the next release.

> 2. Can’t connect using the Java client to just check status on topics
>    (committed offset for different consumer groups, high watermark, etc)


This is definitely a gap. I think the idea in KIP-4 (which I'm really
hoping will be completed in the next release) is to expose an AdminClient
in kafka-clients which contains this kind of access.

> 3. kafka-consumer-groups.sh requires a member of the consumer group to
>    be connected and consuming or offset values won't be displayed (artificial
>    prerequisite)


Yes! I have felt this annoyance as well. I've been working on a patch for
this problem, but I'm not sure if it can get into 0.10. The problem is
basically that there is an inconsistency between how long we retain offsets
and group metadata (such as members and assignments). Because of this, it's
difficult to tell whether the state of the group has actually been removed.
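
For anyone hitting this, the invocation in question looks roughly like the
following (flags as I recall them from 0.9; requires a reachable broker, and
the group name here is just a placeholder):

```shell
# Describe a group's committed offsets and lag via the new consumer.
# With no active member in the group, the tool currently won't show offsets.
bin/kafka-consumer-groups.sh --new-consumer \
  --bootstrap-server localhost:9092 \
  --describe --group my-consumer-group
```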

> 6. Consumer group rebalances affect all consumers across all topics
>    within the consumer group including topics without a new subscriber.


We've discussed a few options for partial rebalancing, but it's tough to
come up with a proposal which doesn't add a lot of complication to the
group management protocol. I'd love to see a solution for this as well, but
I think we need a simple approach to get broad support. There is an active
KIP for sticky partition assignment which might help somewhat with this
problem. The basic idea would be to optimistically continue processing
while a rebalance is taking place under the expectation that most of the
partitions would continue to be owned by the consumer after the rebalance
completes. We need to work through the usage though to see if this makes
sense.

> 9. Heartbeat only on poll() causes problems when we have gaps in
>    consuming before committing (such as when we publish files and don’t want
>    to commit until the publish is complete).  Supposedly position() will
>    perform a heartbeat too in addition to poll() (I haven’t verified this but
>    heard it at the Kafka Summit), but it does add extra complexity to the
>    application.


I think in general we've underestimated the number of use cases where it's
difficult to put a bound on processing time. Although max.poll.records
solves part of the problem (by making processing time more predictable),
it's still difficult generally to figure out what this bound is. It's
particularly a big problem for frameworks (such as Streams and Connect)
where we don't directly control the processing time. I consider it very
likely in the next iteration that we will either 1) add a background thread
to the consumer for asynchronous heartbeating or 2) expose an API to make
it easy for users to do the same thing.
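
The general shape of option (1)/(2) can be sketched in plain Java, with no
Kafka dependency: a scheduled task keeps heartbeating on a fixed interval
while the main thread does long processing between polls. sendHeartbeat()
here is a hypothetical placeholder for whatever the consumer would expose,
not a real client API:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class HeartbeatSketch {
    // Counts heartbeats so the effect is observable in this stand-alone sketch.
    public static final AtomicInteger heartbeats = new AtomicInteger();

    // Hypothetical placeholder for a heartbeat call the consumer might expose.
    static void sendHeartbeat() {
        heartbeats.incrementAndGet();
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService hb = Executors.newSingleThreadScheduledExecutor();
        // Heartbeat every 100 ms, independent of how long processing takes.
        hb.scheduleAtFixedRate(HeartbeatSketch::sendHeartbeat,
                0, 100, TimeUnit.MILLISECONDS);

        Thread.sleep(550); // simulate slow record processing between poll() calls

        hb.shutdown();
        hb.awaitTermination(1, TimeUnit.SECONDS);
        System.out.println("heartbeats sent during processing: " + heartbeats.get());
    }
}
```

The same pattern works whether the client grows its own background thread or
hands users an API to drive from a thread like this one; either way the
session stays alive while the processing gap is open.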


Thanks,
Jason
