Posted to users@kafka.apache.org by Stevo Slavić <ss...@gmail.com> on 2015/04/22 08:36:14 UTC

Offset management: client vs broker side responsibility

Hello Apache Kafka community,

Please correct me if I'm wrong, but AFAIK, currently (Kafka 0.8.2.x) offset
management is mainly a client/consumer-side responsibility.

Wouldn't it be better if it were a broker-side-only responsibility?

E.g. now, if one wants to use custom offset management, none of the Kafka
monitoring tools can see the offsets - they would need to use the same
custom client implementation, which is practically not possible.

Kind regards,
Stevo Slavic.

Re: Offset management: client vs broker side responsibility

Posted by Stevo Slavić <ss...@gmail.com>.
I found out that there is a standard API for retrieving and committing offsets
(see
https://cwiki.apache.org/confluence/display/KAFKA/Committing+and+fetching+consumer+offsets+in+Kafka
)

The problem is that the server/broker side is not extensible (see
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaApis.scala#L142
) - i.e. there is no API one can implement and deploy/configure together
with the Kafka binary to handle unsupported, or override the handling of
already supported, offsetCommitRequest.versionId/offsetFetchRequest.versionId
values.

This does not prevent one from implementing custom offset management on the
client side (instead of using the standard API to commit and retrieve offsets,
one can talk directly to a custom offset store), but then the problem arises
that no commercial or FOSS Kafka monitoring solution supports it out of the box.
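
To make the idea of an extensible commit/fetch layer concrete, here is a
minimal sketch of what a pluggable offset store could look like. It is purely
illustrative: `OffsetStore`, `InMemoryOffsetStore`, and their method names are
hypothetical and do not exist in Kafka 0.8.2, and Python is used only for
brevity.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional, Tuple

# (topic, partition) identifies one partition of a topic.
TopicPartition = Tuple[str, int]


class OffsetStore(ABC):
    """Hypothetical plug-in interface; nothing like this exists in Kafka 0.8.2."""

    @abstractmethod
    def commit(self, group: str, tp: TopicPartition, offset: int) -> None:
        """Record the offset a consumer group has reached on a partition."""

    @abstractmethod
    def fetch(self, group: str, tp: TopicPartition) -> Optional[int]:
        """Return the last committed offset, or None if nothing was committed."""


class InMemoryOffsetStore(OffsetStore):
    """Toy implementation standing in for ZooKeeper, a Kafka topic, or a database."""

    def __init__(self) -> None:
        self._offsets: Dict[Tuple[str, TopicPartition], int] = {}

    def commit(self, group: str, tp: TopicPartition, offset: int) -> None:
        if offset < 0:
            raise ValueError("offset must be non-negative")
        self._offsets[(group, tp)] = offset

    def fetch(self, group: str, tp: TopicPartition) -> Optional[int]:
        return self._offsets.get((group, tp))


if __name__ == "__main__":
    store: OffsetStore = InMemoryOffsetStore()
    store.commit("billing", ("orders", 0), 42)
    print(store.fetch("billing", ("orders", 0)))  # offset committed above
    print(store.fetch("billing", ("orders", 1)))  # nothing committed yet
```

A monitoring tool written against such an interface would only need `fetch`,
regardless of where the offsets are actually kept.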

I know I would, but my question to the Apache Kafka community is: would you
like the Kafka broker's offset commit/fetch handling to be extensible, and
what do the committers think about this?

Kind regards,
Stevo Slavic.


On Tue, Jun 2, 2015 at 7:11 PM, Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:

> Hi,
>
> I haven't followed the changes to offset tracking closely, other than that
> storing them in ZK is not the only option any more.
> I think what Stevo is asking about/suggesting is that there be a
> single API from which offset information can be retrieved (e.g. by
> monitoring tools), so that monitoring tools work regardless of where one
> chose to store offsets.
> I know we'd love to have this for SPM's Kafka monitoring and can tell you
> that adding support for N different APIs for N different offset storage
> systems would be hard/time-consuming/expensive.
> But maybe this single API already exists?
>
> Thanks,
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/

Re: Offset management: client vs broker side responsibility

Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi,

I haven't followed the changes to offset tracking closely, other than that
storing them in ZK is not the only option any more.
I think what Stevo is asking about/suggesting is that there be a
single API from which offset information can be retrieved (e.g. by
monitoring tools), so that monitoring tools work regardless of where one
chose to store offsets.
I know we'd love to have this for SPM's Kafka monitoring and can tell you
that adding support for N different APIs for N different offset storage
systems would be hard/time-consuming/expensive.
But maybe this single API already exists?
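
Whatever the single API ends up being, the quantity most monitoring tools
derive from it is consumer lag: the log end offset of each partition minus the
group's committed offset. A minimal sketch of that calculation follows,
assuming both sets of offsets have already been fetched from somewhere; the
function name and the dict-based inputs are illustrative and not part of any
Kafka API.

```python
from typing import Dict, Optional, Tuple

# (topic, partition) identifies one partition of a topic.
TopicPartition = Tuple[str, int]


def consumer_lag(
    log_end_offsets: Dict[TopicPartition, int],
    committed_offsets: Dict[TopicPartition, int],
) -> Dict[TopicPartition, Optional[int]]:
    """Per-partition lag = log end offset - committed offset.

    A partition without a committed offset maps to None, because its lag
    is unknown rather than zero. Negative differences (a commit that is
    ahead of a stale log end offset reading) are clamped to 0.
    """
    lag: Dict[TopicPartition, Optional[int]] = {}
    for tp, end in log_end_offsets.items():
        committed = committed_offsets.get(tp)
        lag[tp] = None if committed is None else max(end - committed, 0)
    return lag


if __name__ == "__main__":
    ends = {("orders", 0): 1200, ("orders", 1): 800}
    committed = {("orders", 0): 1150}
    for tp, value in sorted(consumer_lag(ends, committed).items()):
        print(tp, value)
```

The calculation does not care where the committed offsets come from, which is
why a single retrieval API would be enough for tools like this.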

Thanks,
Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/


On Mon, Jun 1, 2015 at 4:41 PM, Jason Rosenberg <jb...@squareup.com> wrote:

> Stevo,
>
> Both of the main solutions used by the high-level consumer are standardized
> and supported directly by the kafka client libraries (e.g. maintaining
> offsets in zookeeper or in kafka itself).  And for the zk case, there is
> the consumer offset checker (which is good for monitoring).  Consumer
> offset checker still needs to be extended for offsets stored in the Kafka
> __consumer_offsets topic though.
>
> Anyway, I'm not sure I understand your question: do you want something for
> better monitoring of all possible clients (some of which might choose to
> manage offsets in their own way)?
>
> It's just not part of the kafka design to directly track individual
> consumers.
>
> Jason

Re: Offset management: client vs broker side responsibility

Posted by Jason Rosenberg <jb...@squareup.com>.
Stevo,

Both of the main solutions used by the high-level consumer are standardized
and supported directly by the kafka client libraries (e.g. maintaining
offsets in zookeeper or in kafka itself).  And for the zk case, there is
the consumer offset checker (which is good for monitoring).  Consumer
offset checker still needs to be extended for offsets stored in the Kafka
__consumer_offsets topic though.

Anyway, I'm not sure I understand your question: do you want something for
better monitoring of all possible clients (some of which might choose to
manage offsets in their own way)?

It's just not part of the kafka design to directly track individual
consumers.

Jason

On Wed, May 27, 2015 at 7:42 AM, Shady Xu <sh...@gmail.com> wrote:

> I guess adding a new component will increase the complexity of the system
> structure. And if the new component consists of one or a few nodes, it may
> become the bottleneck of the whole system; if it consists of many nodes,
> it will make the system even more complex.
>
> Although every solution has its downsides, I think the current one is
> decent.

Re: Offset management: client vs broker side responsibility

Posted by Shady Xu <sh...@gmail.com>.
I guess adding a new component will increase the complexity of the system
structure. And if the new component consists of one or a few nodes, it may
become the bottleneck of the whole system; if it consists of many nodes,
it will make the system even more complex.

Although every solution has its downsides, I think the current one is
decent.

2015-05-27 17:10 GMT+08:00 Stevo Slavić <ss...@gmail.com>:

> It could be a separate server component; it does not have to be
> monolithic/coupled with the broker.
> Such a solution would have benefits: a single API and pluggable
> implementations.

Re: Offset management: client vs broker side responsibility

Posted by Stevo Slavić <ss...@gmail.com>.
It could be a separate server component; it does not have to be
monolithic/coupled with the broker.
Such a solution would have benefits: a single API and pluggable implementations.

On Wed, May 27, 2015 at 8:57 AM, Shady Xu <sh...@gmail.com> wrote:

> Having the brokers store and manage offsets would put high pressure on
> them, which would affect the performance of the cluster.
>
> You can use the advanced consumer APIs; then you can get the offsets either
> from ZooKeeper or from the __consumer_offsets topic. On the other hand, if
> you use the simple consumer APIs, you mean to manage offsets yourself, so
> you should monitor them yourself, simple and plain, right?

Re: Offset management: client vs broker side responsibility

Posted by Shady Xu <sh...@gmail.com>.
Having the brokers store and manage offsets would put high pressure on
them, which would affect the performance of the cluster.

You can use the advanced consumer APIs; then you can get the offsets either
from ZooKeeper or from the __consumer_offsets topic. On the other hand, if you
use the simple consumer APIs, you mean to manage offsets yourself, so you
should monitor them yourself, simple and plain, right?
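
For reference, a rough sketch of where those two stores keep a group's
offsets. The ZooKeeper path layout and the group-to-partition mapping below
follow what the 0.8.x documentation and broker code appear to do and should be
checked against the version in use. The function names are made up for this
example, and 50 is assumed as the default value of
`offsets.topic.num.partitions`.

```python
def zk_offset_path(group: str, topic: str, partition: int) -> str:
    """ZooKeeper node where the high-level consumer stores one offset
    (layout assumed from the 0.8.x documentation)."""
    return f"/consumers/{group}/offsets/{topic}/{partition}"


def java_string_hash(s: str) -> int:
    """Reproduce Java's String.hashCode() as a signed 32-bit integer.

    Java hashes UTF-16 code units, so the string is encoded first.
    """
    data = s.encode("utf-16-be")
    h = 0
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h


def offsets_topic_partition(group: str, num_partitions: int = 50) -> int:
    """Partition of __consumer_offsets that holds a group's commits.

    Assumption: the broker masks off the sign bit of the group id's hash
    and takes it modulo the partition count of the offsets topic.
    """
    return (java_string_hash(group) & 0x7FFFFFFF) % num_partitions


if __name__ == "__main__":
    print(zk_offset_path("billing", "orders", 0))
    print(offsets_topic_partition("billing"))
```

A monitoring tool has to know both locations today, which is the duplication
the thread is about.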

2015-04-22 14:36 GMT+08:00 Stevo Slavić <ss...@gmail.com>:

> Hello Apache Kafka community,
>
> Please correct me if I'm wrong, but AFAIK, currently (Kafka 0.8.2.x) offset
> management is mainly a client/consumer-side responsibility.
>
> Wouldn't it be better if it were a broker-side-only responsibility?
>
> E.g. now, if one wants to use custom offset management, none of the Kafka
> monitoring tools can see the offsets - they would need to use the same
> custom client implementation, which is practically not possible.
>
> Kind regards,
> Stevo Slavic.
>