Posted to users@kafka.apache.org by Jonathan Davey <jo...@gmail.com> on 2014/01/02 22:15:02 UTC

Measuring remaining data

I need to provide a report that shows the total amount of unread data
available to a consumer group for a particular set of topics.

Right now I do this by adding custom offset tracking logic to the
producers and consumers. This works but I think I can do better with
one of the following ideas:

* Running a process alongside each broker that reads the data from the
log directory and ZooKeeper
* Extending Kafka to do the same as above, but exposing it as a request
through the protocol

If I wanted to do these things, where would be a good place to start looking?
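
The first idea (a process next to each broker that reads the log dir) can
be sketched roughly as below. This is only an illustration under stated
assumptions, not an existing Kafka tool: it assumes the 0.8 on-disk layout
where each partition directory holds segment files named after their base
offset (for example 00000000000000000500.log), and it assumes the caller
has already obtained the group's committed offset (for example from
ZooKeeper). The function name estimate_unread_bytes is hypothetical. The
result is an upper bound, because the segment containing the consumer's
position is counted in full.

```python
import os
import tempfile


def estimate_unread_bytes(partition_dir, consumer_offset):
    """Upper-bound estimate of the bytes a consumer has not yet read.

    partition_dir   -- directory holding one partition's segment files
    consumer_offset -- logical offset of the next message the group will read
    """
    # Collect (base_offset, size_in_bytes) for every "<base offset>.log" file.
    segments = sorted(
        (int(name[:-4]), os.path.getsize(os.path.join(partition_dir, name)))
        for name in os.listdir(partition_dir)
        if name.endswith(".log") and name[:-4].isdigit()
    )
    total = 0
    for i, (base, size) in enumerate(segments):
        next_base = segments[i + 1][0] if i + 1 < len(segments) else None
        # A segment is fully consumed only when the following segment
        # starts at or before the consumer's position.
        if next_base is not None and next_base <= consumer_offset:
            continue
        total += size
    return total


if __name__ == "__main__":
    # Build a fake partition directory with three segments for demonstration.
    with tempfile.TemporaryDirectory() as d:
        for base, size in [(0, 100), (500, 200), (1000, 50)]:
            with open(os.path.join(d, "%020d.log" % base), "wb") as f:
                f.write(b"x" * size)
        print(estimate_unread_bytes(d, 600))
```

Summing this over every partition directory of the topics in question
would give a per-group figure, at the cost of over-counting by at most one
segment per partition.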

Re: Measuring remaining data

Posted by Guozhang Wang <wa...@gmail.com>.
Since 0.8 we have migrated from physical (byte) offsets to logical
(per-message) offsets, which is why the difference between two offsets no
longer gives a size in bytes. KAFKA-1197 has been filed for part of this
requirement; does it satisfy what you need?

As for KAFKA-656, it is not abandoned, but we are still trying to figure out
the correct way of implementing it.

Guozhang
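
To make the physical versus logical distinction concrete, here is a small
illustration. It is a toy model, not Kafka code: the message sizes are
made up, and the helper names are hypothetical. With physical offsets the
difference between the log end and the consumer position is a byte count;
with logical offsets the same subtraction only yields a message count.

```python
def physical_offsets(sizes):
    """Byte position at which each message starts (pre-0.8 style)."""
    offsets, position = [], 0
    for size in sizes:
        offsets.append(position)
        position += size
    return offsets


def logical_offsets(sizes):
    """Sequence number of each message (0.8 style)."""
    return list(range(len(sizes)))


def remaining(log_end, consumer_offset):
    """Distance from the consumer's position to the end of the log."""
    return log_end - consumer_offset


if __name__ == "__main__":
    sizes = [100, 250, 50]  # made-up on-disk message sizes in bytes
    next_index = 1          # the consumer has read the first message only

    phys = physical_offsets(sizes)
    # Physical: log end is the total byte length, so the result is in bytes.
    print(remaining(sum(sizes), phys[next_index]))
    # Logical: log end is the message count, so the result is in messages.
    print(remaining(len(sizes), logical_offsets(sizes)[next_index]))
```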


On Thu, Jan 2, 2014 at 2:26 PM, Jonathan Davey <jo...@gmail.com> wrote:

> That's certainly a step in the right direction. The thing is, I'd
> ideally like to use bytes rather than just messages and that data no
> longer seems to be available in Kafka 0.8. KAFKA-656 might give me
> what I need but that seems abandoned.

Re: Measuring remaining data

Posted by Jonathan Davey <jo...@gmail.com>.
That's certainly a step in the right direction. The thing is, I'd
ideally like to use bytes rather than just messages and that data no
longer seems to be available in Kafka 0.8. KAFKA-656 might give me
what I need but that seems abandoned.

On 2 January 2014 21:26, Guozhang Wang <wa...@gmail.com> wrote:
> Hi Jonathan,
>
> Have you checked out the ConsumerOffsetChecker tool?
>
> Guozhang

Re: Measuring remaining data

Posted by Guozhang Wang <wa...@gmail.com>.
Hi Jonathan,

Have you checked out the ConsumerOffsetChecker tool?

Guozhang
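
For reference, the figure such an offset check is after is, per partition,
the log end offset minus the group's committed offset, counted in messages.
A minimal sketch of rolling that up per topic follows. It does not call
Kafka or ZooKeeper; the two input dictionaries and the function name
lag_by_topic are hypothetical, and treating a partition with no committed
offset as offset 0 is an assumption made here for simplicity.

```python
from collections import defaultdict


def lag_by_topic(log_end_offsets, consumer_offsets):
    """Sum per-partition message lag into a per-topic total.

    Both arguments map (topic, partition) -> logical offset.
    """
    lag = defaultdict(int)
    for (topic, partition), end in log_end_offsets.items():
        # Assumption: a partition with no committed offset counts from 0.
        consumed = consumer_offsets.get((topic, partition), 0)
        # Clamp at zero in case the committed offset is ahead of a stale
        # log end reading.
        lag[topic] += max(end - consumed, 0)
    return dict(lag)


if __name__ == "__main__":
    log_end = {("clicks", 0): 1200, ("clicks", 1): 900, ("orders", 0): 40}
    committed = {("clicks", 0): 1000, ("clicks", 1): 900}
    print(lag_by_topic(log_end, committed))
```

The totals are message counts, so on their own they do not answer the
question of how many bytes remain.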


On Thu, Jan 2, 2014 at 1:15 PM, Jonathan Davey <jo...@gmail.com> wrote:

> I need to provide a report that shows the total amount of unread data
> available to a consumer group for a particular set of topics.
>
> Right now I do this by adding custom offset tracking logic to the
> producers and consumers. This works but I think I can do better with
> one of the following ideas:
>
> * Running something alongside each broker that grabs the data from the
> log dir and zookeeper
> * Extending Kafka to do the same as above but expose it as a request
> through the protocol
>
> If I wanted to do these things, where would be a good place to start
> looking?