Posted to users@kafka.apache.org by Jeff Widman <je...@jeffwidman.com> on 2018/02/07 05:39:44 UTC

How to calculate consumer lag in wall-clock time by querying the broker?

I would like to monitor how far behind our consumer groups are using
wall-clock time in addition to the normal integer offset lag. This way
services that have tight latency SLAs can alert when a consumer falls
behind by N minutes.

Is there a way to do this by querying the cluster/brokers?

It's easy to get the timestamp at the high-water mark, and I can fetch the
consumer's committed offset as an integer, but I can't figure out how to
derive the timestamp that corresponds to that committed offset.

It seems there is no way to fetch the committed offset directly as a time,
or to convert the integer offset to a time. Am I missing something, or is
this truly impossible from the broker side?

We could do this by instrumenting all our consumers, but given how our
teams are structured, it'd be much simpler to monitor this by querying the
cluster. For example, if someone spins up a new consumer, we would
immediately have this metric for their service.

Cheers,
Jeff



-- 

*Jeff Widman*
jeffwidman.com <http://www.jeffwidman.com/> | 740-WIDMAN-J (943-6265)
<><

Re: How to calculate consumer lag in wall-clock time by querying the broker?

Posted by Puneet Lakhina <pu...@gmail.com>.
You could fetch the message at the consumer group's current committed
offset, read its timestamp, and compare it with the timestamp of the
message at the high-water mark. That's what I do today, so I'm also all
ears if there is a more obvious solution.
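
A minimal sketch of that approach, not a definitive implementation. It
assumes a monitoring consumer that exposes the same `seek`, `poll`, and
`end_offsets` methods as kafka-python's `KafkaConsumer`, already assigned to
the partition, and that the group's committed offset has been looked up
separately. The function names `timestamp_at` and `time_lag_ms` are
hypothetical. It also assumes the newest record sits at end offset minus
one, which may not hold exactly on compacted or transactional topics.

```python
def timestamp_at(consumer, tp, offset, timeout_ms=5000):
    """Return the timestamp (ms since epoch) of the first record fetched
    starting at `offset`, or None if nothing came back in time."""
    consumer.seek(tp, offset)
    batch = consumer.poll(timeout_ms=timeout_ms, max_records=1)
    records = batch.get(tp, [])
    return records[0].timestamp if records else None


def time_lag_ms(consumer, tp, committed_offset):
    """Estimate how far behind a group is on one partition, in milliseconds.

    `committed_offset` is the group's committed offset for `tp`, i.e. the
    offset of the next record the group has yet to process.
    """
    end_offset = consumer.end_offsets([tp])[tp]
    if committed_offset >= end_offset:
        return 0  # the group is caught up, nothing is waiting

    oldest_unread_ts = timestamp_at(consumer, tp, committed_offset)
    newest_ts = timestamp_at(consumer, tp, end_offset - 1)
    if oldest_unread_ts is None or newest_ts is None:
        return None  # could not fetch one of the two records

    # Producer-supplied timestamps are not guaranteed to be monotonic,
    # so clamp negative differences to zero.
    return max(0, newest_ts - oldest_unread_ts)
```

This compares two message timestamps, as described above. To alert on
"behind by N minutes" in real wall-clock terms, one option is to compare
the oldest unread timestamp against the current time instead, since a
partition that has stopped receiving writes would otherwise hide the lag.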
