Posted to dev@kafka.apache.org by Amitav Mohanty <am...@gmail.com> on 2019/02/04 10:41:35 UTC

Re: metric for total volume of messages per topic over a period of time

Hi Stanislav,

Thanks for the suggestions.

I am looking for exactly what you suggested.

My final objective: get the volume of data ingested in a day (or any such
time frame).

My approach: sample the BytesInPerSec value every second and sum the
samples to get the volume.

Problem with this approach:
- I need a separate process polling over JMX every second
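
For illustration, the aggregation step described above can be sketched as
follows. This is only a minimal sketch: the class name, method name and sample
values are hypothetical, and it assumes each sampled rate (bytes/sec) holds
for the whole sampling interval, which is an approximation.

```java
import java.util.List;

public class RateSumVolume {
    // Approximates the bytes ingested by treating each sampled rate
    // (bytes/sec) as constant over one sampling interval and summing.
    static double volumeBytes(List<Double> bytesInPerSecSamples, double intervalSeconds) {
        double total = 0.0;
        for (double rate : bytesInPerSecSamples) {
            total += rate * intervalSeconds;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical samples, taken once per second.
        List<Double> samples = List.of(1000.0, 1500.0, 500.0);
        System.out.println(volumeBytes(samples, 1.0)); // prints 3000.0
    }
}
```

Any sample that is missed (for example while the polling process is down)
simply drops out of the sum, which is part of why this approach is fragile.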

I am not able to use the following metric either.

kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=my_topic
FifteenMinuteRate

That is because FifteenMinuteRate is a smoothed rate, not an exact total for
the window.

My proposal:
Introduce a metric that reports the total volume of data that came into a
topic in a given period of time (15 mins).
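
One way to picture the value such a metric would report is the difference
between two readings of a cumulative byte counter taken at the window
boundaries. The sketch below is hedged accordingly: it assumes the
BytesInPerSec MBean also exposes a cumulative "Count" attribute, that the
counter restarts from zero when the broker restarts, and that the topic name
needs no quoting in the object name. The class and method names are
hypothetical, and the JMX URL must be supplied by the caller.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class TopicVolume {
    // Bytes ingested between two readings of a cumulative counter.
    // If the later reading is smaller, the counter is assumed to have been
    // reset (e.g. broker restart), so only the bytes since the reset are known.
    static long bytesBetween(long earlierCount, long laterCount) {
        return laterCount >= earlierCount ? laterCount - earlierCount : laterCount;
    }

    // Reads the cumulative counter for one topic from one broker over JMX.
    // Assumption: the MBean exposes a numeric "Count" attribute.
    static long readCount(String jmxUrl, String topic) throws Exception {
        try (JMXConnector connector = JMXConnectorFactory.connect(new JMXServiceURL(jmxUrl))) {
            MBeanServerConnection mbeans = connector.getMBeanServerConnection();
            ObjectName name = new ObjectName(
                "kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=" + topic);
            return ((Number) mbeans.getAttribute(name, "Count")).longValue();
        }
    }
}
```

With two readings taken 15 minutes apart, bytesBetween(first, second) gives
the volume for that window on a single broker; a per-topic total across the
cluster would still need the values from all brokers to be added up.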

Let me know what you think.

Regards,
Amitav

On Fri, Jan 18, 2019 at 12:47 PM Stanislav Kozlovski <st...@confluent.io>
wrote:

> Hey there Amitav, thanks for the proposal.
>
> Just to be a bit more concrete about the proposed metric: you would like to
> be able to see what the total bytes-in amount was for a given time period
> (e.g. 60 seconds)?
> Do you mind sharing the use case for such a metric? This would
> require opening a KIP, and the more examples of its usefulness we can
> give, the better its chances of being accepted.
>
> Best,
> Stanislav
>
> On Thu, Jan 17, 2019 at 12:46 PM Amitav Mohanty <amitavmohanty01@gmail.com>
> wrote:
>
> > Hi
> >
> > I have a small proposal for improvement of Kafka that I would like to
> > discuss with you. Kafka currently has rate metrics exposed per topic for
> > the number of bytes ingested. I am interested in a volume metric.
> >
> > I would like to have a direct answer to the question of how much data is
> > ingested into a topic over a period of time. Disk usage does not suffice for
> > the following reasons:
> > - Disk allocation is done in chunks, which are filled by ingested
> > messages over time.
> > - The disk can be cleared based on the retention period, but this metric can
> > potentially track volume ingested over a period longer than the retention
> > period. For example, I might have an 8 hour retention period but I might
> > want to find out how much data was ingested on a topic in a day.
> >
> > Do you see any concerns over having such a metric?
> >
> > Regards,
> > Amitav
> >
>
>
> --
> Best,
> Stanislav
>