Posted to users@kafka.apache.org by Amitav Mohanty <am...@gmail.com> on 2019/01/16 21:12:21 UTC

Total Volume metrics of Kafka

Hi

I am interested in getting the total volume of data that a topic ingested over a
period of time. Does Kafka collect any such metrics? I checked the JMX console
but only found rate metrics.

Regards,
Amitav

Re: Total Volume metrics of Kafka

Posted by Gabriele Paggi <ga...@gmail.com>.
On Thu, 17 Jan 2019 at 00:44, Peter Bukowinski <pm...@gmail.com> wrote:

> On each broker, we have a process (scheduled with cron) that polls the
> kafka jmx api every 60 seconds. It sends the metrics data to graphite (
> https://graphiteapp.org). We have graphite configured as a data source
> for grafana (https://grafana.com) and use it to build various dashboards
> to present the metrics we’re interested in.
>
> There are various jmx-to-graphite tools available. We use one written in
> house, but this one looks like it’ll do the job:
> https://github.com/logzio/jmx2graphite


Hi Peter,

We use this reporter (https://github.com/damienclaveau/kafka-graphite),
which we add to the classpath and configure in Kafka's server.properties:


    kafka.metrics.reporters=com.criteo.kafka.KafkaGraphiteMetricsReporter
    kafka.metrics.polling.interval.secs=60
    kafka.graphite.metrics.reporter.enabled=true
    kafka.graphite.metrics.host=carbon.service.consul
    kafka.graphite.metrics.port=2003
    kafka.graphite.metrics.group={{ grains['fqdn']|replace(".","_") }}
    kafka.graphite.metrics.exclude.regex=(kafka.network.*|kafka.log.*|kafka.cluster.*(InSyncReplicasCount|ReplicasCount|UnderReplicated))
    kafka.graphite.dimension.enabled.meanRate=false
    kafka.graphite.dimension.enabled.rate1m=false
    kafka.graphite.dimension.enabled.rate5m=false
    kafka.graphite.dimension.enabled.rate15m=false
    kafka.graphite.dimension.enabled.min=false
    kafka.graphite.dimension.enabled.max=false
    kafka.graphite.dimension.enabled.mean=false
    kafka.graphite.dimension.enabled.sum=false
    kafka.graphite.dimension.enabled.stddev=false
    kafka.graphite.dimension.enabled.median=false
    kafka.graphite.dimension.enabled.p75=false
    kafka.graphite.dimension.enabled.p95=false
    kafka.graphite.dimension.enabled.p98=false
    kafka.graphite.dimension.enabled.p99=false
    kafka.graphite.dimension.enabled.p999=false

That spares you from having to run a cron job and a JMX bridge. It also
works with Kafka 1.x and 2.x.
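
With the rate and histogram dimensions switched off as above, what presumably
remains is each meter's cumulative count (an assumption about this reporter's
output; Kafka's BytesInPerSec meter does expose a Count attribute over JMX).
If so, the volume ingested in a period is the sum of the increases between
counter samples. A minimal Python sketch of that idea; the function name and
the sample values are made up for illustration:

```python
def bytes_ingested(samples):
    """Sum the increases of a cumulative byte counter.

    `samples` is a time-ordered list of (timestamp_seconds, counter_value).
    A drop in the counter is treated as a broker restart: the counter is
    assumed to have restarted from zero, so the new value counts in full.
    Bytes written between the last sample before a restart and the restart
    itself are lost, so the result is a lower bound.
    """
    total = 0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        if curr >= prev:
            total += curr - prev
        else:
            # Counter went backwards: assume a reset to zero (hypothetical restart).
            total += curr
    return total


if __name__ == "__main__":
    # Illustrative samples only, one per 60-second polling interval.
    samples = [(0, 100), (60, 250), (120, 40), (180, 90)]
    print(bytes_ingested(samples))
```

Working from the counter avoids integrating a smoothed rate, at the cost of
having to handle resets when a broker restarts.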

-- 
Gabriele

Re: Total Volume metrics of Kafka

Posted by Peter Bukowinski <pm...@gmail.com>.
On each broker, we have a process (scheduled with cron) that polls the kafka jmx api every 60 seconds. It sends the metrics data to graphite (https://graphiteapp.org). We have graphite configured as a data source for grafana (https://grafana.com) and use it to build various dashboards to present the metrics we’re interested in.

There are various jmx-to-graphite tools available. We use one written in house, but this one looks like it’ll do the job: https://github.com/logzio/jmx2graphite
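
For anyone writing their own poller, the Graphite side is small: carbon's
plaintext listener accepts lines of the form "<path> <value> <timestamp>".
Below is a rough Python sketch of that last step only. It is not the in-house
tool mentioned above; the metric path, host name and port are placeholders,
and reading the value from JMX is left out.

```python
import socket
import time


def format_graphite_line(path, value, timestamp=None):
    """Build one newline-terminated line of Graphite's plaintext protocol."""
    if timestamp is None:
        timestamp = time.time()
    return f"{path} {value} {int(timestamp)}\n"


def send_to_graphite(lines, host="carbon.example.internal", port=2003, timeout=5.0):
    """Send pre-formatted lines to a carbon plaintext listener over TCP.

    The host and port defaults are placeholders, not values from this thread's
    author; adjust them to your own carbon endpoint.
    """
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)


if __name__ == "__main__":
    # Hypothetical metric path and value; a real poller would read these from JMX.
    line = format_graphite_line("kafka.broker1.my_topic.BytesInPerSec.count", 123456789)
    print(line, end="")
    # send_to_graphite([line])  # uncomment once host/port point at a real carbon
```

Run from cron every 60 seconds, something like this gives one data point per
metric per minute.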


> On Jan 16, 2019, at 2:15 PM, Amitav Mohanty <am...@gmail.com> wrote:
> 
> Peter,
> 
> Thanks for the input. I am interested in the aggregate bytes published into a
> topic. The approach of a metrics collector along with a graphing tool seems
> appealing. I could then get the volume ingested over arbitrary periods of
> time, which is exactly what I am looking for. Can you please point me to a
> metrics collector that I can use? Is it a sort of cron job that records the
> rate every minute or every 15 minutes?
> 
> Regards,
> Amitav


Re: Total Volume metrics of Kafka

Posted by Amitav Mohanty <am...@gmail.com>.
Peter,

Thanks for the input. I am interested in the aggregate bytes published into a
topic. The approach of a metrics collector along with a graphing tool seems
appealing. I could then get the volume ingested over arbitrary periods of
time, which is exactly what I am looking for. Can you please point me to a
metrics collector that I can use? Is it a sort of cron job that records the
rate every minute or every 15 minutes?

Regards,
Amitav

On Thu, Jan 17, 2019 at 3:23 AM Peter Bukowinski <pm...@gmail.com> wrote:

> Amitav,
>
> When you say total volume, do you want a topic’s size on disk, taking into
> account replication and retention, or do you want the aggregate bytes
> published into a topic? If you have a metrics collector and a graphing tool
> such as grafana, you can transform the rate metrics to a byte sum by
> applying an integral function, but those will always grow and not take into
> account deletion after the retention period.
>
> If you want metrics on how much space a topic occupies on disk, I’d
> suggest using collectd and this plugin:
> https://github.com/HubSpot/collectd-kafka-disk
>
> —
> Peter

Re: Total Volume metrics of Kafka

Posted by Peter Bukowinski <pm...@gmail.com>.
Amitav,

When you say total volume, do you want a topic’s size on disk, taking into account replication and retention, or do you want the aggregate bytes published into a topic? If you have a metrics collector and a graphing tool such as grafana, you can transform the rate metrics to a byte sum by applying an integral function, but those will always grow and not take into account deletion after the retention period.
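
To make the integral idea concrete, here is a small Python sketch that turns
sampled byte rates into a byte total with the trapezoidal rule. It assumes the
samples are (timestamp in seconds, bytes per second) pairs; the function name
and numbers are illustrative and not taken from any particular tool. Since
Kafka's one-, five- and fifteen-minute rates are moving averages, the result
is an approximation.

```python
def integrate_rate(samples):
    """Approximate total bytes from time-ordered (timestamp_s, bytes_per_s) samples.

    Uses the trapezoidal rule: each interval contributes the average of its
    two endpoint rates multiplied by the interval length in seconds.
    """
    total = 0.0
    for (t0, r0), (t1, r1) in zip(samples, samples[1:]):
        total += (r0 + r1) / 2.0 * (t1 - t0)
    return total


if __name__ == "__main__":
    # Illustrative samples only: one reading per 60-second polling interval.
    samples = [(0, 1000.0), (60, 2000.0), (120, 2000.0)]
    print(integrate_rate(samples))
```

Restricting the sample list to a start and end time gives the volume for that
window, which is the "arbitrary period" case.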

If you want metrics on how much space a topic occupies on disk, I’d suggest using collectd and this plugin: https://github.com/HubSpot/collectd-kafka-disk

—
Peter
