Posted to users@kafka.apache.org by Amitav Mohanty <am...@gmail.com> on 2019/01/16 21:12:21 UTC
Total Volume metrics of Kafka
Hi
I am interested in getting the total volume of data that a topic ingested
over a period of time. Does Kafka collect any such metrics? I checked the
JMX console but only found rate metrics.
Regards,
Amitav
Re: Total Volume metrics of Kafka
Posted by Gabriele Paggi <ga...@gmail.com>.
On Thu, 17 Jan 2019 at 00:44, Peter Bukowinski <pm...@gmail.com> wrote:
> On each broker, we have a process (scheduled with cron) that polls the
> kafka jmx api every 60 seconds. It sends the metrics data to graphite (
> https://graphiteapp.org). We have graphite configured as a data source
> for grafana (https://grafana.com) and use it to build various dashboards
> to present the metrics we’re interested in.
>
> There are various jmx-to-graphite tools available. We use one written in
> house, but this one looks like it’ll do the job:
> https://github.com/logzio/jmx2graphite
Hi Peter,
We use this reporter (https://github.com/damienclaveau/kafka-graphite),
which we add to the classpath and configure in Kafka's server.properties:
kafka.metrics.reporters=com.criteo.kafka.KafkaGraphiteMetricsReporter
kafka.metrics.polling.interval.secs=60
kafka.graphite.metrics.reporter.enabled=true
kafka.graphite.metrics.host=carbon.service.consul
kafka.graphite.metrics.port=2003
kafka.graphite.metrics.group={{ grains['fqdn']|replace(".","_") }}
kafka.graphite.metrics.exclude.regex=(kafka.network.*|kafka.log.*|kafka.cluster.*(InSyncReplicasCount|ReplicasCount|UnderReplicated))
kafka.graphite.dimension.enabled.meanRate=false
kafka.graphite.dimension.enabled.rate1m=false
kafka.graphite.dimension.enabled.rate5m=false
kafka.graphite.dimension.enabled.rate15m=false
kafka.graphite.dimension.enabled.min=false
kafka.graphite.dimension.enabled.max=false
kafka.graphite.dimension.enabled.mean=false
kafka.graphite.dimension.enabled.sum=false
kafka.graphite.dimension.enabled.stddev=false
kafka.graphite.dimension.enabled.median=false
kafka.graphite.dimension.enabled.p75=false
kafka.graphite.dimension.enabled.p95=false
kafka.graphite.dimension.enabled.p98=false
kafka.graphite.dimension.enabled.p99=false
kafka.graphite.dimension.enabled.p999=false
That spares you from having to run a cron job and a JMX bridge. It also
works with Kafka 1.x and 2.x.
--
Gabriele
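For readers following the thread: the configuration above switches off every listed rate and statistical dimension. Assuming the reporter then still emits a cumulative count for meters such as BytesInPerSec (an assumption, not confirmed in the thread), the volume ingested over a period is the sum of the counter's increases. A minimal sketch in Python; the function name and the sample values are hypothetical:

```python
from typing import Iterable, Tuple


def volume_from_counter(samples: Iterable[Tuple[float, int]]) -> int:
    """Sum the increases of a cumulative byte counter.

    samples: (unix_timestamp, counter_value) pairs, oldest first.
    A drop in the counter is treated as a restart: the counter is assumed
    to have started again from zero, so the new value is counted in full.
    """
    total = 0
    previous = None
    for _, value in samples:
        if previous is not None:
            if value >= previous:
                total += value - previous
            else:
                total += value  # counter reset, e.g. after a broker restart
        previous = value
    return total


# Hypothetical samples taken every 60 seconds, with one reset at t=180.
samples = [(0, 1_000), (60, 1_600), (120, 2_500), (180, 300), (240, 900)]
print(volume_from_counter(samples))  # prints 2400
```

Bytes written between the last sample before a restart and the restart itself are lost with this approach, so a shorter polling interval reduces the error.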
Re: Total Volume metrics of Kafka
Posted by Peter Bukowinski <pm...@gmail.com>.
On each broker, we have a process (scheduled with cron) that polls the Kafka JMX API every 60 seconds and sends the metrics data to Graphite (https://graphiteapp.org). We have Graphite configured as a data source for Grafana (https://grafana.com) and use it to build dashboards that present the metrics we’re interested in.
There are various JMX-to-Graphite tools available. We use one written in-house, but this one looks like it’ll do the job: https://github.com/logzio/jmx2graphite
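As an illustration of the sending half of such a collector, here is a minimal Python sketch of Graphite's plaintext protocol, which takes one "path value timestamp" line per metric over TCP (port 2003 is the one used elsewhere in this thread). This is not the in-house tool described above; the function names and the metric path are hypothetical, and reading the value from JMX is left out:

```python
import socket
import time
from typing import Iterable, Optional


def graphite_line(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Format one metric as a Graphite plaintext line: 'path value timestamp'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n"


def send_to_graphite(lines: Iterable[str], host: str, port: int = 2003) -> None:
    """Open a TCP connection to the carbon listener and send all lines at once."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))


if __name__ == "__main__":
    # Hypothetical metric path and value; a real collector would read the
    # value from the broker's JMX endpoint every polling interval.
    line = graphite_line("kafka.broker1.my_topic.BytesInPerSec.count", 123456, 1547673141)
    print(line, end="")
    # send_to_graphite([line], host="carbon.example.org")  # needs a reachable carbon host
```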
Re: Total Volume metrics of Kafka
Posted by Amitav Mohanty <am...@gmail.com>.
Peter,
Thanks for the input. I am interested in the aggregate bytes published into
a topic. The approach of a metrics collector along with a graphing tool
seems appealing. I could compute the volume ingested over arbitrary periods
of time, which is exactly what I am looking for. Can you please point me to
a metrics collector that I can use? Is it a sort of cron job that notes the
rate every minute or every 15 minutes?
Regards,
Amitav
Re: Total Volume metrics of Kafka
Posted by Peter Bukowinski <pm...@gmail.com>.
Amitav,
When you say total volume, do you want a topic’s size on disk, taking into account replication and retention, or do you want the aggregate bytes published into a topic? If you have a metrics collector and a graphing tool such as Grafana, you can transform the rate metrics into a byte sum by applying an integral function, but that sum will always grow and will not account for deletion after the retention period.
If you want metrics on how much space a topic occupies on disk, I’d suggest using collectd and this plugin: https://github.com/HubSpot/collectd-kafka-disk
—
Peter
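To make the integral idea concrete, here is a minimal Python sketch that turns sampled rates (bytes per second) into an approximate byte total with the trapezoidal rule. It illustrates the arithmetic only and is not how any particular graphing tool implements its integral function; the function name and sample values are hypothetical:

```python
from typing import Sequence, Tuple


def integrate_rate(samples: Sequence[Tuple[float, float]]) -> float:
    """Approximate total bytes from (unix_timestamp, bytes_per_second) samples.

    Uses the trapezoidal rule: each pair of adjacent samples contributes
    the average of the two rates multiplied by the elapsed seconds.
    """
    total = 0.0
    for (t0, r0), (t1, r1) in zip(samples, samples[1:]):
        total += (r0 + r1) / 2.0 * (t1 - t0)
    return total


# Hypothetical rate samples collected every 60 seconds.
samples = [(0, 100.0), (60, 200.0), (120, 200.0), (180, 50.0)]
print(integrate_rate(samples))  # prints 28500.0
```

The result is only as accurate as the sampling: bursts shorter than the polling interval are smoothed out, so the total is an estimate rather than an exact byte count.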