Posted to jira@kafka.apache.org by "Bruno Cadonna (Jira)" <ji...@apache.org> on 2023/02/24 08:53:00 UTC

[jira] [Assigned] (KAFKA-10484) Reduce Metrics Exposed by Streams

     [ https://issues.apache.org/jira/browse/KAFKA-10484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruno Cadonna reassigned KAFKA-10484:
-------------------------------------

    Assignee:     (was: Bruno Cadonna)

> Reduce Metrics Exposed by Streams
> ---------------------------------
>
>                 Key: KAFKA-10484
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10484
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>    Affects Versions: 2.6.0
>            Reporter: Bruno Cadonna
>            Priority: Major
>
> In our test cluster, metrics are monitored through a monitoring service. Several times, a Kafka Streams client exceeded the service's limit of 350 metrics. When a client exceeds the limit, its metrics are truncated, which can result in false alerts. For example, in our cluster we monitor the alive stream threads and trigger an alert if a stream thread dies. When the client exceeded the 350-metric limit, the alive stream threads metric was truncated, which led to a false alarm.
> The main driver of the high number of metrics is the set of metrics at task level and below, for example the state store metrics. The number of such metrics per Kafka Streams client is hard to predict since it depends on which tasks are assigned to the client. A stateful task with 5 state stores reports 5 times as many state store metrics as a stateful task with only one state store. Sometimes it is possible to report the metrics of only some state stores, but sometimes this is not an option. For example, if we want to monitor the memory usage of RocksDB per Kafka Streams client, we need to report the memory-related metrics of all RocksDB state stores of all tasks assigned to all stream threads of one client.
> One option to reduce the number of reported metrics is to add a client-level metric within Kafka Streams that aggregates some state store metrics, e.g., to monitor memory usage.
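
The aggregation idea in the description can be sketched roughly as follows. This is only an illustration in plain Java, not Kafka Streams code: the StoreMetric class, the aggregateToClientLevel method, and the metric names "memtable-bytes" and "block-cache-bytes" are all hypothetical and do not come from the issue or from the Kafka Streams API. It assumes that summing is a sensible aggregation for the metric in question, which would need to be checked per metric.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ClientLevelAggregation {

    // Hypothetical stand-in for one store-level metric sample; not a Kafka Streams type.
    static final class StoreMetric {
        final String threadId;
        final String taskId;
        final String storeName;
        final String metricName;
        final double value;

        StoreMetric(String threadId, String taskId, String storeName,
                    String metricName, double value) {
            this.threadId = threadId;
            this.taskId = taskId;
            this.storeName = storeName;
            this.metricName = metricName;
            this.value = value;
        }
    }

    // Sums all store-level samples that share a metric name, so the client
    // reports one value per metric name instead of one per state store.
    static Map<String, Double> aggregateToClientLevel(List<StoreMetric> storeMetrics) {
        Map<String, Double> totals = new TreeMap<>();
        for (StoreMetric m : storeMetrics) {
            totals.merge(m.metricName, m.value, Double::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up sample values: two threads, three tasks, four state stores.
        List<StoreMetric> samples = List.of(
            new StoreMetric("thread-1", "0_0", "store-a", "memtable-bytes", 1024.0),
            new StoreMetric("thread-1", "0_0", "store-b", "memtable-bytes", 2048.0),
            new StoreMetric("thread-1", "0_1", "store-a", "memtable-bytes", 512.0),
            new StoreMetric("thread-2", "1_0", "store-c", "memtable-bytes", 4096.0),
            new StoreMetric("thread-1", "0_0", "store-a", "block-cache-bytes", 100.0),
            new StoreMetric("thread-2", "1_0", "store-c", "block-cache-bytes", 300.0)
        );

        Map<String, Double> totals = aggregateToClientLevel(samples);
        System.out.println(samples.size() + " store-level samples reduced to "
            + totals.size() + " client-level values: " + totals);
    }
}
```

With this shape, the number of reported values depends only on the number of distinct metric names, not on how many tasks or state stores happen to be assigned to the client.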



--
This message was sent by Atlassian Jira
(v8.20.10#820010)