Posted to issues@flink.apache.org by "Simon Frei (Jira)" <ji...@apache.org> on 2022/01/31 16:31:00 UTC

[jira] [Commented] (FLINK-7935) Metrics with user supplied scope variables

    [ https://issues.apache.org/jira/browse/FLINK-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17484770#comment-17484770 ] 

Simon Frei commented on FLINK-7935:
-----------------------------------

I am running into the same discrepancy between how Flink reports metrics to Datadog and how Datadog handles them.
Looking at the code, it would be straightforward to drop any `key.value` sections from the metric identifier when there is a corresponding `key:value` tag/variable. I'd expect anyone using Datadog to want that behaviour, but if deemed necessary one could add a config option like {{metrics.reporter.dghttp.removeTagsFromIdentifier}} to control whether this is done.
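
To make the idea concrete, here is a rough sketch of the filtering step, not the reporter's actual code. The class name {{TagAwareIdentifier}}, the method name and the example identifier are all made up for illustration, and it assumes the tags are available as a plain key-to-value map and that neither keys nor values contain dots.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Flink: illustrates dropping `key.value`
// sections from a metric identifier when a matching `key:value` tag exists.
public class TagAwareIdentifier {

    // Assumption: the identifier is dot-delimited and tag keys/values contain no dots.
    public static String removeTagsFromIdentifier(String identifier, Map<String, String> tags) {
        String[] segments = identifier.split("\\.");
        List<String> kept = new ArrayList<>();
        int i = 0;
        while (i < segments.length) {
            boolean hasNext = i + 1 < segments.length;
            // Skip the pair only if this segment is a tag key AND the next
            // segment is exactly that tag's value.
            if (hasNext && segments[i + 1].equals(tags.get(segments[i]))) {
                i += 2;
            } else {
                kept.add(segments[i]);
                i++;
            }
        }
        return String.join(".", kept);
    }

    public static void main(String[] args) {
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("event_type", "click");
        // prints taskmanager.job.operator.count
        System.out.println(removeTagsFromIdentifier(
                "taskmanager.job.operator.event_type.click.count", tags));
    }
}
```

A segment is only dropped when both the key and the value match an existing tag, so identifiers without a corresponding tag are left unchanged.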

Would such a contribution be welcome?

> Metrics with user supplied scope variables
> ------------------------------------------
>
>                 Key: FLINK-7935
>                 URL: https://issues.apache.org/jira/browse/FLINK-7935
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Metrics
>    Affects Versions: 1.3.2
>            Reporter: Elias Levy
>            Priority: Not a Priority
>              Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> We use DataDog for metrics.  DD and Flink differ somewhat in how they track metrics.
> Flink names and scopes metrics together, at least by default. E.g. by default  the System scope for operator metrics is {{<host>.taskmanager.<tm_id>.<job_name>.<operator_name>.<subtask_index>}}.  The scope variables become part of the metric's full name.
> In DD the metric would be named something generic, e.g. {{taskmanager.job.operator}}, and they would be distinguished by their tag values, e.g. {{tm_id=foo}}, {{job_name=var}}, {{operator_name=baz}}.
> Flink allows you to configure the format string for system scopes, so it is possible to set the operator scope format to {{taskmanager.job.operator}}.  We do this for all scopes:
> {code}
> metrics.scope.jm: jobmanager
> metrics.scope.jm.job: jobmanager.job
> metrics.scope.tm: taskmanager
> metrics.scope.tm.job: taskmanager.job
> metrics.scope.task: taskmanager.job.task
> metrics.scope.operator: taskmanager.job.operator
> {code}
> This seems to work.  The DataDog Flink metrics plugin submits all scope variables as tags, even if they are not used within the scope format.  And it appears that internally this does not lead to metrics conflicting with each other.
> We would like to extend this to user-defined metrics, but you cannot define variables/scopes when adding a metric group or metric with the user API. With that ability, DD would hold a single metric with one tag taking many different values, rather than hundreds of metrics, each carrying just the one value we want to measure across different event types.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)