Posted to issues@flink.apache.org by "Chris Dail (JIRA)" <ji...@apache.org> on 2017/06/13 17:20:00 UTC

[jira] [Updated] (FLINK-6911) StatsD Metrics name should escape spaces

     [ https://issues.apache.org/jira/browse/FLINK-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Dail updated FLINK-6911:
------------------------------
    Description: 
The StatsDReporter does not escape spaces in the metric name. It is generally accepted that spaces in the metric name are a bad idea:

https://stackoverflow.com/questions/29674488/whitespace-in-statsd-metric-name

It should also be noted that the Flink StatsDReporter was based on the ReadyTalk StatsD implementation (a comment in the class says so). The ReadyTalk implementation does replace whitespace:
https://github.com/ReadyTalk/metrics-statsd/blob/master/metrics-statsd-common/src/main/java/com/readytalk/metrics/StatsD.java#L129
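
As a rough illustration of the kind of whitespace replacement described above, here is a minimal Java sketch. It is not the Flink or ReadyTalk code: the class name MetricNameSanitizer, the method sanitize, and the choice of '_' as the replacement character are all assumptions made for this example.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Flink or ReadyTalk.
final class MetricNameSanitizer {
    // Matches one or more consecutive whitespace characters (space, tab, newline, ...).
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    private MetricNameSanitizer() {}

    /** Replaces every run of whitespace in the name with a single '_' (assumed replacement). */
    static String sanitize(String name) {
        return WHITESPACE.matcher(name).replaceAll("_");
    }

    public static void main(String[] args) {
        // "MySink" and the metric suffix are made-up names for the demo.
        System.out.println(sanitize("turbineHeatTest.Sink- MySink.numRecordsIn"));
    }
}
```

A reporter could apply something like this to the full metric name before sending it. Which replacement character to use is a design choice for the actual fix.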

Specifically, I am integrating with Telegraf. It actually splits the name on spaces and treats these as (name, value, timestamp). It ignores everything except the name.
https://github.com/influxdata/telegraf/blob/master/plugins/parsers/graphite/parser.go#L225

Initially I found this issue when I had a space in the job name. Flink encodes the job name into the metric names as is, so when I sent these to Telegraf, all of the job-level metrics ended up in the same bucket.

Flink also uses prefixes like "Sink- <name>" and "Source- <name>" to encode sources and sinks. These do not work with Telegraf either. I end up with metrics that look like this inside Telegraf:

{noformat}
taskmanager_5e453417d87c755da6311b1940cc602f_TurbineHeatProcessor_examples_turbineHeatTest_Sink-
{noformat}

The actual name is truncated after the space.
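
To make the collision concrete, the following Java sketch models a parser that splits on spaces and keeps only the first field as the name. It is a simplified model for illustration, not Telegraf's Go implementation. The class name, the method names, and the sink names MySink and OtherSink are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified model of "split on spaces, keep the first field"; not Telegraf code.
final class SpaceSplitDemo {
    private SpaceSplitDemo() {}

    /** Returns the part of the metric name before the first space. */
    static String firstField(String metricName) {
        return metricName.trim().split(" +")[0];
    }

    /** Collects the distinct names that remain after truncation. */
    static Set<String> buckets(List<String> metricNames) {
        Set<String> result = new LinkedHashSet<>();
        for (String name : metricNames) {
            result.add(firstField(name));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two different sinks (hypothetical names) in the same job.
        List<String> names = Arrays.asList(
                "taskmanager.turbineHeatTest.Sink- MySink.numRecordsIn",
                "taskmanager.turbineHeatTest.Sink- OtherSink.numRecordsIn");
        // Both names truncate to the same prefix, so only one bucket remains.
        System.out.println(buckets(names));
    }
}
```

Under this model, metrics from different sinks are indistinguishable after truncation, which matches what I see for names containing "Sink- <name>".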

  was:
The StatsDReporter does not escape spaces in the metric name. It is generally accepted that spaces in the metric name are a bad idea:

https://stackoverflow.com/questions/29674488/whitespace-in-statsd-metric-name

Specifically, I am integrating with Telegraf. It actually splits the name on spaces and treats these as (name, value, timestamp). It ignores everything except the name.
https://github.com/influxdata/telegraf/blob/master/plugins/parsers/graphite/parser.go#L225

Initially I found this issue when I had a space in the job name. Flink encodes the job name into the metrics as is. So when I put these into telegraf, all of the job level metrics ended up with the same bucket in telegraf.

Flink also uses things like "Sink- <name>" and "Source- <name>" to encode source/sink. These also do not work with telegraf. I end up with metrics that look like this inside telegraf:

{noformat}
taskmanager_5e453417d87c755da6311b1940cc602f_TurbineHeatProcessor_examples_turbineHeatTest_Sink-
{noformat}

The actual name is truncated after the space.


> StatsD Metrics name should escape spaces 
> -----------------------------------------
>
>                 Key: FLINK-6911
>                 URL: https://issues.apache.org/jira/browse/FLINK-6911
>             Project: Flink
>          Issue Type: Improvement
>          Components: Metrics
>    Affects Versions: 1.3.0
>         Environment: StatsD Metrics with Telegraf server
>            Reporter: Chris Dail


