Posted to issues@flink.apache.org by "Chris Dail (JIRA)" <ji...@apache.org> on 2017/06/13 17:20:00 UTC
[jira] [Updated] (FLINK-6911) StatsD Metrics name should escape spaces
[ https://issues.apache.org/jira/browse/FLINK-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Dail updated FLINK-6911:
------------------------------
Description:
The StatsDReporter does not escape spaces in the metric name. It is generally accepted that spaces in the metric name are a bad idea:
https://stackoverflow.com/questions/29674488/whitespace-in-statsd-metric-name
Note also that Flink's StatsDReporter was based on the ReadyTalk StatsD implementation (a comment in the source says so), and the ReadyTalk implementation does replace whitespace:
https://github.com/ReadyTalk/metrics-statsd/blob/master/metrics-statsd-common/src/main/java/com/readytalk/metrics/StatsD.java#L129
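A minimal sketch of what such a whitespace replacement could look like in Java. The class name `MetricNameSanitizer`, the choice of `-` as the replacement character, and the sample metric name are assumptions for illustration; this is not the actual Flink or ReadyTalk code.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Flink's StatsDReporter.
final class MetricNameSanitizer {
    // Matches one or more consecutive whitespace characters (spaces, tabs, newlines).
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    private MetricNameSanitizer() {}

    // Replaces every run of whitespace with a single '-'.
    // The replacement character is an assumption; '_' would serve equally well.
    static String sanitize(String metricName) {
        return WHITESPACE.matcher(metricName).replaceAll("-");
    }

    public static void main(String[] args) {
        // "Unnamed" is a made-up operator name used only for this example.
        System.out.println(sanitize("turbineHeatTest.Sink- Unnamed.numRecordsIn"));
    }
}
```

A reporter could apply something like this to the full metric name just before writing it to the wire, so that job names and operator names containing spaces stay in one piece.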
Specifically, I am integrating with Telegraf. It splits the name on spaces, treats the resulting fields as (name, value, timestamp), and ignores everything except the first field, which it takes as the name.
https://github.com/influxdata/telegraf/blob/master/plugins/parsers/graphite/parser.go#L225
I first hit this issue with a job name that contained a space. Flink encodes the job name into the metric names as-is, so when I sent these to Telegraf, all of the job-level metrics ended up in the same bucket.
Flink also uses names like "Sink- <name>" and "Source- <name>" to encode sources and sinks. These do not work with Telegraf either. I end up with metrics that look like this inside Telegraf:
{noformat}
taskmanager_5e453417d87c755da6311b1940cc602f_TurbineHeatProcessor_examples_turbineHeatTest_Sink-
{noformat}
The actual name is truncated after the space.
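The truncation can be reproduced with a small sketch, assuming a parser that splits each line on whitespace and keeps only the first field as the name. The class `WhitespaceSplitDemo` and the sample metric names are hypothetical, and the code only mimics the behaviour described above; it is not Telegraf's parser.

```java
// Hypothetical demo for illustration; mimics a parser that reads a line as
// whitespace-separated fields and uses the first field as the metric name.
final class WhitespaceSplitDemo {
    private WhitespaceSplitDemo() {}

    // Returns the first whitespace-delimited field of the line.
    static String firstField(String line) {
        String[] fields = line.trim().split("\\s+");
        return fields[0];
    }

    public static void main(String[] args) {
        // Two distinct metrics (made-up names) that differ only after the space.
        String a = "turbineHeatTest.Sink- First.numRecordsIn";
        String b = "turbineHeatTest.Sink- Second.numRecordsOut";

        // Both collapse to the same truncated name.
        System.out.println(firstField(a));
        System.out.println(firstField(b));
    }
}
```

Because both names reduce to the same prefix, distinct metrics land in one bucket, which matches the behaviour reported for job names containing spaces.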
was:
The StatsDReporter does not escape spaces in the metric name. It is generally accepted that spaces in the metric name are a bad idea:
https://stackoverflow.com/questions/29674488/whitespace-in-statsd-metric-name
Specifically, I am integrating with Telegraf. It actually splits the name on spaces and treats these as (name, value, timestamp). It ignores everything except the name.
https://github.com/influxdata/telegraf/blob/master/plugins/parsers/graphite/parser.go#L225
Initially I found this issue when I had a space in the job name. Flink encodes the job name into the metrics as is. So when I put these into telegraf, all of the job level metrics ended up with the same bucket in telegraf.
Flink also uses things like "Sink- <name>" and "Source- <name>" to encode source/sink. These also do not work with telegraf. I end up with metrics that look like this inside telegraf:
{noformat}
taskmanager_5e453417d87c755da6311b1940cc602f_TurbineHeatProcessor_examples_turbineHeatTest_Sink-
{noformat}
The actual name is truncated after the space.
> StatsD Metrics name should escape spaces
> -----------------------------------------
>
> Key: FLINK-6911
> URL: https://issues.apache.org/jira/browse/FLINK-6911
> Project: Flink
> Issue Type: Improvement
> Components: Metrics
> Affects Versions: 1.3.0
> Environment: StatsD Metrics with Telegraf server
> Reporter: Chris Dail