Posted to user@flink.apache.org by Allen Wang <al...@gmail.com> on 2020/10/14 16:54:09 UTC

StatsD metric name prefix change for task manager after upgrading to Flink 1.11

Hello,

We noticed that after upgrading to Flink 1.11, the StatsD metric prefix
changed from the hostname to the IP address of the task manager.

The Flink job runs in a k8s cluster.

Here is an example of a metric reported to StatsD in Flink 1.10:

flink-ingest-cx-home-page-feed-flink-task-manager-7f8c7677l85pl.taskmanager.16c2dbc84eb27f336455615e642c6cdd.flink-ingest-cx-home-page-feed.Source- Custom Source.1.assigned-partitions:3.0|g

Here is an example of a metric reported to StatsD in Flink 1.11:

10.4.155.205.taskmanager.0a900ab762d7d534ea8b20e84438166b.flink-ingest-xp-xp.Source- Custom Source.0.assigned-partitions:3.0|g

This caused a problem for us as StatsD interprets the segment before
the first dot as the source. So after upgrading to 1.11, the task
manager metrics all have "10" as the source.
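
To illustrate the effect, here is a minimal Java sketch of the splitting
described above. It assumes the consumer takes everything before the first
dot as the source; the class and method names are hypothetical and are not
part of StatsD or Flink.

```java
// Hypothetical helper: mimics a consumer that treats the text before
// the first dot of a metric name as the "source".
class SourceSegmentSketch {

    static String sourceOf(String metricName) {
        int firstDot = metricName.indexOf('.');
        // With no dot at all, the whole name is treated as the source.
        return firstDot < 0 ? metricName : metricName.substring(0, firstDot);
    }

    public static void main(String[] args) {
        // Prefixes taken from the two example metrics above.
        String flink110 = "flink-ingest-cx-home-page-feed-flink-task-manager-7f8c7677l85pl"
                + ".taskmanager.16c2dbc84eb27f336455615e642c6cdd";
        String flink111 = "10.4.155.205.taskmanager.0a900ab762d7d534ea8b20e84438166b";

        // The hostname has no dots, so it survives as a whole.
        System.out.println(sourceOf(flink110));
        // The IP address is cut at its first dot, leaving only "10".
        System.out.println(sourceOf(flink111));
    }
}
```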

Is there any configuration to change this behavior back to the 1.10 version
where the prefix of the metric is the host name?

Thanks,
Allen

Re: StatsD metric name prefix change for task manager after upgrading to Flink 1.11

Posted by Nikola Hrusov <n....@gmail.com>.
Hi,

I have observed the same thing after upgrading to Flink 1.11, running in
Docker and reporting to Graphite.
Before the upgrade, the task managers used the hostname. Since 1.11 they
report their IPs.
Sadly, I did not find any resolution to my issue:
https://lists.apache.org/thread.html/r620b18d12c08d13375a390f94e0cdff26462c6e26440b31236473793%40%3Cuser.flink.apache.org%3E

Regards,
Nikola Hrusov


On Thu, Oct 15, 2020 at 3:49 PM Chesnay Schepler <ch...@apache.org> wrote:

> The TaskExecutor host being exposed is directly wired to what the RPC
> system uses for addresses, which may have changed due to FLINK-15911
> (NAT support).
>
> If the problem is purely about the periods in the IP, then I would suggest
> creating a custom reporter that extends the StatsDReporter and overrides
> filterCharacters to also replace periods.
> This also reminds me of a suggestion we got in the past where we
> automatically replace occurrences of the delimiter; let me open an issue
> for that...

Re: StatsD metric name prefix change for task manager after upgrading to Flink 1.11

Posted by Chesnay Schepler <ch...@apache.org>.
The TaskExecutor host being exposed is directly wired to what the RPC
system uses for addresses, which may have changed due to FLINK-15911
(NAT support).

If the problem is purely about the periods in the IP, then I would
suggest creating a custom reporter that extends the StatsDReporter and
overrides filterCharacters to also replace periods.
This also reminds me of a suggestion we got in the past to
automatically replace occurrences of the delimiter; let me open an issue
for that...
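
A rough sketch of the character replacement suggested above, kept free of
Flink dependencies so that it compiles on its own. The class name, the
method name, and the choice of '-' as the replacement character are
assumptions for illustration only. In an actual reporter this logic would
live in an override of filterCharacters in a subclass of StatsDReporter,
applied on top of whatever the parent method already replaces.

```java
// Hypothetical, dependency-free sketch of the replacement logic.
// In a real reporter this would sit inside an override of
// filterCharacters(String) in a class extending StatsDReporter,
// applied to the result of super.filterCharacters(input).
class PeriodFilterSketch {

    // Assumed replacement character; any character the StatsD backend
    // accepts and that is not the scope delimiter should work.
    static final char REPLACEMENT = '-';

    static String replacePeriods(String input) {
        return input.replace('.', REPLACEMENT);
    }

    public static void main(String[] args) {
        // Applied to the host component alone, the IP no longer
        // contains the '.' that is also used as the delimiter.
        System.out.println(replacePeriods("10.4.155.205")); // 10-4-155-205
    }
}
```

With the periods replaced, the first dot-separated segment of the metric
name is the whole host value again instead of "10". The custom reporter
would have to be registered in the Flink configuration in place of the
stock StatsDReporter.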
