Posted to user@flink.apache.org by "V N, Suchithra (Nokia - IN/Bangalore)" <su...@nokia.com> on 2021/03/05 14:31:16 UTC

Need information on latency metrics

Hi,

I am using Flink 1.12.1 and exploring latency metrics with Prometheus. I enabled latency tracking by adding "metrics.latency.interval: 1" to flink-conf.yaml.
I submitted a Flink streaming job with the pipeline Source->flatmap->process->sink, chained into a single task, and I can see the following latency metrics in Prometheus.

flink_taskmanager_job_latency_source_id_operator_id_operator_subtask_index_latency_count
flink_taskmanager_job_latency_source_id_operator_id_operator_subtask_index_latency

Prometheus output:
flink_taskmanager_job_latency_source_id_operator_id_operator_subtask_index_latency_count

flink_taskmanager_job_latency_source_id_operator_id_operator_subtask_index_latency_count{app="met-flink-taskmanager", host="", instance=" ", job="kubernetes-pods-insecure", job_id="3ad0b4c814836aea92c48f6baf44b8bb", job_name=" ", kubernetes_pod_name="met-flink-taskmanager-5b58cdf557-l24tc", namespace="", operator_id="3d05135cf7d8f1375d8f655ba9d20255", operator_subtask_index="0", pod_template_hash="5b58cdf557", source_id="cbc357ccb763df2852fee8c4fc7d55f2", tm_id=" "}
27804583
flink_taskmanager_job_latency_source_id_operator_id_operator_subtask_index_latency_count{app="met-flink-taskmanager", host="", instance=" ", job="kubernetes-pods-insecure", job_id="3ad0b4c814836aea92c48f6baf44b8bb", job_name=" ", kubernetes_pod_name="met-flink-taskmanager-5b58cdf557-l24tc", namespace=" ", operator_id="570f707193e0fe32f4d86d067aba243b", operator_subtask_index="0", pod_template_hash="5b58cdf557", source_id="cbc357ccb763df2852fee8c4fc7d55f2", tm_id=" "}
27804583
flink_taskmanager_job_latency_source_id_operator_id_operator_subtask_index_latency_count{app="met-flink-taskmanager", host="", instance=" ", job="kubernetes-pods-insecure", job_id="3ad0b4c814836aea92c48f6baf44b8bb", job_name=" ", kubernetes_pod_name="met-flink-taskmanager-5b58cdf557-l24tc", namespace=" ", operator_id="ba40499bacce995f15693b1735928377", operator_subtask_index="0", pod_template_hash="5b58cdf557", source_id="cbc357ccb763df2852fee8c4fc7d55f2", tm_id=" "}
27804583

flink_taskmanager_job_latency_source_id_operator_id_operator_subtask_index_latency

flink_taskmanager_job_latency_source_id_operator_id_operator_subtask_index_latency{app="met-flink-taskmanager", host="", instance=" ", job="kubernetes-pods-insecure", job_id="3ad0b4c814836aea92c48f6baf44b8bb", job_name=" ", kubernetes_pod_name="met-flink-taskmanager-5b58cdf557-l24tc", namespace=" ", operator_id="3d05135cf7d8f1375d8f655ba9d20255", operator_subtask_index="0", pod_template_hash="5b58cdf557", quantile="0.95", source_id="cbc357ccb763df2852fee8c4fc7d55f2", tm_id=" "}
0
flink_taskmanager_job_latency_source_id_operator_id_operator_subtask_index_latency{app="met-flink-taskmanager", host="", instance=" ", job="kubernetes-pods-insecure", job_id="3ad0b4c814836aea92c48f6baf44b8bb", job_name=" ", kubernetes_pod_name="met-flink-taskmanager-5b58cdf557-l24tc", namespace=" ", operator_id="3d05135cf7d8f1375d8f655ba9d20255", operator_subtask_index="0", pod_template_hash="5b58cdf557", quantile="0.98", source_id="cbc357ccb763df2852fee8c4fc7d55f2", tm_id=" "}
0.4200000000000017
flink_taskmanager_job_latency_source_id_operator_id_operator_subtask_index_latency{app="met-flink-taskmanager", host="", instance=" ", job="kubernetes-pods-insecure", job_id="3ad0b4c814836aea92c48f6baf44b8bb", job_name=" ", kubernetes_pod_name="met-flink-taskmanager-5b58cdf557-l24tc", namespace=" ", operator_id="3d05135cf7d8f1375d8f655ba9d20255", operator_subtask_index="0", pod_template_hash="5b58cdf557", quantile="0.99", source_id="cbc357ccb763df2852fee8c4fc7d55f2", tm_id=" "}
1

Could someone please explain what the values reported for these metrics mean?
Thanks
Suchithra


RE: Need information on latency metrics

Posted by "V N, Suchithra (Nokia - IN/Bangalore)" <su...@nokia.com>.
Hi Timo,

Yes, I have gone through the link. However, the documentation gives a description for the other metrics.
For example:
numBytesOut - The total number of bytes this task has emitted.
lastCheckpointSize - The total size of the last checkpoint (in bytes).

For the latency metrics I don't see such a description, which makes it difficult to understand what the count listed for each operator represents, how it increments, and what the values mean.
It would be helpful if more information were provided about these metrics.
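
For readers trying to interpret the output quoted at the top of this thread, a minimal Python sketch follows. It rests on assumptions that nobody in the thread confirms: that the Prometheus reporter exposes each Flink latency histogram as a "_count" series plus one series per quantile, that the histogram receives one sample per latency marker arriving from the given source_id, and that the values are in milliseconds, the same unit as metrics.latency.interval. The helper names (parse_series, quantiles_by_operator, seconds_of_markers) are hypothetical and are not part of Flink or Prometheus.

```python
import re

# Matches label pairs such as operator_id="3d05..." inside the braces.
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_series(series):
    """Split 'name{k="v", ...}' into (metric_name, labels_dict)."""
    name, _, rest = series.partition("{")
    return name.strip(), dict(LABEL_RE.findall(rest))


def quantiles_by_operator(samples):
    """Group (series, value) pairs into {operator_id: {quantile: value}}."""
    grouped = {}
    for series, value in samples:
        _, labels = parse_series(series)
        if "quantile" in labels:
            grouped.setdefault(labels["operator_id"], {})[labels["quantile"]] = value
    return grouped


def seconds_of_markers(count, interval_ms):
    """Rough emission time for `count` markers from ONE source subtask.

    Assumption: one marker is emitted every `interval_ms` milliseconds and
    each marker adds exactly one sample to the histogram.
    """
    return count * interval_ms / 1000.0


# Series copied from the thread, shortened to the labels used here.
METRIC = "flink_taskmanager_job_latency_source_id_operator_id_operator_subtask_index_latency"
OP = "3d05135cf7d8f1375d8f655ba9d20255"
SRC = "cbc357ccb763df2852fee8c4fc7d55f2"
samples = [
    (f'{METRIC}{{operator_id="{OP}", quantile="{q}", source_id="{SRC}"}}', v)
    for q, v in [("0.95", 0.0), ("0.98", 0.4200000000000017), ("0.99", 1.0)]
]

result = quantiles_by_operator(samples)
elapsed = seconds_of_markers(27804583, 1)
print(result)
print(elapsed)
```

Under those assumptions, a count of 27804583 at a 1 ms interval would correspond to roughly 27,800 seconds (about 7.7 hours) of marker emission from a single source subtask, and the quantile series would say that 99% of the markers from that source reached the operator within about 1 ms.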

-----Original Message-----
From: Timo Walther <tw...@apache.org> 
Sent: Friday, March 5, 2021 8:52 PM
To: user@flink.apache.org
Subject: Re: Need information on latency metrics

Hi Suchithra,

did you see this section in the docs?

https://ci.apache.org/projects/flink/flink-docs-stable/ops/metrics.html#latency-tracking

Regards,
Timo



Re: Need information on latency metrics

Posted by Timo Walther <tw...@apache.org>.
Hi Suchithra,

did you see this section in the docs?

https://ci.apache.org/projects/flink/flink-docs-stable/ops/metrics.html#latency-tracking

Regards,
Timo
