Posted to user@flink.apache.org by gaurav kulkarni <ku...@yahoo.com> on 2021/08/26 02:22:20 UTC

Identify metrics belonging to the "same" task manager in kubernetes

Hi,
We have multiple Flink clusters running in Kubernetes, and we plan to enable Prometheus on these clusters. The metrics Flink emits appear to have the following format:
"flink_taskmanager_Status_JVM_GarbageCollector_G1_Young_Generation_Time{host="10_244_2_6",tm_id="10_244_2_6:6122_2e3d7a",} 65.0"

1. Since task managers can get moved across multiple pods, is there a way to uniquely identify metrics coming from the "same" task manager after it has moved to a different pod?
2. Is it possible to add formatting to the names of the metrics (for example, a suffix or prefix)?
Appreciate your help.
Thanks,
Gaurav
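
For reference, the sample line above follows the Prometheus text exposition layout: a metric name, a set of labels in braces, and a value. The following is a minimal Python sketch, not part of Flink or of any Prometheus client library, that splits such a line into its parts so the host and tm_id labels can be inspected. The names parse_sample, LINE_RE and LABEL_RE are made up for illustration, and the parser assumes label values contain no escaped quotes.

```python
import re

# Sample line copied from the post above (note the trailing comma inside the braces).
SAMPLE = (
    'flink_taskmanager_Status_JVM_GarbageCollector_G1_Young_Generation_Time'
    '{host="10_244_2_6",tm_id="10_244_2_6:6122_2e3d7a",} 65.0'
)

# Hypothetical helper patterns: metric name, label block in braces, then the value.
LINE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(?P<key>[A-Za-z_][A-Za-z0-9_]*)="(?P<val>[^"]*)"')


def parse_sample(line):
    """Split one exposition line into (metric name, label dict, float value)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a labelled sample line: {line!r}")
    labels = {
        m.group("key"): m.group("val")
        for m in LABEL_RE.finditer(match.group("labels"))
    }
    return match.group("name"), labels, float(match.group("value"))


if __name__ == "__main__":
    name, labels, value = parse_sample(SAMPLE)
    print(name)
    print(labels)
    print(value)
```

In this sample both labels carry the pod IP (10_244_2_6), so neither would be expected to stay the same if the task manager were restarted on a pod with a different address.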

Re: Identify metrics belonging to the "same" task manager in kubernetes

Posted by gaurav kulkarni <ku...@yahoo.com>.
Hi,
I have another question: what mechanisms are usually used to correlate Prometheus metrics from Flink running on Kubernetes?
Thanks,
Gaurav

Re: Identify metrics belonging to the "same" task manager in kubernetes

Posted by gaurav kulkarni <ku...@yahoo.com>.
Thanks for the response!
For #2, custom labels would also work for our case.
Thanks,
Gaurav


Re: Identify metrics belonging to the "same" task manager in kubernetes

Posted by Chesnay Schepler <ch...@apache.org>.
1) As things stand, there is no way to accomplish this.

2) Yes (Datadog, Graphite), but if you are happy with Prometheus I would
instead recommend forking the reporter and adjusting it accordingly. Would
only custom prefixes/suffixes work for your use case, or would a custom
label also work?

3) Sure, there are plenty of other metrics:
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/metrics/#io
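
To make the two options in 2) concrete, here is a small Python sketch of how a name prefix and a custom label would each change the sample line from the original post. It only rewrites exposition text and is not how the Flink PrometheusReporter (or a fork of it) works internally. The function names add_prefix and add_label, the label key "cluster" and the value "orders" are hypothetical, chosen purely for illustration.

```python
# Sample line copied from the original post.
SAMPLE = (
    'flink_taskmanager_Status_JVM_GarbageCollector_G1_Young_Generation_Time'
    '{host="10_244_2_6",tm_id="10_244_2_6:6122_2e3d7a",} 65.0'
)


def add_prefix(line, prefix):
    """Return the sample line with `prefix` prepended to the metric name."""
    return prefix + line


def add_label(line, key, value):
    """Return the sample line with an extra key="value" label appended."""
    name, rest = line.split("{", 1)
    labels, tail = rest.rsplit("}", 1)
    labels = labels.rstrip(",")  # drop the trailing comma seen in the sample
    extra = f'{key}="{value}"'
    labels = f"{labels},{extra}" if labels else extra
    return f"{name}{{{labels}}}{tail}"


if __name__ == "__main__":
    # Hypothetical prefix and label; neither is emitted by Flink itself.
    print(add_prefix(SAMPLE, "orders_"))
    print(add_label(SAMPLE, "cluster", "orders"))
```

A prefix changes the metric name itself, so each cluster ends up with differently named series, whereas a label keeps one metric name across clusters and lets queries filter or group by the label.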




Re: Identify metrics belonging to the "same" task manager in kubernetes

Posted by gaurav kulkarni <ku...@yahoo.com>.
Thanks for the response!
1) That's correct. I was wondering whether there is a logical ID for a task manager that stays the same even if it is moved to a different pod, and that is part of the emitted metrics. The scenario: TM#1 was running on pod#1 and is then moved to pod#2; can we correlate the metrics from before and after the move? Please let me know if there is a way to correlate them.
2) Are there any other reporters where it is possible?
3) I have one more question. It looks like most of the metrics emitted are about resource usage (memory/CPU). Does Flink emit any job-level metrics (for example, records processed, processing rate, overall latency)?
Thanks,
Gaurav

Re: Identify metrics belonging to the "same" task manager in kubernetes

Posted by Chesnay Schepler <ch...@apache.org>.
1) Can you clarify what you mean by "same"? Are you referring to
something like an index, i.e., "This is TaskManager #2", without being
tied to a specific process?

2) It is not possible for the PrometheusReporter.
