Posted to issues@flink.apache.org by "Yadong Xie (Jira)" <ji...@apache.org> on 2019/09/19 10:17:00 UTC

[jira] [Updated] (FLINK-14127) Better BackPressure Detection in WebUI

     [ https://issues.apache.org/jira/browse/FLINK-14127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yadong Xie updated FLINK-14127:
-------------------------------
    Description: 
According to the [Document|https://ci.apache.org/projects/flink/flink-docs-release-1.9/monitoring/back_pressure.html], the backpressure monitor is only triggered on request and is currently not available via metrics. As a result, the web UI has no way to show the backpressure state of all vertices at the same time; users have to click each vertex to see its backpressure state.

!屏幕快照 2019-09-19 下午6.00.05.png|width=510,height=197!

In Flink 1.9.0 and above, four metrics are available (outPoolUsage, inPoolUsage, floatingBuffersUsage, exclusiveBuffersUsage). We can use these metrics to detect possible backpressure, and then use the backpressure REST API to confirm it.

Here is a table taken from [https://flink.apache.org/2019/07/23/flink-network-stack-2.html]:

!屏幕快照 2019-09-19 下午6.00.57.png|width=516,height=304!
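
A rough sketch of how the frontend could turn these metrics into a "possible backpressure" hint, written in TypeScript. It is a simplification, not the full table: it only looks at outPoolUsage and inPoolUsage, assumes each value is a ratio between 0 and 1, and uses a made-up threshold. The names BufferUsage, BackpressureHint, HIGH_USAGE, classifyVertex and verticesToConfirm are hypothetical.

```typescript
// All names and the threshold below are hypothetical, chosen for illustration.
interface BufferUsage {
  outPoolUsage: number; // assumed to be a ratio between 0 and 1
  inPoolUsage: number;
  floatingBuffersUsage: number;
  exclusiveBuffersUsage: number;
}

type BackpressureHint = 'ok' | 'possibly-backpressured' | 'possible-bottleneck';

// Assumed cut-off for "high" usage; not a value defined by Flink.
const HIGH_USAGE = 0.9;

function classifyVertex(usage: BufferUsage): BackpressureHint {
  // Full output buffers: the task cannot send, so it is likely backpressured.
  if (usage.outPoolUsage >= HIGH_USAGE) {
    return 'possibly-backpressured';
  }
  // Full input buffers but free output buffers: the task may be slowing down
  // its upstream tasks.
  if (usage.inPoolUsage >= HIGH_USAGE) {
    return 'possible-bottleneck';
  }
  return 'ok';
}

// Returns the ids of the vertices that are worth confirming through the
// on-request backpressure REST API.
function verticesToConfirm(usageByVertex: Record<string, BufferUsage>): string[] {
  return Object.keys(usageByVertex).filter(
    (id) => classifyVertex(usageByVertex[id]) !== 'ok'
  );
}
```

Only the vertices returned by verticesToConfirm would then need a backpressure request, instead of one request per vertex.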

 

We can display the possible backpressure status on the vertex graph, so that users can see the backpressure state of all vertices at once and quickly locate potential problems.

 

!屏幕快照 2019-09-19 下午6.01.43.png|width=572,height=277!

 

REST API needed:

Add the outPoolUsage, inPoolUsage, floatingBuffersUsage and exclusiveBuffersUsage metrics for each vertex to the /jobs/:jobId API.
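
To make the request concrete, here is one possible shape for a vertex entry as the frontend might consume it, in TypeScript. This is only an assumption for discussion: where the four fields would live in the response is not decided here, and the names ProposedBufferMetrics, VertexEntry and readBufferMetrics are hypothetical.

```typescript
// Hypothetical shape: the four fields are the proposed additions and are not
// part of the current /jobs/:jobId response.
interface ProposedBufferMetrics {
  outPoolUsage: number;
  inPoolUsage: number;
  floatingBuffersUsage: number;
  exclusiveBuffersUsage: number;
}

// Minimal view of a vertex entry; only the fields this sketch needs.
interface VertexEntry {
  id: string;
  name: string;
  metrics: Partial<ProposedBufferMetrics>;
}

// Reads the four values defensively: a server without the proposed change
// would not send them, so the caller gets null instead of partial data.
function readBufferMetrics(vertex: VertexEntry): ProposedBufferMetrics | null {
  const { outPoolUsage, inPoolUsage, floatingBuffersUsage, exclusiveBuffersUsage } = vertex.metrics;
  if (
    outPoolUsage === undefined ||
    inPoolUsage === undefined ||
    floatingBuffersUsage === undefined ||
    exclusiveBuffersUsage === undefined
  ) {
    return null;
  }
  return { outPoolUsage, inPoolUsage, floatingBuffersUsage, exclusiveBuffersUsage };
}
```

Returning null when any field is missing would let the vertex graph fall back to the current behaviour (no hint shown) against servers that do not expose the metrics.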


> Better BackPressure Detection in WebUI
> --------------------------------------
>
>                 Key: FLINK-14127
>                 URL: https://issues.apache.org/jira/browse/FLINK-14127
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Web Frontend
>    Affects Versions: 1.10.0
>            Reporter: Yadong Xie
>            Priority: Major
>             Fix For: 1.10.0
>
>         Attachments: 屏幕快照 2019-09-19 下午6.00.05.png, 屏幕快照 2019-09-19 下午6.00.57.png, 屏幕快照 2019-09-19 下午6.01.43.png



--
This message was sent by Atlassian Jira
(v8.3.4#803005)