Posted to issues@flink.apache.org by "Piotr Nowojski (Jira)" <ji...@apache.org> on 2019/11/15 09:22:00 UTC
[jira] [Commented] (FLINK-14712) Improve back-pressure reporting mechanism
[ https://issues.apache.org/jira/browse/FLINK-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974955#comment-16974955 ]
Piotr Nowojski commented on FLINK-14712:
----------------------------------------
Thanks! Would you like to be assigned to all of the tickets?
> Improve back-pressure reporting mechanism
> -----------------------------------------
>
> Key: FLINK-14712
> URL: https://issues.apache.org/jira/browse/FLINK-14712
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Metrics, Runtime / Network, Runtime / REST
> Reporter: lining
> Assignee: lining
> Priority: Major
> Attachments: image-2019-11-12-14-30-16-130.png
>
>
> h4. (1) The current monitor is heavy-weight.
> * Backpressure monitoring works by repeatedly taking stack trace samples of your running tasks.
> h4. (2) It is difficult to find out which vertex is the source of backpressure.
> * To judge whether a vertex is the source of backpressure, users need to know the network metrics of both that vertex and its upstream vertex. Currently, users have to record this information themselves.
> h3. Proposed Changes
> 1. expose the new mechanism implemented in FLINK-14472 as an "is back-pressured" metric.
> 2. show the vertex that is the source of backpressure for the job.
> 3. expose network pool usage in IOMetricsInfo, which makes it possible to tell:
> # whether a subtask is not back-pressured itself but is causing backpressure (full input, empty output)
> # by comparing exclusive/floating buffer usage, whether all channels are back-pressured or only some of them
> {code:java}
> public final class IOMetricsInfo {
>     private final float outPoolUsage;
>     private final float inputExclusiveBuffersUsage;
>     private final float inputFloatingBuffersUsage;
> }
> {code}
> JobDetailsInfo.JobVertexDetailsInfo merges the subtask values using Math.max. (Note: outPoolUsage is taken from the upstream vertex.)
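
The Math.max merge and the "full input, empty output" check described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Flink code: PoolUsage and BackPressureSketch are hypothetical names, the FULL and EMPTY thresholds are invented for the example (the issue does not define them), "full input" is approximated as the larger of the two input usages, and outPoolUsage is treated as a plain value without modelling where it is read from.

```java
import java.util.List;

/** Hypothetical stand-in for the three proposed fields; not Flink's IOMetricsInfo. */
final class PoolUsage {
    final float outPoolUsage;
    final float inputExclusiveBuffersUsage;
    final float inputFloatingBuffersUsage;

    PoolUsage(float outPoolUsage, float inputExclusiveBuffersUsage, float inputFloatingBuffersUsage) {
        this.outPoolUsage = outPoolUsage;
        this.inputExclusiveBuffersUsage = inputExclusiveBuffersUsage;
        this.inputFloatingBuffersUsage = inputFloatingBuffersUsage;
    }
}

final class BackPressureSketch {
    // Assumed thresholds for "full" and "empty"; the issue does not specify any.
    static final float FULL = 0.9f;
    static final float EMPTY = 0.1f;

    /** Merges subtask values into one vertex-level value, taking the maximum of each field. */
    static PoolUsage mergeByMax(List<PoolUsage> subtasks) {
        float out = 0f;
        float exclusive = 0f;
        float floating = 0f;
        for (PoolUsage u : subtasks) {
            out = Math.max(out, u.outPoolUsage);
            exclusive = Math.max(exclusive, u.inputExclusiveBuffersUsage);
            floating = Math.max(floating, u.inputFloatingBuffersUsage);
        }
        return new PoolUsage(out, exclusive, floating);
    }

    /** Full input but empty output: not back-pressured itself, yet possibly the cause. */
    static boolean looksLikeSource(PoolUsage u) {
        // Assumption: "full input" means either input usage reaches the FULL threshold.
        float input = Math.max(u.inputExclusiveBuffersUsage, u.inputFloatingBuffersUsage);
        return input >= FULL && u.outPoolUsage <= EMPTY;
    }
}
```

Because each field is merged independently, the vertex-level value can combine the input usage of one subtask with the output usage of another, so a check like looksLikeSource is probably more meaningful per subtask than on the merged value.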
--
This message was sent by Atlassian Jira
(v8.3.4#803005)