Posted to commits@flink.apache.org by ch...@apache.org on 2019/01/31 12:06:22 UTC

[flink] branch release-1.7 updated: [FLINK-11473][metrics][docs] Clarify documentation on Latency Tracking

This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a commit to branch release-1.7
in repository https://gitbox.apache.org/repos/asf/flink.git


The following commit(s) were added to refs/heads/release-1.7 by this push:
     new e1e1016  [FLINK-11473][metrics][docs] Clarify documentation on Latency Tracking
e1e1016 is described below

commit e1e10163da39030064e4c5ad06c6c645345b619c
Author: Konstantin Knauf <kn...@gmail.com>
AuthorDate: Thu Jan 31 13:04:35 2019 +0100

    [FLINK-11473][metrics][docs] Clarify documentation on Latency Tracking
---
 docs/monitoring/metrics.md | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/docs/monitoring/metrics.md b/docs/monitoring/metrics.md
index bfa0576..d9c8bc2 100644
--- a/docs/monitoring/metrics.md
+++ b/docs/monitoring/metrics.md
@@ -1627,16 +1627,18 @@ bypassing them. In particular the markers are not accounting for the time record
 Only if operators are not able to accept new records, thus they are queuing up, the latency measured using
 the markers will reflect that.
 
-All intermediate operators keep a list of the last `n` latencies from each source to compute 
-a latency distribution.
-The sink operators keep a list from each source, and each parallel source instance to allow detecting 
-latency issues caused by individual machines.
+The `LatencyMarker`s are used to derive a distribution of the latency between the sources of the topology and each 
+downstream operator. These distributions are reported as histogram metrics. The granularity of these distributions can 
+be controlled in the [Flink configuration]({{ site.baseurl }}/ops/config.html#metrics-latency-interval). For the highest 
+granularity, `subtask`, Flink will derive the latency distribution between every source subtask and every downstream 
+subtask, which results in a quadratic (in terms of the parallelism) number of histograms. 
 
 Currently, Flink assumes that the clocks of all machines in the cluster are in sync. We recommend setting
 up an automated clock synchronisation service (like NTP) to avoid false latency results.
 
 <span class="label label-danger">Warning</span> Enabling latency metrics can significantly impact the performance
-of the cluster. It is highly recommended to only use them for debugging purposes.
+of the cluster (in particular for `subtask` granularity). It is highly recommended to only use them for debugging 
+purposes.
 
 ## REST API integration