You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@beam.apache.org by "Antony Mayi (JIRA)" <ji...@apache.org> on 2017/06/01 16:19:04 UTC
[jira] [Created] (BEAM-2398) Increasing latency within DirectRunner
caused by cumulated TransformWatermarks
Antony Mayi created BEAM-2398:
---------------------------------
Summary: Increasing latency within DirectRunner caused by cumulated TransformWatermarks
Key: BEAM-2398
URL: https://issues.apache.org/jira/browse/BEAM-2398
Project: Beam
Issue Type: Bug
Components: runner-direct
Affects Versions: 2.0.0
Reporter: Antony Mayi
Assignee: Thomas Groh
Attachments: LatencyTest.java
Over the time the end-to-end latency of a pipeline running on DirectRunner is significantly increasing.
This is caused by ever growing sets of:
* {{WatermarkManager.TransformWatermarks.inputWatermark.pendingElements}}
* {{WatermarkManager.TransformWatermarks.synchronizedProcessingInputWatermark.pendingBundles}}
That means calls to {{WatermarkManager.TransformWatermarks.refresh()}} which need to iterate through that collections take longer and longer and the latency is growing.
I believe it is the line {{WaterMark.updatePending()}} line:
{quote}
if (input != null) {
// Add the unprocessed inputs
completedTransform.addPending(result.getUnprocessedInputs());
{quote}
that's adding the items that are never removed.
See attached demo code showing the increasing latency.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)