Posted to commits@beam.apache.org by "Aviem Zur (JIRA)" <ji...@apache.org> on 2017/05/24 17:55:04 UTC

[jira] [Created] (BEAM-2359) SparkTimerInternals inputWatermarkTime does not get updated in cluster mode

Aviem Zur created BEAM-2359:
-------------------------------

             Summary: SparkTimerInternals inputWatermarkTime does not get updated in cluster mode
                 Key: BEAM-2359
                 URL: https://issues.apache.org/jira/browse/BEAM-2359
             Project: Beam
          Issue Type: Bug
          Components: runner-spark
            Reporter: Aviem Zur
            Assignee: Amit Sela


{{SparkTimerInternals#inputWatermarkTime}} does not get updated in cluster mode.

As a result, windows are never closed and triggers based on the watermark do not fire. State keeps growing in memory and processing time keeps increasing, which eventually crashes the application.

The root cause is a call from within the {{updateStateByKey}} operation in [SparkGroupAlsoByWindowViaWindowSet|https://github.com/apache/beam/blob/master/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L241-L242], which tries to access a static reference to a {{GlobalWatermarkHolder}} broadcast variable. In cluster mode, the executor runs in its own JVM, where this static reference is a different one and is null. (This works in local mode because the executor and the driver share the same JVM.)

The fix is not trivial. Even if we use the broadcast correctly, broadcast variables can't be used from within {{updateStateByKey}}: it is a {{DStream}} operator and not an {{RDD}} operator, so the broadcast will not be updated every micro-batch but will instead retain its initial value.
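
To illustrate the "retains the same initial value" behaviour described above, here is a minimal plain-Java sketch. It does not use Spark or Beam at all: {{DriverWatermark}}, {{snapshotAtBuildTime}} and {{readPerBatch}} are hypothetical names invented for this illustration, and the loop only stands in for successive micro-batches. It models the assumption that a value read once when the streaming graph is defined stays fixed, whereas a value re-read for each batch follows the watermark.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

public class WatermarkCaptureSketch {

    /** Hypothetical stand-in for the driver-side watermark; not a Beam or Spark class. */
    private static final class DriverWatermark {
        private long millis = 0L;

        long current() {
            return millis;
        }

        void advance(long deltaMillis) {
            millis += deltaMillis;
        }
    }

    /** The watermark is read once, when the "pipeline" is defined, and baked into the function. */
    public static List<Long> snapshotAtBuildTime(int batches) {
        DriverWatermark driver = new DriverWatermark();
        final long snapshot = driver.current(); // read exactly once, before any batch runs
        LongSupplier updateFn = () -> snapshot;

        List<Long> seen = new ArrayList<>();
        for (int i = 0; i < batches; i++) {
            driver.advance(1000L);           // the driver's watermark moves forward...
            seen.add(updateFn.getAsLong());  // ...but the function still sees the old value
        }
        return seen;
    }

    /** The watermark is re-read for every simulated micro-batch before the function runs. */
    public static List<Long> readPerBatch(int batches) {
        DriverWatermark driver = new DriverWatermark();

        List<Long> seen = new ArrayList<>();
        for (int i = 0; i < batches; i++) {
            driver.advance(1000L);
            final long snapshot = driver.current(); // fresh read for this batch
            LongSupplier updateFn = () -> snapshot;
            seen.add(updateFn.getAsLong());
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(snapshotAtBuildTime(3)); // prints [0, 0, 0]
        System.out.println(readPerBatch(3));        // prints [1000, 2000, 3000]
    }
}
```

The first method corresponds to the situation reported here, where the function passed to the stateful operator keeps seeing the initial watermark. The second only shows what a per-batch refresh would look like in this toy model; it is not a proposed fix for the runner.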



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)