Posted to commits@beam.apache.org by "Aviem Zur (JIRA)" <ji...@apache.org> on 2017/08/28 07:50:00 UTC

[jira] [Created] (BEAM-2812) Dropped windows counters / log prints no longer working

Aviem Zur created BEAM-2812:
-------------------------------

             Summary: Dropped windows counters / log prints no longer working
                 Key: BEAM-2812
                 URL: https://issues.apache.org/jira/browse/BEAM-2812
             Project: Beam
          Issue Type: Bug
          Components: runner-spark
            Reporter: Aviem Zur
            Assignee: Amit Sela


Aggregators were removed from the Spark runner in https://github.com/apache/beam/pull/2838, which caused a regression in the dropped windows counters and log prints.

{{CounterCell}} instances are created ad hoc instead of using the {{Metrics}} class static factory methods: [SparkGroupAlsoByWindowViaWindowSet.java#L213-L219|https://github.com/apache/beam/blob/v2.1.0/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L213-L219]
The context in which the metrics are reported isn't taken into account. In addition, these counters are passed to a lazily evaluated iterator ([SparkGroupAlsoByWindowViaWindowSet.java#L221-L223|https://github.com/apache/beam/blob/v2.1.0/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L221-L223]), so the subsequent code that inspects them always runs immediately after initialization, before the iterator has been consumed and the counters populated. As a result, the conditional statements never see populated counters and the log prints never happen ([SparkGroupAlsoByWindowViaWindowSet.java#L323-L333|https://github.com/apache/beam/blob/v2.1.0/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L323-L333]).
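
To illustrate the lazy-evaluation problem in isolation, here is a minimal, self-contained sketch. It is not the Spark runner code: {{DroppedCounter}}, {{dropExpired}} and the rule that negative values count as "expired" are hypothetical stand-ins chosen for the example, and no Beam or Spark classes are used. It only shows that a counter incremented inside a lazy iterator is still zero when it is read before the iterator is consumed.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicLong;

public class LazyCounterDemo {

  /** Hypothetical stand-in for a metrics cell; this is NOT Beam's CounterCell. */
  static final class DroppedCounter {
    private final AtomicLong value = new AtomicLong();
    void inc() { value.incrementAndGet(); }
    long get() { return value.get(); }
  }

  /**
   * Returns a lazy iterator that skips "expired" elements (assumed here to be
   * negative numbers) and increments the counter only while being consumed.
   */
  static Iterator<Integer> dropExpired(
      final Iterator<Integer> input, final DroppedCounter dropped) {
    return new Iterator<Integer>() {
      private Integer next;
      private boolean ready;

      @Override
      public boolean hasNext() {
        // Advance until a non-expired element is buffered or input is exhausted.
        while (!ready && input.hasNext()) {
          Integer candidate = input.next();
          if (candidate < 0) {
            dropped.inc(); // side effect happens only during iteration
          } else {
            next = candidate;
            ready = true;
          }
        }
        return ready;
      }

      @Override
      public Integer next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        ready = false;
        return next;
      }
    };
  }

  /** Reads the counter right after wrapping the input: nothing has been counted yet. */
  static long countBeforeConsumption(List<Integer> input) {
    DroppedCounter dropped = new DroppedCounter();
    Iterator<Integer> lazy = dropExpired(input.iterator(), dropped);
    return dropped.get(); // the iterator was never advanced
  }

  /** Reads the counter only after the lazy iterator has been fully drained. */
  static long countAfterConsumption(List<Integer> input) {
    DroppedCounter dropped = new DroppedCounter();
    Iterator<Integer> lazy = dropExpired(input.iterator(), dropped);
    while (lazy.hasNext()) {
      lazy.next();
    }
    return dropped.get();
  }

  public static void main(String[] args) {
    List<Integer> input = Arrays.asList(1, -2, 3, -4, 5);
    System.out.println("before consumption: " + countBeforeConsumption(input));
    System.out.println("after consumption:  " + countAfterConsumption(input));
  }
}
```

In this sketch, a check such as "if the dropped count is greater than zero, log it" placed right after {{dropExpired}} would never fire, which mirrors the behaviour described above. Any real fix would presumably need to read the counter after the iterator has been consumed, or report through a mechanism that is updated at consumption time.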

Additionally, {{org.apache.beam.runners.core.LateDataUtils#dropExpiredWindows}} now takes a {{CounterCell}} as a parameter. {{CounterCell}} is a metrics implementation class and should generally not be used elsewhere (this is also mentioned in its Javadoc). We should look into changing this method to use something else, and perhaps make {{CounterCell}} and similar classes package-private (changing runner code that uses them to be in the same package).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)