You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@beam.apache.org by "Aljoscha Krettek (JIRA)" <ji...@apache.org> on 2017/08/07 18:17:00 UTC

[jira] [Commented] (BEAM-2140) Fix SplittableDoFn ValidatesRunner tests in FlinkRunner

    [ https://issues.apache.org/jira/browse/BEAM-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116972#comment-16116972 ] 

Aljoscha Krettek commented on BEAM-2140:
----------------------------------------

 [~jkff] [~kenn] [~lzljs3620320] Returning to this after a bit of a break. I did a simpler implementation that does not duplicate the {{ProcessFn}} code but instead blocks shutdown in the {{DoFnOperator}} while there are pending processing-time timers.

This solution works but is still flaky in tests (and in edge cases in the real world). The reason is still that processing-time timers don't hold back any watermark (neither the input nor the output watermark, if I'm correct). The situation is this (in the basic tests): we have a {{Create}}, a {{SDF}}, and a validating {{PAssert}}
{{code}}
Create -> SDF -> PAssert
{{code}}

In failure cases this happens: 1) {{Create}} emits some elements, 2) the {{SDF}} processes some elements, then yields and a processing-time timer is set for processing the remainder of the restriction, 3) {{Create}} finishes, the watermark goes to +Inf 4) the watermark "passes" (un-held) thought the SDF, 4) the watermark triggers computation at the {{PAssert}} and this notices that we didn't receive all expected data.

How does this work in the Dataflow runner if processing-time timers don't hold back the watermark. Or is there a custom implementation for SDF in the Dataflow runner?

> Fix SplittableDoFn ValidatesRunner tests in FlinkRunner
> -------------------------------------------------------
>
>                 Key: BEAM-2140
>                 URL: https://issues.apache.org/jira/browse/BEAM-2140
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
>
> As discovered as part of BEAM-1763, there is a failing SDF test. We disabled the tests to unblock the open PR for BEAM-1763.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)