You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@beam.apache.org by "Maximilian Michels (Jira)" <ji...@apache.org> on 2020/04/21 08:05:00 UTC

[jira] [Commented] (BEAM-9794) Flink pipeline with RequiresStableInput fails after Short.MAX_VALUE checkpoints.

    [ https://issues.apache.org/jira/browse/BEAM-9794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17088401#comment-17088401 ] 

Maximilian Michels commented on BEAM-9794:
------------------------------------------

Good catch! I think we have to resort to only using a single state cell for buffering on checkpoints, instead of using a new one for every checkpoint. I was under the assumption that, if the state cell was cleared, it would not be checkpointed but that does not seem to be the case.

Actually, this should be easy to fix by using Flink's namespacing instead of creating a new state cell. We currently only use the VoidNamespace.

> Flink pipeline with RequiresStableInput fails after Short.MAX_VALUE checkpoints.
> --------------------------------------------------------------------------------
>
>                 Key: BEAM-9794
>                 URL: https://issues.apache.org/jira/browse/BEAM-9794
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>    Affects Versions: 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 2.19.0, 2.20.0
>            Reporter: David Morávek
>            Assignee: David Morávek
>            Priority: Major
>
> Full original report: https://lists.apache.org/thread.html/rb2ebfad16d85bcf668978b3defd442feda0903c20db29c323497a672%40%3Cuser.beam.apache.org%3E
> The exception comes from: https://github.com/apache/flink/blob/release-1.8.0/flink-runtime/src/main/java/org/apache/flink/runtime/state/OperatorBackendSerializationProxy.java#L68
> In the Flink Runner code, each checkpoint results in a new OperatorState (or KeyedState if the stream is keyed):
> https://github.com/apache/beam/blob/v2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/stableinput/BufferingDoFnRunner.java#L91-L103
> https://github.com/apache/beam/blob/v2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/stableinput/BufferingDoFnRunner.java#L136-L143



--
This message was sent by Atlassian Jira
(v8.3.4#803005)