Posted to issues@flink.apache.org by "Aljoscha Krettek (JIRA)" <ji...@apache.org> on 2017/05/11 08:43:04 UTC

[jira] [Updated] (FLINK-6533) Duplicated registration of new shared state when checkpoint confirmations are still pending

     [ https://issues.apache.org/jira/browse/FLINK-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aljoscha Krettek updated FLINK-6533:
------------------------------------
    Fix Version/s: 1.3.0

> Duplicated registration of new shared state when checkpoint confirmations are still pending
> -------------------------------------------------------------------------------------------
>
>                 Key: FLINK-6533
>                 URL: https://issues.apache.org/jira/browse/FLINK-6533
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.3.0
>            Reporter: Stefan Richter
>            Assignee: Stefan Richter
>            Priority: Blocker
>             Fix For: 1.3.0
>
>
> Each incremental RocksDB checkpoint n registers new and existing shared state with the {{SharedStateRegistry}} when it completes. Only then is the backend notified, so that all following checkpoints (n+x) can reference the new state in checkpoint n.
> However, when checkpoint n+1 starts before n has been confirmed to the backend, n+1 can treat some files as new even though they were already contained in n. It then uploads those files to DFS again, creating new state handles.
> Then, once n+1 completes, it can try to register some state as new that n already registered, without n+1 knowing about it. Currently this violates a precondition check that the reference count for state assumed to be new is 1.
> While we cannot prevent duplicate uploads, we must resolve this situation in the {{SharedStateRegistry}}.
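
One possible way to resolve the duplicate registration described above is sketched below. This is a minimal illustration only, not Flink's actual SharedStateRegistry: the class SharedStateRegistrySketch, the Handle type, the register and refCount methods, and the discarded list are all hypothetical names invented for this example. The assumption is that the registry keeps the first handle registered under a key, counts every later registration as one more reference, and marks a differing handle for the same key as a duplicate upload to be dropped, instead of failing a "reference count is 1" precondition.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; NOT Flink's real SharedStateRegistry. */
final class SharedStateRegistrySketch {

    /** Stand-in for a state handle that points at one uploaded file. */
    static final class Handle {
        final String dfsPath;

        Handle(String dfsPath) {
            this.dfsPath = dfsPath;
        }
    }

    /** A registered handle together with its reference count. */
    private static final class Entry {
        final Handle handle;
        int refCount = 1;

        Entry(Handle handle) {
            this.handle = handle;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    /** Paths of duplicate uploads that were dropped; kept only for illustration. */
    final List<String> discarded = new ArrayList<>();

    /**
     * Registers one reference to the shared state stored under {@code key} and
     * returns the handle the calling checkpoint should keep using.
     */
    synchronized Handle register(String key, Handle candidate) {
        Entry existing = entries.get(key);
        if (existing == null) {
            // First registration: the state really is new.
            entries.put(key, new Entry(candidate));
            return candidate;
        }
        // The key is already known (for example, registered by checkpoint n):
        // count the reference instead of asserting that the state is new.
        existing.refCount++;
        if (existing.handle != candidate) {
            // Checkpoint n+1 uploaded the same file again; drop its copy.
            discarded.add(candidate.dfsPath);
        }
        return existing.handle;
    }

    /** Returns the current reference count for {@code key}, or 0 if unknown. */
    synchronized int refCount(String key) {
        Entry entry = entries.get(key);
        return entry == null ? 0 : entry.refCount;
    }
}
```

In this sketch, if checkpoint n and checkpoint n+1 both register the same key with different handles, the second call returns the handle from n, the reference count becomes 2, and the path uploaded by n+1 ends up in the discarded list. Whether the duplicate is deleted immediately or later is left open here.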



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)