Posted to issues@flink.apache.org by "Ufuk Celebi (JIRA)" <ji...@apache.org> on 2016/08/08 08:15:21 UTC

[jira] [Commented] (FLINK-4323) Checkpoint Coordinator Removes HA Checkpoints in Shutdown

    [ https://issues.apache.org/jira/browse/FLINK-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411460#comment-15411460 ] 

Ufuk Celebi commented on FLINK-4323:
------------------------------------

Should we still remove the shutdown hook? The shutdown hook would only hide problems with the {{ExecutionGraph}} cleanup. The {{CheckpointCoordinator}} should stay independent of the {{RecoveryMode}}.

> Checkpoint Coordinator Removes HA Checkpoints in Shutdown
> ---------------------------------------------------------
>
>                 Key: FLINK-4323
>                 URL: https://issues.apache.org/jira/browse/FLINK-4323
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.1.0
>            Reporter: Stephan Ewen
>            Priority: Blocker
>             Fix For: 1.2.0, 1.1.1
>
>
> The {{CheckpointCoordinator}} has a shutdown hook that "shuts down" the savepoint store, rather than suspending it.
> As a consequence, HA checkpoints may be lost when the JobManager process fails but allows the shutdown hook to run.
> I would suggest removing the shutdown hook from the {{CheckpointCoordinator}} altogether. The JobManager process is responsible for cleanup and can better decide what should be cleaned up and what should not.
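
The failure mode described in the issue can be sketched roughly as follows. This is only an illustration under stated assumptions: {{CheckpointStore}}, its {{suspend()}} and {{shutdown()}} methods, and {{checkpointsLeftAfterFailure}} are hypothetical stand-ins, not Flink's actual classes, and the hook is invoked directly rather than registered with the JVM so that the example stays deterministic.

```java
import java.util.ArrayList;
import java.util.List;

public class ShutdownHookSketch {

    // Hypothetical stand-in for an HA checkpoint store (not Flink's real interface).
    static final class CheckpointStore {
        private final List<Long> persisted = new ArrayList<>();

        void add(long checkpointId) {
            persisted.add(checkpointId);
        }

        // Releases local handles only; persisted checkpoints survive for recovery.
        void suspend() {
            // intentionally leaves 'persisted' untouched
        }

        // Discards all checkpoints, including those a recovering JobManager would need.
        void shutdown() {
            persisted.clear();
        }

        int count() {
            return persisted.size();
        }
    }

    // Simulates a failing process whose shutdown hook (if any) still gets to run.
    // In a real JVM the hook would be registered with
    // Runtime.getRuntime().addShutdownHook(new Thread(hook)).
    static int checkpointsLeftAfterFailure(Runnable hook, CheckpointStore store) {
        store.add(1L);
        store.add(2L);
        if (hook != null) {
            hook.run();
        }
        return store.count();
    }

    public static void main(String[] args) {
        CheckpointStore withShutdownHook = new CheckpointStore();
        System.out.println(
            checkpointsLeftAfterFailure(withShutdownHook::shutdown, withShutdownHook)); // prints 0

        CheckpointStore withSuspendHook = new CheckpointStore();
        System.out.println(
            checkpointsLeftAfterFailure(withSuspendHook::suspend, withSuspendHook)); // prints 2

        CheckpointStore withoutHook = new CheckpointStore();
        System.out.println(checkpointsLeftAfterFailure(null, withoutHook)); // prints 2
    }
}
```

In this sketch, a hook that calls {{shutdown()}} leaves nothing to recover from, while a hook that only suspends, or no hook at all, leaves the persisted checkpoints in place for whichever component owns the cleanup decision.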



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)