Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/02/24 11:56:44 UTC

[jira] [Commented] (FLINK-5897) Untie Checkpoint Externalization from FileSystems

    [ https://issues.apache.org/jira/browse/FLINK-5897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882526#comment-15882526 ] 

ASF GitHub Bot commented on FLINK-5897:
---------------------------------------

GitHub user StephanEwen opened a pull request:

    https://github.com/apache/flink/pull/3411

    [FLINK-5897] & [FLINK-5822] First step towards Generic State Backends and Global State Cleanup Hooks

    **This is the first part of a larger parent issue: Self-contained externalized checkpoints and global cleanup hooks.**
    
    Parts of the changes may seem incomplete because they prepare for later changes. To keep the pull requests small, this first part is stable by itself and compatible with the prior behavior.
    
    ## High-level changes
    
      1. The Checkpoint Coordinator knows about the base state backend. That is the first step towards generic storage of checkpoints (not file system specific) and global cleanup hooks (rather than tracking for example each file cleanup individually).
      
      2. The CompletedCheckpoint no longer assumes that it is stored on a FileSystem. It now holds a `StreamStateHandle` to its metadata (if externalized) and a generic external pointer. In the case of a checkpoint on a FileSystem, this pointer is the file path.
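
    To illustrate the second point, here is a minimal sketch of a checkpoint record that makes no FileSystem assumption. It is not the code in this pull request: `MetadataHandleSketch` is a hypothetical stand-in for `StreamStateHandle`, `InMemoryMetadataHandle` and `CompletedCheckpointSketch` are invented for illustration, and the rule that handle and pointer are always set together is an assumption.

    ```java
    import java.io.ByteArrayInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.util.Optional;

    // Hypothetical stand-in for StreamStateHandle: reopens previously written
    // bytes, wherever they happen to be stored.
    interface MetadataHandleSketch {
        InputStream openInputStream() throws IOException;
    }

    // Hypothetical handle backed by a byte array, used in place of a real store.
    final class InMemoryMetadataHandle implements MetadataHandleSketch {
        private final byte[] data;

        InMemoryMetadataHandle(byte[] data) {
            this.data = data.clone();
        }

        @Override
        public InputStream openInputStream() {
            return new ByteArrayInputStream(data);
        }
    }

    // Simplified model of a completed checkpoint without a FileSystem dependency.
    final class CompletedCheckpointSketch {
        private final long checkpointId;
        private final MetadataHandleSketch metadataHandle; // null if not externalized
        private final String externalPointer;              // null if not externalized

        CompletedCheckpointSketch(
                long checkpointId, MetadataHandleSketch metadataHandle, String externalPointer) {
            // Assumption of this sketch: both fields are set together or not at all.
            if ((metadataHandle == null) != (externalPointer == null)) {
                throw new IllegalArgumentException(
                        "metadata handle and external pointer must be set together");
            }
            this.checkpointId = checkpointId;
            this.metadataHandle = metadataHandle;
            this.externalPointer = externalPointer;
        }

        long getCheckpointId() {
            return checkpointId;
        }

        boolean isExternalized() {
            return externalPointer != null;
        }

        Optional<MetadataHandleSketch> getMetadataHandle() {
            return Optional.ofNullable(metadataHandle);
        }

        // For a FileSystem-based checkpoint this would be the file path; other
        // stores could use a key or a URL instead.
        Optional<String> getExternalPointer() {
            return Optional.ofNullable(externalPointer);
        }
    }
    ```

    A FileSystem-based checkpoint would pass its metadata file path as the pointer, while a checkpoint that is not externalized would pass `null` for both the handle and the pointer.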
    
    
    
    ## Detailed Changes
    
      - This moves the logic to load a state backend from the configuration out of the `StreamTask` and into the `AbstractStateBackend`, because both JobManager and TaskManager now share the same logic.
      
      - Adds tests for the loading behavior of state backends
      
      - Improves the Exception signatures of state backend loading
      
      - Allows CompletedCheckpointStores to specify whether they require externalized checkpoints.
        That is important for the next step, where the ZooKeeper store only stores pointers and does not externalize the metadata an additional time.
    
      - `CompletedCheckpoint` holds a pointer to its metadata
      
      - CheckpointCoordinator externalizes the checkpoint explicitly (rather than the pending checkpoint doing it implicitly).
      
      - More comments and JavaDocs
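
    As a rough illustration of the shared loading logic and the more explicit exception signatures described above, here is a minimal sketch. It is not the code in this pull request: `BackendSketch`, `BackendLoadingException`, and `BackendLoaderSketch` are hypothetical names, and the config key and the shortcut names `jobmanager` and `filesystem` are assumptions made for the example.

    ```java
    import java.util.Locale;
    import java.util.Map;

    // Hypothetical stand-in for a state backend.
    interface BackendSketch {
        String describe();
    }

    // Hypothetical checked exception, so callers must handle loading failures.
    final class BackendLoadingException extends Exception {
        BackendLoadingException(String message) {
            super(message);
        }
    }

    // One loader that both the JobManager side and the TaskManager side could call.
    final class BackendLoaderSketch {
        // Assumed config key, for illustration only.
        static final String BACKEND_KEY = "state.backend";

        private BackendLoaderSketch() {}

        static BackendSketch loadFromConfig(
                Map<String, String> config, BackendSketch defaultBackend)
                throws BackendLoadingException {
            String name = config.get(BACKEND_KEY);
            if (name == null) {
                // Nothing configured: fall back to the caller's default.
                return defaultBackend;
            }
            switch (name.toLowerCase(Locale.ROOT)) {
                case "jobmanager":
                    return () -> "memory";
                case "filesystem":
                    return () -> "filesystem";
                default:
                    // Fail with a specific exception instead of a generic one.
                    throw new BackendLoadingException(
                            "Unknown state backend '" + name + "' under key " + BACKEND_KEY);
            }
        }
    }
    ```

    Keeping the lookup in one place means a misconfigured backend fails the same way wherever it is loaded, and the checked exception makes that failure part of the method signature.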
      
    ## Tests
    
    Most of the changes only make existing parts more generic and need no additional tests.
    
    Additional tests were added for passing the state backend from the program to the checkpoint coordinator, and for loading the state backend from the configuration.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StephanEwen/incubator-flink statebackend

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3411.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3411
    
----
commit 5689aba6f2ae0363e7e36ef5d920fdae88d5b5cc
Author: Stephan Ewen <se...@apache.org>
Date:   2017-02-17T16:51:00Z

    [FLINK-5822] [state backends] Make JobManager / Checkpoint Coordinator aware of the root state backend

commit 16c1f5afaabd0cff58afe5086ae0aabc82441072
Author: Stephan Ewen <se...@apache.org>
Date:   2017-02-22T21:18:50Z

    [FLINK-5897] [checkpoints] Make checkpoint externalization not depend strictly on FileSystems
    
    That is the first step towards checkpoints that can be externalized to other stores as well,
    like k/v stores and databases, if supported by the state backend.

----


> Untie Checkpoint Externalization from FileSystems
> -------------------------------------------------
>
>                 Key: FLINK-5897
>                 URL: https://issues.apache.org/jira/browse/FLINK-5897
>             Project: Flink
>          Issue Type: Sub-task
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.2.0
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>             Fix For: 1.3.0
>
>
> Currently, externalizing checkpoint metadata and storing savepoints depends strictly on FileSystems.
> Since state backends are more general, storing and cleaning up checkpoints with state backend hooks requires untying savepoints and externalized checkpoints from FileSystems.


