Posted to issues@flink.apache.org by "Stephan Ewen (Jira)" <ji...@apache.org> on 2020/05/13 15:32:00 UTC

[jira] [Commented] (FLINK-17583) Allow option to store a savepoint's _metadata file separate from its data files

    [ https://issues.apache.org/jira/browse/FLINK-17583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106402#comment-17106402 ] 

Stephan Ewen commented on FLINK-17583:
--------------------------------------

Thank you for digging into this and the thorough suggestion.

I need to think a bit about this - it also has some tricky implications for another ongoing effort to make savepoint (and non-incremental checkpoint) paths relative so that one can copy them around: [FLINK-5763]. The current design for that issue assumes that all "exclusive" data is under the same parent path and can thus be addressed relative to the metadata location.


> Allow option to store a savepoint's _metadata file separate from its data files
> -------------------------------------------------------------------------------
>
>                 Key: FLINK-17583
>                 URL: https://issues.apache.org/jira/browse/FLINK-17583
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.9.1
>            Reporter: Steve Bairos
>            Priority: Minor
>
> (In the description I mainly talk about savepoints, but the plan would apply to checkpoints as well.)
> We have a deployment framework that often needs to be able to return a list of valid savepoints in S3 with a certain prefix. Our assertion is that if an S3 object ends with '_metadata', then it is a valid savepoint. In order to generate the list of valid savepoints, we need to locate all of the _metadata files that start with a given prefix.
> For example, if our S3 bucket's paths look like this:
>  
> {code:java}
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-1a2b3c4d5e/_metadata
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-1a2b3c4d5e/9c165546-c326-43c0-9f47-f9a2cfd000ed
> ... thousands of other savepoint data files
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-1a2b3c4d5e/9c757e5b-92b7-47b8-bfe8-cfe70eb28702
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-9999999999/_metadata
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-9999999999/41297fd5-40df-4683-bfb6-534bfddae92a
> ... thousands of other savepoint data files
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-9999999999/acbe839a-1ec7-4b41-9d87-595d557c2ac6
> s3://bucket/savepoints/my-job1/2020-04-02/savepoint-987654-1100110011/_metadata
> s3://bucket/savepoints/my-job1/2020-04-02/savepoint-987654-1100110011/2d2f5551-56a7-4fea-b25b-b0156660c650
> .... thousands of other savepoint data files
> s3://bucket/savepoints/my-job1/2020-04-02/savepoint-987654-1100110011/c8c410df-5fb0-46a0-84c5-43e1575e8dc5
> ... dozens of other savepoint dirs
> {code}
>  
> In order to get a list of all savepoints that my-job1 could possibly start from, we would want to get all the savepoints that start with the prefix:
> {code:java}
> s3://bucket/savepoints/my-job1 {code}
> Ideally, we would want to have the ability to get a list like this from S3:
> {code:java}
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-1a2b3c4d5e/_metadata
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-9999999999/_metadata
> s3://bucket/savepoints/my-job1/2020-04-02/savepoint-987654-1100110011/_metadata{code}
> Unfortunately there is no easy way to get this list, because S3's API only allows searching by prefix, not by suffix. Listing all objects with the prefix 's3://bucket/savepoints/my-job1' and then filtering the list down to the files that end in _metadata will also not work in practice, because there are thousands of savepoint data files under the same prefix, such as:
> {code:java}
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-1a2b3c4d5e/9c165546-c326-43c0-9f47-f9a2cfd000ed
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-1a2b3c4d5e/9c757e5b-92b7-47b8-bfe8-cfe70eb28702
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-9999999999/acbe839a-1ec7-4b41-9d87-595d557c2ac6
> etc.{code}
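To make the cost concrete, here is a minimal sketch (illustrative only, not Flink or AWS SDK code) of what the prefix-then-filter workaround amounts to: every key under the prefix, data files included, has to be fetched before the client-side suffix filter can run.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: with a prefix-only listing API, finding savepoints
// means retrieving every key under the prefix (thousands of data files
// per savepoint) and filtering client-side for the "_metadata" suffix.
public class MetadataFilter {
    static List<String> filterMetadata(List<String> allKeysUnderPrefix) {
        return allKeysUnderPrefix.stream()
                .filter(key -> key.endsWith("/_metadata"))
                .collect(Collectors.toList());
    }
}
```

The filter itself is trivial; the problem is that `allKeysUnderPrefix` must first be paged in from S3 in full, which is what the proposal below avoids.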
>  
> I propose that we add a configuration in a similar vein to the S3 entropy injector which allows us to store the _metadata file in a separate path from the savepoint's data files. For example, with this hypothetical configuration:
> {code:java}
> state.checkpoints.split.key: _datasplit_
> state.checkpoints.split.metadata.dir: metadata
> state.checkpoints.split.data.dir: data{code}
> When a user triggers a savepoint with the path
> {code:java}
> s3://bucket/savepoints/_datasplit_/my-job1/2020-05-07/ {code}
> The resulting savepoint that is created looks like:
> {code:java}
> s3://bucket/savepoints/metadata/my-job1/2020-05-07/savepoint-654321-abcdef9876/_metadata
> s3://bucket/savepoints/data/my-job1/2020-05-07/savepoint-654321-abcdef9876/a50fc483-3581-4b55-a37e-b7c61b3ee47f
> s3://bucket/savepoints/data/my-job1/2020-05-07/savepoint-654321-abcdef9876/b0c6b7c0-6b94-43ae-8678-2f7640af1523
> s3://bucket/savepoints/data/my-job1/2020-05-07/savepoint-654321-abcdef9876/c1855b35-c0b7-4347-9352-88423998e5ec{code}
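The path rewrite this hypothetical configuration implies can be sketched as a simple split-key substitution (class and method names below are illustrative, not actual Flink APIs; the split key and directory names come from the proposed, not yet existing, config above):

```java
// Sketch of the proposed split-key substitution: the configured key
// ("_datasplit_" here) in the user-supplied savepoint path is replaced
// by the metadata directory or the data directory before any file is
// written, so the two kinds of files land under disjoint prefixes.
public class SplitKeyPaths {
    static final String SPLIT_KEY = "_datasplit_";
    static final String METADATA_DIR = "metadata";
    static final String DATA_DIR = "data";

    // Base path under which the _metadata file is written.
    static String metadataPath(String triggerPath) {
        return triggerPath.replace(SPLIT_KEY, METADATA_DIR);
    }

    // Base path under which the savepoint's data files are written.
    static String dataPath(String triggerPath) {
        return triggerPath.replace(SPLIT_KEY, DATA_DIR);
    }
}
```

For the trigger path `s3://bucket/savepoints/_datasplit_/my-job1/2020-05-07/` this yields the two disjoint prefixes shown above, so a prefix listing under `metadata/` returns only _metadata files.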
> Notice that the metadata's prefix is 
> {code:java}
>  s3://bucket/savepoints/metadata/my-job1/2020-05-07/{code}
> and the data files' prefix is
> {code:java}
>  s3://bucket/savepoints/data/my-job1/2020-05-07/{code}
> That way, if I want to list all the savepoints for my-job1, I can just list all the objects under the metadata prefix:
> {code:java}
>  aws s3 ls --recursive s3://bucket/savepoints/metadata/my-job1/{code}
> And I can get a clean list of just the _metadata files easily.
>  
> One alternative we've considered is entropy injection. It does technically separate the _metadata file from the rest of the data as well, but it makes a mess of entropy dirs in S3, so it's not our ideal choice.
>  
> I'm happy to take a shot at implementing the solution I suggested if it is an acceptable solution for Flink. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)