Posted to issues@flink.apache.org by "Kostas Kloudas (JIRA)" <ji...@apache.org> on 2019/01/17 10:39:00 UTC

[jira] [Comment Edited] (FLINK-11196) Extend S3 EntropyInjector to use key replacement (instead of key removal) when creating checkpoint metadata files

    [ https://issues.apache.org/jira/browse/FLINK-11196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16744866#comment-16744866 ] 

Kostas Kloudas edited comment on FLINK-11196 at 1/17/19 10:38 AM:
------------------------------------------------------------------

Hi [~markcho]. Thanks for bringing this issue up. I agree with you that being able to reference externalized checkpoints deterministically is an important feature to have.
My concern is that the EntropyInjection mechanism seems to be the wrong place to add it, as it essentially adds no extra entropy to the paths.

I understand that this is the "shortest path" in terms of code to be added, but it breaks the separation of concerns.

That said, let's see what [~StephanEwen] has to say, as he is the one who worked on the entropy injection mechanism.



was (Author: kkl0u):
Hi [~markcho]. Thanks for bringing this issue up. I agree with you that this is an important feature to have and it should be added.
My concern is that the EntropyInjection mechanism seems to be the wrong place to add it, as it essentially adds no extra entropy to the paths.

I understand that this is the "shortest path" in terms of code to be added, but it breaks the separation of concerns.

That said, let's see what [~StephanEwen] has to say, as he is the one who worked on the entropy injection mechanism.


> Extend S3 EntropyInjector to use key replacement (instead of key removal) when creating checkpoint metadata files
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-11196
>                 URL: https://issues.apache.org/jira/browse/FLINK-11196
>             Project: Flink
>          Issue Type: Improvement
>          Components: FileSystem
>    Affects Versions: 1.7.0
>            Reporter: Mark Cho
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We currently use S3 entropy injection when writing out checkpoint data.
> We also use external checkpoints so that we can resume from a checkpoint metadata file later.
> The current implementation of the S3 entropy injector makes it difficult to locate the checkpoint metadata files. In newer versions of Flink, the `state.checkpoints.dir` configuration controls where both the metadata and the state files are written, instead of there being two separate paths (one for metadata, one for state files).
> With entropy injection, we replace the entropy marker in the path specified by `state.checkpoints.dir` with entropy (for state files) or we strip out the marker (for metadata files).
>  
> We need to extend the entropy injection so that we can replace the entropy marker with a predictable path (instead of removing it) so that we can do a prefix query for just the metadata files.
> If the entropy key replacement is not set (it defaults to the empty string), the behavior stays the same as today: the entropy marker is removed.
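
To make the requested behavior concrete, here is a minimal, self-contained Java sketch of the path handling described in the issue. It is not Flink's EntropyInjector code: the class name, the method names, the `_entropy_` marker, the `metadata` replacement value, and the bucket layout are all hypothetical and chosen only for illustration. State-file paths get the marker swapped for random characters, while metadata paths get it swapped for a fixed replacement; an empty replacement drops the marker (and the slash after it), which is assumed here to mirror the current behavior.

```java
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch only; names and behavior are assumptions, not Flink's API.
final class EntropyPathSketch {

    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";

    // State files: replace the marker with `length` random characters.
    static String forStateFile(String path, String marker, int length) {
        int idx = path.indexOf(marker);
        if (idx < 0) {
            return path; // no marker present, nothing to inject
        }
        StringBuilder entropy = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            int pick = ThreadLocalRandom.current().nextInt(ALPHABET.length());
            entropy.append(ALPHABET.charAt(pick));
        }
        return path.substring(0, idx) + entropy + path.substring(idx + marker.length());
    }

    // Metadata files: replace the marker with a fixed, predictable string.
    // An empty replacement removes the marker and the slash that follows it.
    static String forMetadataFile(String path, String marker, String replacement) {
        int idx = path.indexOf(marker);
        if (idx < 0) {
            return path;
        }
        int end = idx + marker.length();
        if (replacement.isEmpty() && end < path.length() && path.charAt(end) == '/') {
            end++; // avoid leaving a double slash behind
        }
        return path.substring(0, idx) + replacement + path.substring(end);
    }

    public static void main(String[] args) {
        String dir = "s3://bucket/_entropy_/checkpoints/job-1/chk-42"; // hypothetical path
        System.out.println(forStateFile(dir, "_entropy_", 4));
        System.out.println(forMetadataFile(dir, "_entropy_", ""));
        System.out.println(forMetadataFile(dir, "_entropy_", "metadata"));
    }
}
```

With a fixed replacement such as `metadata`, all metadata files in this sketch share the prefix `s3://bucket/metadata/`, so a single prefix listing would find them, whereas the state files remain spread across random prefixes.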


