Posted to issues@flink.apache.org by "Tao Yang (JIRA)" <ji...@apache.org> on 2019/04/02 12:05:00 UTC

[jira] [Commented] (FLINK-8801) S3's eventual consistent read-after-write may fail yarn deployment of resources to S3

    [ https://issues.apache.org/jira/browse/FLINK-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807691#comment-16807691 ] 

Tao Yang commented on FLINK-8801:
---------------------------------

Hi, [~NicoK]. 

I have a doubt about this PR. It uses the modification time of the local file as the timestamp of the local resource in YARN, and then sets that value as the modification time of the remote file via the FileSystem#setTimes interface. However, most file systems, including s3/s3a/s3n/azure/aliyun-oss, do not seem to implement FileSystem#setTimes, so the timestamp of the local resource in YARN and the modification time of the remote file may be inconsistent, which can cause problems.
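
To make the concern concrete, here is a minimal, self-contained sketch. It does not use the Hadoop or Flink APIs: FakeRemoteStore and uploadAndCheck are hypothetical names introduced only for illustration, and the assumption that a store silently ignores setTimes is the behaviour I suspect, not something verified against each file system.

```java
import java.util.HashMap;
import java.util.Map;

public class TimestampMismatchSketch {

    /** Hypothetical in-memory stand-in for a remote file system (not a Hadoop or Flink class). */
    static final class FakeRemoteStore {
        private final Map<String, Long> modificationTimes = new HashMap<>();
        private final boolean honorsSetTimes;
        private final long storeClockMillis;

        FakeRemoteStore(boolean honorsSetTimes, long storeClockMillis) {
            this.honorsSetTimes = honorsSetTimes;
            this.storeClockMillis = storeClockMillis;
        }

        /** On upload, the store stamps the object with its own clock. */
        void upload(String key) {
            modificationTimes.put(key, storeClockMillis);
        }

        /** Assumption: a store without setTimes support silently does nothing. */
        void setTimes(String key, long modificationTimeMillis) {
            if (honorsSetTimes) {
                modificationTimes.put(key, modificationTimeMillis);
            }
        }

        long getModificationTime(String key) {
            return modificationTimes.get(key);
        }
    }

    /**
     * Uploads a resource, tries to push the local modification time to the
     * remote copy, and reports whether the timestamp registered from the
     * local file equals what the remote store reports afterwards.
     */
    static boolean uploadAndCheck(FakeRemoteStore store, String key, long localModificationTimeMillis) {
        store.upload(key);
        store.setTimes(key, localModificationTimeMillis);
        return store.getModificationTime(key) == localModificationTimeMillis;
    }

    public static void main(String[] args) {
        long localModificationTime = 1_000L;
        long uploadTime = 5_000L;
        // Store that honors setTimes: timestamps agree.
        System.out.println(uploadAndCheck(new FakeRemoteStore(true, uploadTime), "flink.jar", localModificationTime));
        // Store that ignores setTimes: timestamps differ.
        System.out.println(uploadAndCheck(new FakeRemoteStore(false, uploadTime), "flink.jar", localModificationTime));
    }
}
```

With a store that honors setTimes the two timestamps agree; with one that ignores it, the remote modification time stays at the upload time and no longer matches the value registered from the local file.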

Could you please help clarify this? Thanks!

> S3's eventual consistent read-after-write may fail yarn deployment of resources to S3
> -------------------------------------------------------------------------------------
>
>                 Key: FLINK-8801
>                 URL: https://issues.apache.org/jira/browse/FLINK-8801
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / YARN, FileSystems, Runtime / Coordination
>    Affects Versions: 1.4.0, 1.5.0
>            Reporter: Nico Kruber
>            Assignee: Nico Kruber
>            Priority: Blocker
>             Fix For: 1.4.3, 1.5.0
>
>
> According to https://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html#ConsistencyModel:
> {quote}
> Amazon S3 provides read-after-write consistency for PUTS of new objects in your S3 bucket in all regions with one caveat. The caveat is that if you make a HEAD or GET request to the key name (to find if the object exists) before creating the object, Amazon S3 provides eventual consistency for read-after-write.
> {quote}
> Some S3 file system implementations may actually execute such a request for the about-to-be-written object, and thus the read-after-write is only eventually consistent. {{org.apache.flink.yarn.Utils#setupLocalResource()}} currently relies on a consistent read-after-write since it accesses the remote resource to get the file size and modification timestamp. Since we have access to the local resource there, we can use its data instead and circumvent the problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)