Posted to common-issues@hadoop.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2017/12/21 15:21:00 UTC

[jira] [Commented] (HADOOP-13282) S3 blob etags to be made visible in S3A status/getFileChecksum() calls

    [ https://issues.apache.org/jira/browse/HADOOP-13282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16300160#comment-16300160 ] 

Hudson commented on HADOOP-13282:
---------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13416 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13416/])
HADOOP-13282. S3 blob etags to be made visible in S3A (stevel: rev c8ff0cc304f07bf793192291e0611b2fb4bcc4e3)
* (add) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/store/EtagChecksum.java
* (add) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/store/package-info.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
* (edit) hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AMiscOperations.java
* (add) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/store/TestEtagChecksum.java


> S3 blob etags to be made visible in S3A status/getFileChecksum() calls
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-13282
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13282
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.9.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
>             Fix For: 3.1.0
>
>         Attachments: HADOOP-13282-001.patch, HADOOP-13282-002.patch, HADOOP-13282-003.patch, HADOOP-13282-004.patch
>
>
> If the etags of blobs were exported via {{getFileChecksum()}}, it would be possible to check whether a blob is in sync with a local file. Distcp could use this to decide whether or not to skip a file.
> There is a problem, though: distcp needs the source and destination filesystems to implement the same checksum algorithm, so this would only work out of the box when copying between S3 instances. There are also quirks with encryption and multipart uploads: [s3 docs|http://docs.aws.amazon.com/AmazonS3/latest/API/RESTCommonResponseHeaders.html]. At the very least, the etag is something which could be used when indexing the FS, to check for changes later.
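
The two checks described above can be sketched in plain Java. This is an illustration only, not the code from the patch: the class {{EtagProbe}}, its methods, and the algorithm labels passed in are hypothetical names chosen for the example. It assumes that the etag of an unencrypted, single-PUT object is the hex MD5 of its content, and that anything else (for example a multipart etag carrying a "-N" suffix) cannot be verified locally. A mismatch on an object encrypted with SSE-KMS or SSE-C would not prove that the data changed, because those etags are not MD5 digests.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical helper for illustration; not part of Hadoop or the AWS SDK.
class EtagProbe {

    /** Outcome of comparing local data with a remote etag. */
    enum Sync { IN_SYNC, MISMATCH, UNKNOWN }

    /** Hex-encoded MD5 digest of the given bytes. */
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Compare local bytes with an etag, assuming a plain 32-hex-digit etag
     * is the MD5 of the content. Etags of any other shape (such as the
     * multipart "-N" form) are reported as UNKNOWN rather than guessed at.
     */
    static Sync compareToEtag(byte[] localData, String etag) {
        if (etag == null || etag.isEmpty()) {
            return Sync.UNKNOWN;
        }
        // Etag headers are usually quoted; strip the quotes before comparing.
        String e = etag.replace("\"", "").toLowerCase(Locale.ROOT);
        if (!e.matches("[0-9a-f]{32}")) {
            return Sync.UNKNOWN;
        }
        return e.equals(md5Hex(localData)) ? Sync.IN_SYNC : Sync.MISMATCH;
    }

    /**
     * Distcp-style decision: two checksums are only comparable when both
     * filesystems report the same algorithm name. If either side is missing
     * or the algorithms differ, the copy cannot be skipped on this basis.
     */
    static boolean canSkipCopy(String srcAlgorithm, String srcChecksum,
                               String dstAlgorithm, String dstChecksum) {
        if (srcAlgorithm == null || dstAlgorithm == null
                || srcChecksum == null || dstChecksum == null) {
            return false;
        }
        return srcAlgorithm.equals(dstAlgorithm) && srcChecksum.equals(dstChecksum);
    }

    public static void main(String[] args) {
        byte[] local = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(compareToEtag(local, "\"5d41402abc4b2a76b9719d911017c592\""));
        System.out.println(compareToEtag(local, "9b2cf535f27731c974343645a3985328-3"));
        System.out.println(canSkipCopy("etag", "abc", "etag", "abc"));
    }
}
```

Returning {{UNKNOWN}} instead of a mismatch for multipart-style etags keeps the probe conservative: a caller indexing the FS can still store the raw etag string and compare it with a later value to detect changes, without assuming anything about how it was computed.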



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
