Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2017/12/12 12:33:00 UTC

[jira] [Commented] (HADOOP-13887) Encrypt S3A data client-side with AWS SDK

    [ https://issues.apache.org/jira/browse/HADOOP-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16287534#comment-16287534 ] 

Steve Loughran commented on HADOOP-13887:
-----------------------------------------

I've been thinking about this.

# Once a file is opened, its length is known: the initial getFileStatus() returns it, and it comes back in a header of the GET response.
# Many uses of a file don't need to know its full length until it's open. Specifically, when your code does an open(); seek(EOF-len(footer)); you don't need to know the EOF in advance. Partitioning does, though there a small difference in the length of the last partition is *probably* tractable.

In C you can open a file, seek to an offset relative to EOF (fseek() with SEEK_END), and, if you want to know the file length, call ftell() once you are there.

# We could add an interface + method + streamCapabilities() option to return the length of an open file, e.g. {{public abstract long size() throws IOException;}}. (You can get this from a raw local stream, BTW.)
# Then code could be moved to using it, starting with the internal classes.
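
A rough sketch of what the caller side might look like, using only JDK classes rather than the real Hadoop stream types. Only the size() signature comes from the proposal above; StreamLength, InMemorySeekableStream and readFooter are hypothetical names invented for illustration, and the in-memory stream is just a stand-in for an opened file.

```java
import java.io.EOFException;
import java.io.IOException;

/** Hypothetical interface (an assumption, not an existing Hadoop API). */
interface StreamLength {
    /** Length in bytes of the already-open stream. */
    long size() throws IOException;
}

/** Toy seekable stream over a byte array, standing in for an opened file. */
final class InMemorySeekableStream implements StreamLength {
    private final byte[] data;
    private int pos;

    InMemorySeekableStream(byte[] data) {
        this.data = data.clone();
    }

    @Override
    public long size() {
        return data.length;
    }

    void seek(long newPos) throws IOException {
        if (newPos < 0 || newPos > data.length) {
            throw new IOException("seek out of range: " + newPos);
        }
        pos = (int) newPos;
    }

    /** Reads up to len bytes; returns -1 at end of stream. */
    int read(byte[] buf, int off, int len) {
        if (pos >= data.length) {
            return -1;
        }
        int n = Math.min(len, data.length - pos);
        System.arraycopy(data, pos, buf, off, n);
        pos += n;
        return n;
    }
}

final class FooterReadSketch {
    /**
     * open(); seek(EOF - len(footer)); read(): the length is asked of the
     * open stream itself, not obtained from a status call made beforehand.
     */
    static byte[] readFooter(InMemorySeekableStream in, int footerLen) throws IOException {
        long start = in.size() - footerLen;
        if (start < 0) {
            throw new IOException("file is shorter than the footer");
        }
        in.seek(start);
        byte[] footer = new byte[footerLen];
        int off = 0;
        while (off < footerLen) {
            int n = in.read(footer, off, footerLen - off);
            if (n < 0) {
                throw new EOFException("stream ended before the footer was read");
            }
            off += n;
        }
        return footer;
    }
}
```

With something like this, a footer reader never needs the length before the open() call, so whatever length the open stream reports is the one used for the seek.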

> Encrypt S3A data client-side with AWS SDK
> -----------------------------------------
>
>                 Key: HADOOP-13887
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13887
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Jeeyoung Kim
>            Assignee: Igor Mazur
>            Priority: Minor
>         Attachments: HADOOP-13887-002.patch, HADOOP-13887-007.patch, HADOOP-13887-branch-2-003.patch, HADOOP-13897-branch-2-004.patch, HADOOP-13897-branch-2-005.patch, HADOOP-13897-branch-2-006.patch, HADOOP-13897-branch-2-008.patch, HADOOP-13897-branch-2-009.patch, HADOOP-13897-branch-2-010.patch, HADOOP-13897-branch-2-012.patch, HADOOP-13897-branch-2-014.patch, HADOOP-13897-trunk-011.patch, HADOOP-13897-trunk-013.patch, HADOOP-14171-001.patch, S3-CSE Proposal.pdf
>
>
> Expose the client-side encryption option documented in Amazon S3 documentation  - http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingClientSideEncryption.html
> Currently this is not exposed in Hadoop but it is exposed as an option in AWS Java SDK, which Hadoop currently includes. It should be trivial to propagate this as a parameter passed to the S3client used in S3AFileSystem.java


