Posted to common-issues@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2021/08/19 20:56:00 UTC

[jira] [Commented] (HADOOP-17855) S3A: Allow SSE configurations per object path

    [ https://issues.apache.org/jira/browse/HADOOP-17855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17401858#comment-17401858 ] 

Steve Loughran commented on HADOOP-17855:
-----------------------------------------

Pretty reluctant to do this, at least on my personal development schedule.

* When we do things with directories, we often create markers in parent dirs. This complicates life, as we'd have to choose which encryption settings to use there too.
* S3A delegation tokens pass down all encryption settings, so that you can submit work into a shared cluster where all encryption options, including your secrets, come with the job. This would need to be extended.
* All the usual stuff related to hierarchical references, duplicate or conflicting entries, et cetera.
* Would you support different SSE options (SSE-C vs SSE-KMS)? SSE-KMS is the only sensible option, really.
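
To make the hierarchy and conflict points concrete, here is a minimal sketch of one way a path-to-key lookup could behave: longest matching prefix wins, and a prefix registered twice with different keys is rejected. The class name, method names, and key strings are all hypothetical; nothing here is existing S3A code or a proposed API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch (not S3A code): pick a per-path SSE-KMS key by longest matching prefix. */
class SsePathKeyResolver {
  private final Map<String, String> keysByPrefix = new LinkedHashMap<>();

  /** Register a prefix; reject the same prefix mapped to a different key. */
  void add(String prefix, String keyArn) {
    String normalized = normalize(prefix);
    String existing = keysByPrefix.putIfAbsent(normalized, keyArn);
    if (existing != null && !existing.equals(keyArn)) {
      throw new IllegalArgumentException("Conflicting keys for " + normalized);
    }
  }

  /** Return the key of the longest registered prefix covering the path, if any. */
  Optional<String> resolve(String path) {
    String candidate = normalize(path);
    String best = null;
    for (String prefix : keysByPrefix.keySet()) {
      if (candidate.startsWith(prefix)
          && (best == null || prefix.length() > best.length())) {
        best = prefix;
      }
    }
    return Optional.ofNullable(best).map(keysByPrefix::get);
  }

  /** Append a trailing slash so "country=uk" does not match "country=ukraine". */
  private static String normalize(String path) {
    return path.endsWith("/") ? path : path + "/";
  }
}
```

With `s3a://bucket/my_table` mapped to one key and `s3a://bucket/my_table/country=uk` to another, an object under `country=uk/` resolves to the second key, while an object under `country=ukraine/` falls back to the table-level key. A path with no matching prefix returns an empty `Optional`, which a caller would presumably treat as "use the bucket-level settings"; that is also the case a marker in an unmapped parent dir would hit.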

That said: these are all tractable and I can see the rationale for it. If you were to work on this, I and others would do what we can to help nurture the change in.

I look forward to your submission; please follow the documented test process. (This probably complicates testing even more, as you will need 2+ KMS keys. Docs will need to be updated...)

> S3A: Allow SSE configurations per object path
> ---------------------------------------------
>
>                 Key: HADOOP-17855
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17855
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Mike Dias
>            Priority: Major
>
> Currently, we can map the SSE configurations at bucket level only:
> {code:java}
> <property>
>   <name>fs.s3a.bucket.ireland-dev.server-side-encryption-algorithm</name>
>   <value>SSE-KMS</value>
> </property>
> <property>
>   <name>fs.s3a.bucket.ireland-dev.server-side-encryption.key</name>
>   <value>arn:aws:kms:eu-west-1:98067faff834c:key/071a86ff-8881-4ba0-9230-95af6d01ca01</value>
> </property>
> {code}
> But sometimes we want to encrypt data in different paths with different keys within the same bucket. For example, a partitioned table might benefit from encrypting each partition with a different key when the partition represents a customer or a country.
> [S3 already can encrypt using different keys/configurations at the object level|https://aws.amazon.com/premiumsupport/knowledge-center/s3-encrypt-specific-folder/], so what we need to do on Hadoop is to provide a way to map which key to use. One idea could be mapping them in the XML config:
>  
> {code:java}
> <property>
>   <name>fs.s3a.server-side-encryption.paths</name>
>   <value>s3://bucket/my_table/country=ireland,s3://bucket/my_table/country=uk,s3://bucket/my_table/country=germany</value>
> </property>
> <property>
>   <name>fs.s3a.server-side-encryption.path-keys</name>
>   <value>arn:aws:kms:eu-west-1:90ireland09:key/ireland-key,arn:aws:kms:eu-west-1:980uk0993c:key/uk-key,arn:aws:kms:eu-west-1:98germany089:key/germany-key</value>
> </property>
> {code}
> Or potentially fetch the mappings from the filesystem:
>  
> {code:java}
> <property>
>   <name>fs.s3a.server-side-encryption.mappings</name>
>   <value>s3://bucket/configs/encryption_mappings.json</value>
> </property>
> {code}
> where encryption_mappings.json could be something like this:
>  
> {code:java}
> { 
>    "path": "s3://bucket/customer_table/customerId=abc123", 
>    "algorithm": "SSE-KMS",
>    "key": "arn:aws:kms:eu-west-1:933993746:key/abc123-key"
> }
> ...
> { 
>    "path": "s3://bucket/customer_table/customerId=xyx987", 
>    "algorithm": "SSE-KMS",
>    "key": "arn:aws:kms:eu-west-1:933993746:key/xyx987-key"
> }
> {code}
>  
>  


