Posted to hdfs-dev@hadoop.apache.org by Andrew Wang <an...@cloudera.com> on 2014/08/08 20:45:48 UTC

[VOTE] Merge fs-encryption branch to trunk

Hi all,

I'd like to call a vote to merge the fs-encryption branch to trunk.
Development of this feature has been ongoing since March on HDFS-6134 and
HADOOP-10150, totaling approximately 50 commits.

The fs-encryption branch introduces support for transparent, end-to-end
encryption within an "encryption zone". Each file stored within an
encryption zone is automatically encrypted and decrypted with a unique key.
These per-file keys are encrypted with an encryption key only accessible by
the client, ensuring that only the client is able to decrypt sensitive
data. Furthermore, there is support for native, hardware-accelerated AES
encryption. For further details, please see the design doc on HDFS-6134.
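
To make the key layering above concrete, here is a minimal sketch in Python.
It is not the branch's actual API: ToyKeyServer, write_file, read_file and
the stored field names are hypothetical, and keystream_xor is a toy
SHA-256 counter-mode cipher standing in for AES so that the example runs with
the standard library only. It shows each file getting its own key, and only
the encrypted form of that key being stored next to the file data.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode). A stand-in for AES-CTR,
    used only so the sketch is self-contained; do not use for real data."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + nonce + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


class ToyKeyServer:
    """Hypothetical key server: holds zone keys and never hands them out."""

    def __init__(self):
        self._zone_keys = {}

    def create_zone_key(self, name: str) -> None:
        self._zone_keys[name] = secrets.token_bytes(32)

    def generate_encrypted_key(self, zone_key_name: str):
        """Create a fresh per-file key and return only its encrypted form."""
        file_key = secrets.token_bytes(32)
        nonce = secrets.token_bytes(16)
        wrapped = keystream_xor(self._zone_keys[zone_key_name], nonce, file_key)
        return nonce, wrapped

    def decrypt_encrypted_key(self, zone_key_name: str, nonce: bytes,
                              wrapped: bytes) -> bytes:
        """Unwrap a per-file key for a caller (access checks omitted here)."""
        return keystream_xor(self._zone_keys[zone_key_name], nonce, wrapped)


def write_file(kms: ToyKeyServer, zone_key_name: str, plaintext: bytes) -> dict:
    """Encrypt on the client; return what the filesystem would store."""
    key_nonce, wrapped = kms.generate_encrypted_key(zone_key_name)
    file_key = kms.decrypt_encrypted_key(zone_key_name, key_nonce, wrapped)
    data_nonce = secrets.token_bytes(16)
    return {
        "zone_key": zone_key_name,
        "key_nonce": key_nonce,
        "wrapped_key": wrapped,  # the plain file key is never stored
        "data_nonce": data_nonce,
        "ciphertext": keystream_xor(file_key, data_nonce, plaintext),
    }


def read_file(kms: ToyKeyServer, stored: dict) -> bytes:
    """Unwrap the per-file key, then decrypt the data on the client."""
    file_key = kms.decrypt_encrypted_key(
        stored["zone_key"], stored["key_nonce"], stored["wrapped_key"])
    return keystream_xor(file_key, stored["data_nonce"], stored["ciphertext"])
```

In this sketch, anyone who can read the stored record but cannot ask the key
server to unwrap the per-file key sees only ciphertext.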

In terms of merge readiness, we've posted some successful consolidated
patches to the JIRA for Jenkins runs. distcp and fs -cp support has also
recently been completed, allowing users to securely copy encrypted files
without first decrypting them. There is ongoing work to add support for
WebHDFS, HttpFS, and other alternative access methods. Stephen Chu has also
posted a test plan, and has already identified a few issues that have been
fixed.

Design and development of this feature was also a cross-company effort with
many different contributors.

I'd like to thank Charles Lamb, Yi Liu, Uma Maheswara Rao G, Colin McCabe,
and Juan Yu for their code contributions and reviews. Alejandro Abdelnur
was also instrumental, doing a lot of the design work as well as
writing most of the Hadoop Key Management Server (KMS). Finally, I'd like to
thank everyone who gave feedback on the JIRAs. This includes Owen, Sanjay,
Larry, Mike Y, ATM, Todd, Nicholas, and Andy, among others.

With that, here's my +1 to merge this to trunk.

Thanks,
Andrew

Re: [VOTE] Merge fs-encryption branch to trunk

Posted by sanjay Radia <sa...@hortonworks.com>.

+1 (binding)
We have made some great progress in the last few days on some of the issues I raised.
I have posted a summary of the followup items that are needed on the Jira today.
I am giving my +1 on the expectation that the team will complete items 1 (distcp/cp) and 2 (webhdfs) promptly. Before we publish transparent encryption in a 2.x release for public consumption, let us at least complete item 1 (i.e., distcp and cp) and the flag to turn this feature on/off.

This is great work; thanks to the team for contributing this important feature.

sanjay

On Aug 14, 2014, at 1:05 AM, sanjay Radia <sa...@hortonworks.com> wrote:

> While I was originally skeptical of transparent encryption, I like its value proposition. HDFS has several layers, protocols, and tools. While the HDFS core part seems to be well done in the Jira, inserting the matching transparency into the other tools and protocols still needs to be worked through.
> 
> I have the following areas of concern:
> - Common protocols like webhdfs should continue to work (the design doc marks this as a goal). This issue is being discussed in the Jira, but it appears that webhdfs does not currently work with encrypted files: Andrew says that "Regarding webhdfs, it's not a recommended deployment" and that he will modify the documentation to match that. Alejandro says "Both httpfs and webhdfs will work just fine" but then in the same paragraph says "this could fail some security audits". We need to resolve this quickly. Webhdfs is heavily used by many Hadoop users.
> 
> - Common tools like cp, distcp and HAR should continue to work with non-encrypted and encrypted files in an automatic fashion. This issue has been heavily discussed in the Jira and at the meeting. The /.reserved./.raw mechanism appears to be a step in the right direction for distcp and cp; however, this work has not reached its conclusion in my opinion. Charles and I are going through the use cases and I think we are close to a clean solution for distcp and cp. HAR still needs a concrete proposal.
> 
> - KMS scalability in medium to large clusters. This can perhaps be addressed by getting the keys ahead of time when a job is submitted. Without this, the KMS will need to be as highly available and scalable as the NN. I think this is future implementation work, but we need to at least determine whether it is indeed possible, in case we need to modify some of the APIs right now to support it.
> 
> There are some other minor things under discussion, and I still need to go through the new APIs.
> 
> Unfortunately at this stage I cannot give a +1 for this merge; I hope to change this in the next day or so. I am working with the Jira's team (Alejandro, Charles, Andrew, ATM, ...) to resolve the above as quickly as possible.
> 
> Sanjay (binding)
> 
> 
> 
> On Aug 8, 2014, at 11:45 AM, Andrew Wang <an...@cloudera.com> wrote:
> 
>> Hi all,
>> 
>> I'd like to call a vote to merge the fs-encryption branch to trunk.
>> Development of this feature has been ongoing since March on HDFS-6134 and
>> HADOOP-10150, totaling approximately 50 commits.
>> 
>> .....
>> Thanks,
>> Andrew
> 


-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: [VOTE] Merge fs-encryption branch to trunk

Posted by sanjay Radia <sa...@hortonworks.com>.
While I was originally skeptical of transparent encryption, I like its value proposition. HDFS has several layers, protocols, and tools. While the HDFS core part seems to be well done in the Jira, inserting the matching transparency into the other tools and protocols still needs to be worked through.

I have the following areas of concern:
- Common protocols like webhdfs should continue to work (the design doc marks this as a goal). This issue is being discussed in the Jira, but it appears that webhdfs does not currently work with encrypted files: Andrew says that "Regarding webhdfs, it's not a recommended deployment" and that he will modify the documentation to match that. Alejandro says "Both httpfs and webhdfs will work just fine" but then in the same paragraph says "this could fail some security audits". We need to resolve this quickly. Webhdfs is heavily used by many Hadoop users.


- Common tools like cp, distcp and HAR should continue to work with non-encrypted and encrypted files in an automatic fashion. This issue has been heavily discussed in the Jira and at the meeting. The /.reserved./.raw mechanism appears to be a step in the right direction for distcp and cp; however, this work has not reached its conclusion in my opinion. Charles and I are going through the use cases and I think we are close to a clean solution for distcp and cp. HAR still needs a concrete proposal.

- KMS scalability in medium to large clusters. This can perhaps be addressed by getting the keys ahead of time when a job is submitted. Without this, the KMS will need to be as highly available and scalable as the NN. I think this is future implementation work, but we need to at least determine whether it is indeed possible, in case we need to modify some of the APIs right now to support it.
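
To illustrate the KMS load concern in the last item, here is a rough sketch in
Python. Everything in it is hypothetical (CountingKeyServer, the function
names, the "plain:" placeholder for a decrypted key); it is not how the
branch or the KMS behaves. It only compares how many key-server requests are
made when every task fetches keys on demand versus when the distinct keys are
fetched once at job submission and shipped with the job.

```python
class CountingKeyServer:
    """Hypothetical key server that counts decrypt requests."""

    def __init__(self):
        self.requests = 0

    def decrypt_key(self, wrapped_key: str) -> str:
        self.requests += 1
        # Placeholder for real key unwrapping.
        return "plain:" + wrapped_key


def run_on_demand(kms, task_inputs):
    """Each task asks the key server for every file key it needs."""
    return [[kms.decrypt_key(w) for w in files] for files in task_inputs]


def prefetch_at_submit(kms, task_inputs):
    """Resolve each distinct key once, when the job is submitted."""
    distinct = sorted({w for files in task_inputs for w in files})
    return {w: kms.decrypt_key(w) for w in distinct}


def run_prefetched(key_cache, task_inputs):
    """Tasks read keys from the cache shipped with the job; no server calls."""
    return [[key_cache[w] for w in files] for files in task_inputs]
```

With three tasks that each read two of three files, the on-demand path makes
six requests and the prefetched path makes three. The trade-off, under these
assumptions, is that the decrypted keys then travel with the job and would
need to be protected in transit and at rest.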

There are some other minor things under discussion, and I still need to go through the new APIs.

Unfortunately at this stage I cannot give a +1 for this merge; I hope to change this in the next day or so. I am working with the Jira's team (Alejandro, Charles, Andrew, ATM, ...) to resolve the above as quickly as possible.

Sanjay (binding)



On Aug 8, 2014, at 11:45 AM, Andrew Wang <an...@cloudera.com> wrote:

> Hi all,
> 
> I'd like to call a vote to merge the fs-encryption branch to trunk.
> Development of this feature has been ongoing since March on HDFS-6134 and
> HADOOP-10150, totaling approximately 50 commits.
> 
> .....
> Thanks,
> Andrew



Re: [VOTE] Merge fs-encryption branch to trunk

Posted by Andrew Wang <an...@cloudera.com>.
With 4 binding +1s, 3 non-binding +1s, and no -1s, the vote passes. Thanks to
everyone who gave feedback at this stage, particularly Sanjay and Suresh.

I should add that this vote will run for the standard 7 days for a
non-release vote, so it will close at 12 PM Pacific on August 15th.


On Fri, Aug 8, 2014 at 11:45 AM, Andrew Wang <an...@cloudera.com>
wrote:

> Hi all,
>
> I'd like to call a vote to merge the fs-encryption branch to trunk.
> Development of this feature has been ongoing since March on HDFS-6134 and
> HADOOP-10150, totaling approximately 50 commits.
>
> The fs-encryption branch introduces support for transparent, end-to-end
> encryption within an "encryption zone". Each file stored within an
> encryption zone is automatically encrypted and decrypted with a unique key.
> These per-file keys are encrypted with an encryption key only accessible by
> the client, ensuring that only the client is able to decrypt sensitive
> data. Furthermore, there is support for native, hardware-accelerated AES
> encryption. For further details, please see the design doc on HDFS-6134.
>
> In terms of merge readiness, we've posted some successful consolidated
> patches to the JIRA for Jenkins runs. distcp and fs -cp support has also
> recently been completed, allowing users to securely copy encrypted files
> without first decrypting them. There is ongoing work to add support for
> WebHDFS, HttpFS, and other alternative access methods. Stephen Chu has also
> posted a test plan, and has already identified a few issues that have been
> fixed.
>
> Design and development of this feature was also a cross-company effort
> with many different contributors.
>
> I'd like to thank Charles Lamb, Yi Liu, Uma Maheswara Rao G, Colin McCabe,
> and Juan Yu for their code contributions and reviews. Alejandro Abdelnur
> was also instrumental, doing a lot of the design work as well as
> writing most of the Hadoop Key Management Server (KMS). Finally, I'd like to
> thank everyone who gave feedback on the JIRAs. This includes Owen, Sanjay,
> Larry, Mike Y, ATM, Todd, Nicholas, and Andy, among others.
>
> With that, here's my +1 to merge this to trunk.
>
> Thanks,
> Andrew
>

RE: [VOTE] Merge fs-encryption branch to trunk

Posted by "Liu, Yi A" <yi...@intel.com>.
+1 (non-binding)

I was involved in feature development and participated in JIRA reviews on this branch.
With help from many committers, PMC members, and contributors, this feature has reached good quality, and I think it is ready to merge.

Regards,
Yi Liu

-----Original Message-----
From: Stephen Chu [mailto:schu@cloudera.com] 
Sent: Saturday, August 09, 2014 3:18 PM
To: common-dev@hadoop.apache.org
Cc: hdfs-dev@hadoop.apache.org
Subject: Re: [VOTE] Merge fs-encryption branch to trunk

+1 (non-binding)

I have been testing encryption in conjunction with the Hadoop KMS and right now the integration looks good to merge to trunk. I also tested on platforms with outdated openssl and no encryption configs to verify no regressions when users don't want to use this feature.

Thanks to those who worked on this enhancement and fixed the bugs found in testing.

Stephen


On Fri, Aug 8, 2014 at 4:37 PM, Alejandro Abdelnur <tu...@cloudera.com>
wrote:

> +1
>
> I've been following the work closely, especially on the crypto streams
> and key handling, and providing dev support as well.
>
> Kudos to Andrew, Yi and Charles for doing the bulk of the work.
>
> thx
>
>
>
> On Fri, Aug 8, 2014 at 2:27 PM, Andrew Wang <an...@cloudera.com>
> wrote:
>
> > I should add that this vote will run for the standard 7 days for a 
> > non-release vote, so will close at 12PM Pacific on August 15th.
> >
> >
> > On Fri, Aug 8, 2014 at 11:45 AM, Andrew Wang 
> > <an...@cloudera.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > I'd like to call a vote to merge the fs-encryption branch to trunk.
> > > Development of this feature has been ongoing since March on HDFS-6134 and
> > > HADOOP-10150, totaling approximately 50 commits.
> > >
> > > The fs-encryption branch introduces support for transparent, end-to-end
> > > encryption within an "encryption zone". Each file stored within an
> > > encryption zone is automatically encrypted and decrypted with a unique key.
> > > These per-file keys are encrypted with an encryption key only accessible by
> > > the client, ensuring that only the client is able to decrypt sensitive
> > > data. Furthermore, there is support for native, hardware-accelerated AES
> > > encryption. For further details, please see the design doc on HDFS-6134.
> > >
> > > In terms of merge readiness, we've posted some successful consolidated
> > > patches to the JIRA for Jenkins runs. distcp and fs -cp support has also
> > > recently been completed, allowing users to securely copy encrypted files
> > > without first decrypting them. There is ongoing work to add support for
> > > WebHDFS, HttpFS, and other alternative access methods. Stephen Chu has also
> > > posted a test plan, and has already identified a few issues that have been
> > > fixed.
> > >
> > > Design and development of this feature was also a cross-company effort
> > > with many different contributors.
> > >
> > > I'd like to thank Charles Lamb, Yi Liu, Uma Maheswara Rao G, Colin McCabe,
> > > and Juan Yu for their code contributions and reviews. Alejandro Abdelnur
> > > was also instrumental, doing a lot of the design work as well as
> > > writing most of the Hadoop Key Management Server (KMS). Finally, I'd like to
> > > thank everyone who gave feedback on the JIRAs. This includes Owen, Sanjay,
> > > Larry, Mike Y, ATM, Todd, Nicholas, and Andy, among others.
> > >
> > > With that, here's my +1 to merge this to trunk.
> > >
> > > Thanks,
> > > Andrew
> > >
> >
>
>
>
> --
> Alejandro
>

Re: [VOTE] Merge fs-encryption branch to trunk

Posted by Stephen Chu <sc...@cloudera.com>.
+1 (non-binding)

I have been testing encryption in conjunction with the Hadoop KMS and right
now the integration looks good to merge to trunk. I also tested on
platforms with outdated openssl and no encryption configs to verify no
regressions when users don't want to use this feature.

Thanks to those who worked on this enhancement and fixed the bugs found in
testing.

Stephen


On Fri, Aug 8, 2014 at 4:37 PM, Alejandro Abdelnur <tu...@cloudera.com>
wrote:

> +1
>
> I've been following the work closely, especially on the crypto streams and
> key handling, and providing dev support as well.
>
> Kudos to Andrew, Yi and Charles for doing the bulk of the work.
>
> thx
>
>
>
> On Fri, Aug 8, 2014 at 2:27 PM, Andrew Wang <an...@cloudera.com>
> wrote:
>
> > I should add that this vote will run for the standard 7 days for a
> > non-release vote, so will close at 12PM Pacific on August 15th.
> >
> >
> > On Fri, Aug 8, 2014 at 11:45 AM, Andrew Wang <an...@cloudera.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > I'd like to call a vote to merge the fs-encryption branch to trunk.
> > > Development of this feature has been ongoing since March on HDFS-6134 and
> > > HADOOP-10150, totaling approximately 50 commits.
> > >
> > > The fs-encryption branch introduces support for transparent, end-to-end
> > > encryption within an "encryption zone". Each file stored within an
> > > encryption zone is automatically encrypted and decrypted with a unique key.
> > > These per-file keys are encrypted with an encryption key only accessible by
> > > the client, ensuring that only the client is able to decrypt sensitive
> > > data. Furthermore, there is support for native, hardware-accelerated AES
> > > encryption. For further details, please see the design doc on HDFS-6134.
> > >
> > > In terms of merge readiness, we've posted some successful consolidated
> > > patches to the JIRA for Jenkins runs. distcp and fs -cp support has also
> > > recently been completed, allowing users to securely copy encrypted files
> > > without first decrypting them. There is ongoing work to add support for
> > > WebHDFS, HttpFS, and other alternative access methods. Stephen Chu has also
> > > posted a test plan, and has already identified a few issues that have been
> > > fixed.
> > >
> > > Design and development of this feature was also a cross-company effort
> > > with many different contributors.
> > >
> > > I'd like to thank Charles Lamb, Yi Liu, Uma Maheswara Rao G, Colin McCabe,
> > > and Juan Yu for their code contributions and reviews. Alejandro Abdelnur
> > > was also instrumental, doing a lot of the design work as well as
> > > writing most of the Hadoop Key Management Server (KMS). Finally, I'd like to
> > > thank everyone who gave feedback on the JIRAs. This includes Owen, Sanjay,
> > > Larry, Mike Y, ATM, Todd, Nicholas, and Andy, among others.
> > >
> > > With that, here's my +1 to merge this to trunk.
> > >
> > > Thanks,
> > > Andrew
> > >
> >
>
>
>
> --
> Alejandro
>

Re: [VOTE] Merge fs-encryption branch to trunk

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
+1

I've been following the work closely, especially on the crypto streams and
key handling, and providing dev support as well.

Kudos to Andrew, Yi and Charles for doing the bulk of the work.

thx



On Fri, Aug 8, 2014 at 2:27 PM, Andrew Wang <an...@cloudera.com>
wrote:

> I should add that this vote will run for the standard 7 days for a
> non-release vote, so will close at 12PM Pacific on August 15th.
>
>
> On Fri, Aug 8, 2014 at 11:45 AM, Andrew Wang <an...@cloudera.com>
> wrote:
>
> > Hi all,
> >
> > I'd like to call a vote to merge the fs-encryption branch to trunk.
> > Development of this feature has been ongoing since March on HDFS-6134 and
> > HADOOP-10150, totaling approximately 50 commits.
> >
> > The fs-encryption branch introduces support for transparent, end-to-end
> > encryption within an "encryption zone". Each file stored within an
> > encryption zone is automatically encrypted and decrypted with a unique key.
> > These per-file keys are encrypted with an encryption key only accessible by
> > the client, ensuring that only the client is able to decrypt sensitive
> > data. Furthermore, there is support for native, hardware-accelerated AES
> > encryption. For further details, please see the design doc on HDFS-6134.
> >
> > In terms of merge readiness, we've posted some successful consolidated
> > patches to the JIRA for Jenkins runs. distcp and fs -cp support has also
> > recently been completed, allowing users to securely copy encrypted files
> > without first decrypting them. There is ongoing work to add support for
> > WebHDFS, HttpFS, and other alternative access methods. Stephen Chu has also
> > posted a test plan, and has already identified a few issues that have been
> > fixed.
> >
> > Design and development of this feature was also a cross-company effort
> > with many different contributors.
> >
> > I'd like to thank Charles Lamb, Yi Liu, Uma Maheswara Rao G, Colin McCabe,
> > and Juan Yu for their code contributions and reviews. Alejandro Abdelnur
> > was also instrumental, doing a lot of the design work as well as
> > writing most of the Hadoop Key Management Server (KMS). Finally, I'd like to
> > thank everyone who gave feedback on the JIRAs. This includes Owen, Sanjay,
> > Larry, Mike Y, ATM, Todd, Nicholas, and Andy, among others.
> >
> > With that, here's my +1 to merge this to trunk.
> >
> > Thanks,
> > Andrew
> >
>



-- 
Alejandro

Re: [VOTE] Merge fs-encryption branch to trunk

Posted by Andrew Wang <an...@cloudera.com>.
I should add that this vote will run for the standard 7 days for a
non-release vote, so will close at 12PM Pacific on August 15th.


On Fri, Aug 8, 2014 at 11:45 AM, Andrew Wang <an...@cloudera.com>
wrote:

> Hi all,
>
> I'd like to call a vote to merge the fs-encryption branch to trunk.
> Development of this feature has been ongoing since March on HDFS-6134 and
> HADOOP-10150, totaling approximately 50 commits.
>
> The fs-encryption branch introduces support for transparent, end-to-end
> encryption within an "encryption zone". Each file stored within an
> encryption zone is automatically encrypted and decrypted with a unique key.
> These per-file keys are encrypted with an encryption key only accessible by
> the client, ensuring that only the client is able to decrypt sensitive
> data. Furthermore, there is support for native, hardware-accelerated AES
> encryption. For further details, please see the design doc on HDFS-6134.
>
> In terms of merge readiness, we've posted some successful consolidated
> patches to the JIRA for Jenkins runs. distcp and fs -cp support has also
> recently been completed, allowing users to securely copy encrypted files
> without first decrypting them. There is ongoing work to add support for
> WebHDFS, HttpFS, and other alternative access methods. Stephen Chu has also
> posted a test plan, and has already identified a few issues that have been
> fixed.
>
> Design and development of this feature was also a cross-company effort
> with many different contributors.
>
> I'd like to thank Charles Lamb, Yi Liu, Uma Maheswara Rao G, Colin McCabe,
> and Juan Yu for their code contributions and reviews. Alejandro Abdelnur
> was also instrumental, doing a lot of the design work as well as
> writing most of the Hadoop Key Management Server (KMS). Finally, I'd like to
> thank everyone who gave feedback on the JIRAs. This includes Owen, Sanjay,
> Larry, Mike Y, ATM, Todd, Nicholas, and Andy, among others.
>
> With that, here's my +1 to merge this to trunk.
>
> Thanks,
> Andrew
>

Re: [VOTE] Merge fs-encryption branch to trunk

Posted by Charles Lamb <cl...@cloudera.com>.
+1 (non-binding)

I've actively worked on developing and reviewing this feature and am 
happy to see it in its current state. I believe it is ready to be merged.

Charles


Re: [VOTE] Merge fs-encryption branch to trunk

Posted by Charles Lamb <cl...@cloudera.com>.
On 8/8/2014 2:45 PM, Andrew Wang wrote:
> With that, here's my +1 to merge this to trunk.

1 down, 2 to go.


Re: [VOTE] Merge fs-encryption branch to trunk

Posted by Uma Maheswara Rao G <ha...@gmail.com>.
Nice work guys!
+1 for merge.

Regards,
Uma


On Sat, Aug 9, 2014 at 12:15 AM, Andrew Wang <an...@cloudera.com>
wrote:

> Hi all,
>
> I'd like to call a vote to merge the fs-encryption branch to trunk.
> Development of this feature has been ongoing since March on HDFS-6134 and
> HADOOP-10150, totaling approximately 50 commits.
>
> The fs-encryption branch introduces support for transparent, end-to-end
> encryption within an "encryption zone". Each file stored within an
> encryption zone is automatically encrypted and decrypted with a unique key.
> These per-file keys are encrypted with an encryption key only accessible by
> the client, ensuring that only the client is able to decrypt sensitive
> data. Furthermore, there is support for native, hardware-accelerated AES
> encryption. For further details, please see the design doc on HDFS-6134.
>
> In terms of merge readiness, we've posted some successful consolidated
> patches to the JIRA for Jenkins runs. distcp and fs -cp support has also
> recently been completed, allowing users to securely copy encrypted files
> without first decrypting them. There is ongoing work to add support for
> WebHDFS, HttpFS, and other alternative access methods. Stephen Chu has also
> posted a test plan, and has already identified a few issues that have been
> fixed.
>
> Design and development of this feature was also a cross-company effort with
> many different contributors.
>
> I'd like to thank Charles Lamb, Yi Liu, Uma Maheswara Rao G, Colin McCabe,
> and Juan Yu for their code contributions and reviews. Alejandro Abdelnur
> was also instrumental, doing a lot of the design work as well as
> writing most of the Hadoop Key Management Server (KMS). Finally, I'd like to
> thank everyone who gave feedback on the JIRAs. This includes Owen, Sanjay,
> Larry, Mike Y, ATM, Todd, Nicholas, and Andy, among others.
>
> With that, here's my +1 to merge this to trunk.
>
> Thanks,
> Andrew
>

Re: [VOTE] Merge fs-encryption branch to trunk

Posted by sanjay Radia <sa...@hortonworks.com>.
While I was originally skeptical of transparent encryption, I now like its value proposition. HDFS has several layers, protocols and tools. While the HDFS core part seems to be well done in the Jira, inserting the matching transparency in the other tools and protocols needs to be worked through.

I have the following areas of concern:
- Common protocols like webhdfs should continue to work (the design doc marks this as a goal). This issue is being discussed in the Jira, but it appears that webhdfs does not currently work with encrypted files: Andrew says that "Regarding webhdfs, it's not a recommended deployment" and that he will modify the documentation to match that. Alejandro says "Both httpfs and webhdfs will work just fine" but then in the same paragraph says "this could fail some security audits". We need to resolve this quickly. Webhdfs is heavily used by many Hadoop users.


- Common tools like cp, distcp and HAR should continue to work with non-encrypted and encrypted files in an automatic fashion. This issue has been heavily discussed in the Jira and at the meeting. The /.reserved/raw mechanism appears to be a step in the right direction for distcp and cp; however, this work has not reached its conclusion in my opinion. Charles and I are going through the use cases and I think we are close to a clean solution for distcp and cp. HAR still needs a concrete proposal.

- KMS scalability in medium to large clusters. This can perhaps be addressed by getting the keys ahead of time when a job is submitted. Without this, the KMS will need to be as highly available and scalable as the NN. I think this is future implementation work, but we need to at least determine whether this is indeed possible in case we need to modify some of the APIs right now to support that.
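
To make the prefetch idea in the last bullet concrete, here is a minimal sketch. It is only an illustration under stated assumptions: FakeKMS, JobKeyCache and run_job are hypothetical names, not Hadoop or KMS APIs, and the sketch only counts key requests rather than modeling real key handling or its security implications.

```python
# Hypothetical sketch: none of these classes exist in Hadoop or the KMS.
from typing import Callable, Dict, Iterable, List


class FakeKMS:
    """Stand-in for a key server; counts how many requests it receives."""

    def __init__(self, keys: Dict[str, bytes]) -> None:
        self._keys = keys
        self.requests = 0

    def get_key(self, name: str) -> bytes:
        self.requests += 1
        return self._keys[name]


class JobKeyCache:
    """Keys fetched once at job submission and handed to the tasks."""

    def __init__(self, kms: FakeKMS) -> None:
        self._kms = kms
        self._cache: Dict[str, bytes] = {}

    def prefetch(self, key_names: Iterable[str]) -> None:
        # One KMS request per distinct key, made before any task runs.
        for name in set(key_names):
            self._cache[name] = self._kms.get_key(name)

    def get_key(self, name: str) -> bytes:
        # Tasks read from the cache; fall back to the KMS only on a miss.
        if name not in self._cache:
            self._cache[name] = self._kms.get_key(name)
        return self._cache[name]


def run_job(num_tasks: int, key_name: str,
            lookup: Callable[[str], bytes]) -> List[bytes]:
    """Each simulated task asks for the key once."""
    return [lookup(key_name) for _ in range(num_tasks)]


# Without prefetch: every task contacts the KMS.
kms_direct = FakeKMS({"zone1": b"k1"})
run_job(1000, "zone1", kms_direct.get_key)

# With prefetch: the KMS is contacted once, at submission time.
kms_prefetch = FakeKMS({"zone1": b"k1"})
cache = JobKeyCache(kms_prefetch)
cache.prefetch(["zone1"])
run_job(1000, "zone1", cache.get_key)

print(kms_direct.requests, kms_prefetch.requests)  # prints "1000 1"
```

In this toy setup the KMS load depends on the number of jobs rather than the number of tasks, which is the property the bullet is asking about. Whether the real APIs can support this is exactly the open question raised above.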

There are some other minor things under discussion, and I still need to go through the new APIs.

Unfortunately at this stage I cannot give a +1 for this merge; I hope to change this in the next day or two. I am working with the Jira's team (Alejandro, Charles, Andrew, ATM, ...) to resolve the above as quickly as possible.

Sanjay (binding)



On Aug 8, 2014, at 11:45 AM, Andrew Wang <an...@cloudera.com> wrote:

> Hi all,
> 
> I'd like to call a vote to merge the fs-encryption branch to trunk.
> Development of this feature has been ongoing since March on HDFS-6134 and
> HADOOP-10150, totaling approximately 50 commits.
> 
> .....
> Thanks,
> Andrew


-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: [VOTE] Merge fs-encryption branch to trunk

Posted by Suresh Srinivas <su...@hortonworks.com>.
+1 (binding)

This is a very important feature. I have gone through the design in detail.
I have also discussed the details of the design and the needed follow-up
items with the team (I will post the details in the jira). The follow-up
items should not hold up merging this feature to trunk and can be done in
trunk.

Great job everyone who contributed to this feature!



On Fri, Aug 8, 2014 at 11:45 AM, Andrew Wang <an...@cloudera.com>
wrote:

> Hi all,
>
> I'd like to call a vote to merge the fs-encryption branch to trunk.
> Development of this feature has been ongoing since March on HDFS-6134 and
> HADOOP-10150, totaling approximately 50 commits.
>
> The fs-encryption branch introduces support for transparent, end-to-end
> encryption within an "encryption zone". Each file stored within an
> encryption zone is automatically encrypted and decrypted with a unique key.
> These per-file keys are encrypted with an encryption key only accessible by
> the client, ensuring that only the client is able to decrypt sensitive
> data. Furthermore, there is support for native, hardware-accelerated AES
> encryption. For further details, please see the design doc on HDFS-6134.
>
> In terms of merge readiness, we've posted some successful consolidated
> patches to the JIRA for Jenkins runs. distcp and fs -cp support has also
> recently been completed, allowing users to securely copy encrypted files
> without first decrypting them. There is ongoing work to add support for
> WebHDFS, HttpFS, and other alternative access methods. Stephen Chu has also
> posted a test plan, and has already identified a few issues that have been
> fixed.
>
> Design and development of this feature was also a cross-company effort with
> many different contributors.
>
> I'd like to thank Charles Lamb, Yi Liu, Uma Maheswara Rao G, Colin McCabe,
> and Juan Yu for their code contributions and reviews. Alejandro Abdelnur
> was also instrumental, doing a lot of the design work as well as
> writing most of the Hadoop Key Management Server (KMS). Finally, I'd like to
> thank everyone who gave feedback on the JIRAs. This includes Owen, Sanjay,
> Larry, Mike Y, ATM, Todd, Nicholas, and Andy, among others.
>
> With that, here's my +1 to merge this to trunk.
>
> Thanks,
> Andrew
>



-- 
http://hortonworks.com/download/

