Posted to user@hadoop.apache.org by Daniel Howard <da...@toldme.com> on 2020/01/06 19:45:55 UTC

Questions about Setting Up Encryption

Hello,

I am working on getting Hadoop running within our organization. Our
high-level use case is to be able to say we're running with end-to-end
encryption. It looks like there are two major strategies for getting this
done in Hadoop:

A) HDFS Transparent Encryption:
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html
B) Secure Mode:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SecureMode.html

In our case, we are less concerned with Kerberos' user-level
authentication, but we want node-to-node encryption and encryption-at-rest.
With cluster applications, I typically achieve encryption-at-rest with LUKS
and then enable an application's TLS settings to achieve
encryption-in-motion.

What is my best strategy for Hadoop? Here are a couple of questions:

1) The docs say I have to create a new directory, but can I configure HDFS
Transparent Encryption to operate across an entire volume?
2) If I just need encrypted-in-motion, can I do just the "Data
confidentiality" part of the Secure Mode doc, or does that depend on
setting up Kerberos?

Thank You!
-danny

-- 
http://dannyman.toldme.com

Re: Questions about Setting Up Encryption

Posted by Bear Giles <bg...@snaplogic.com>.
The broader context is that some countries have very tight restrictions on
the use of encryption: strong cryptography may be used for authentication but
not for data transmission. It sounds like a paradox, but authentication can
use one-way functions like HMAC, which can't be used as a back channel for
encrypted traffic.

I couldn't recall whether some form of encryption was automatically added in
countries that permit it. I've worked on a number of tasks where our PM
and/or customers seemed to believe it was.

On Thu, Jan 9, 2020 at 3:38 AM Antonio Rendina <ar...@gmail.com> wrote:

> That's because they serve two different purposes. Kerberos is about
> authentication: it authenticates the users that have access to the
> services. "Encryption in motion" encrypts every communication between
> clients and servers, but it has nothing to do with authentication.
>
> For example, if you configure encryption without Kerberos you will have
> "free" encrypted access, meaning the communication on the wire will be
> encrypted, but which users have access to the services will be managed at
> the operating system level.
>
> Regards

-- 

Bear Giles

Sr. Software Engineer
bgiles@snaplogic.com
Mobile: 720-749-7876


<http://www.snaplogic.com/about-us/jobs>



*SnapLogic Inc | 1825 South Grant Street | San Mateo CA | USA   *


This message is confidential. It may also be privileged or otherwise
protected by work product immunity or other legal rules. If you have
received it by mistake, please let us know by e-mail reply and delete it
from your system; you may not copy this message or disclose its contents to
anyone. The integrity and security of this message cannot be guaranteed on
the Internet.

Re: Questions about Setting Up Encryption

Posted by Antonio Rendina <ar...@gmail.com>.
That's because they serve two different purposes. Kerberos is about
authentication: it authenticates the users that have access to the
services. "Encryption in motion" encrypts every communication between
clients and servers, but it has nothing to do with authentication.

For example, if you configure encryption without Kerberos you will have
"free" encrypted access, meaning the communication on the wire will be
encrypted, but which users have access to the services will be managed at
the operating system level.

Regards

On 08/01/2020 18:42, Bear Giles wrote:
> > does that depend on setting up Kerberos
>
> Careful - Kerberos and data-in-motion encryption are orthogonal. If
> you use Kerberos without also setting up TLS then the payload will be
> in the clear.
>
> (At least that's my understanding of how they work together.)
>
> Bear

Re: Questions about Setting Up Encryption

Posted by Bear Giles <bg...@snaplogic.com>.
> does that depend on setting up Kerberos

Careful - Kerberos and data-in-motion encryption are orthogonal. If you use
Kerberos without also setting up TLS then the payload will be in the clear.

(At least that's my understanding of how they work together.)

Bear

On Mon, Jan 6, 2020 at 9:45 PM Hariharan <ha...@gmail.com> wrote:

> For 1), you can set up transparent encryption at the root directory level
> for HDFS. However, this works at the file level, not the volume level. For
> volume-level encryption you will have to use something like LUKS.
>
> For 2), in addition to the steps mentioned in "data confidentiality", you
> may also need to set up Encrypted Shuffle
> <https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/EncryptedShuffle.html>,
> depending on your use-cases for Hadoop.
>
> Thanks,
> Hariharan

Re: Questions about Setting Up Encryption

Posted by Hariharan <ha...@gmail.com>.
For 1), you can set up transparent encryption at the root directory level
for HDFS. However, this works at the file level, not the volume level. For
volume-level encryption you will have to use something like LUKS.
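
To make the file-level versus volume-level distinction concrete: an
encryption zone is attached to a directory, and the Transparent Encryption
doc describes creating a key and then a zone on an empty directory. The
Python below is only a sketch that assembles those command lines without
running them. The key name "mykey" and the path "/secure" are made-up
placeholders, and a KMS/key provider is assumed to be configured already.

```python
import shlex


def encryption_zone_commands(key_name, zone_path):
    """Return, in order, the CLI invocations for creating one encryption zone.

    Nothing is executed here; the caller decides how (and whether) to run them.
    key_name and zone_path are illustrative placeholders, not values from the thread.
    """
    return [
        # Create the zone key in the configured key provider (KMS).
        ["hadoop", "key", "create", key_name],
        # The zone directory must exist and be empty when the zone is created.
        ["hdfs", "dfs", "-mkdir", "-p", zone_path],
        # Turn the empty directory into an encryption zone backed by that key.
        ["hdfs", "crypto", "-createZone", "-keyName", key_name, "-path", zone_path],
        # List zones to confirm the result.
        ["hdfs", "crypto", "-listZones"],
    ]


if __name__ == "__main__":
    for argv in encryption_zone_commands("mykey", "/secure"):
        print(shlex.join(argv))
```

Files written under the zone path are then encrypted individually; the
disks underneath HDFS are untouched by this, which is why LUKS remains the
tool for volume-level encryption.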

For 2), in addition to the steps mentioned in "data confidentiality", you
may also need to set up Encrypted Shuffle
<https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/EncryptedShuffle.html>,
depending on your use-cases for Hadoop.
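
For reference, the settings named in the "Data confidentiality" section of
the Secure Mode doc and in the Encrypted Shuffle doc can be collected
roughly as follows. This Python sketch only renders <property> entries; the
grouping dictionary and the function name are illustrative, the
keystore/truststore settings needed for HTTPS and shuffle are left out, and
whether the RPC and block-transfer settings take effect without Kerberos is
the open question in this thread, so treat it as a starting point rather
than a verified configuration.

```python
import xml.etree.ElementTree as ET

# Property names come from the Secure Mode and Encrypted Shuffle docs;
# the grouping by file is an illustrative layout, not a complete config.
WIRE_ENCRYPTION = {
    "core-site.xml": {
        "hadoop.rpc.protection": "privacy",      # encrypt Hadoop RPC
    },
    "hdfs-site.xml": {
        "dfs.encrypt.data.transfer": "true",     # encrypt DataNode block transfer
        "dfs.http.policy": "HTTPS_ONLY",         # serve the HDFS web endpoints over HTTPS only
    },
    "mapred-site.xml": {
        "mapreduce.shuffle.ssl.enabled": "true", # Encrypted Shuffle for MapReduce
    },
}


def render_site_xml(properties):
    """Render a dict of name/value pairs as a Hadoop <configuration> document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    for filename, props in WIRE_ENCRYPTION.items():
        print(filename)
        print(render_site_xml(props))
```

Keeping the properties in one table makes it easy to see which file each
setting belongs to before merging them into an existing configuration.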

Thanks,
Hariharan

On Tue, Jan 7, 2020 at 1:16 AM Daniel Howard <da...@toldme.com> wrote:
>
> Hello,
>
> I am working on getting Hadoop running within our organization. Our
> high-level use case is to be able to say we're running with end-to-end
> encryption. It looks like there are two major strategies for getting this
> done in Hadoop:
>
> A) HDFS Transparent Encryption:
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html
> B) Secure Mode:
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SecureMode.html
>
> In our case, we are less concerned with Kerberos' user-level
> authentication, but we want node-to-node encryption and encryption-at-rest.
> With cluster applications, I typically achieve encryption-at-rest with LUKS
> and then enable an application's TLS settings to achieve
> encryption-in-motion.
>
> What is my best strategy for Hadoop? Here are a couple of questions:
>
> 1) The docs say I have to create a new directory, but can I configure
> HDFS Transparent Encryption to operate across an entire volume?
> 2) If I just need encrypted-in-motion, can I do just the "Data
> confidentiality" part of the Secure Mode doc, or does that depend on
> setting up Kerberos?
>
> Thank You!
> -danny
>
> --
> http://dannyman.toldme.com