Posted to hdfs-user@hadoop.apache.org by Koert Kuipers <ko...@tresata.com> on 2012/01/20 22:09:02 UTC

encryption

Does anyone know of any work or ideas for encrypting data stored on HDFS?
Ideally both temporary files and final files would be encrypted. Otherwise
there would have to be a mechanism in HDFS to securely wipe temporary files,
like shred on Linux.

So far this is what I found:
https://github.com/geisbruch/HadoopCryptoCompressor

Best,
Koert
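
For reference, the shred-style wipe mentioned above can be sketched roughly as follows. This is a minimal illustration, not an HDFS feature: the function names are hypothetical, and it only applies to files on a local file system (for example, task scratch directories on the worker nodes). HDFS files cannot be overwritten in place through the HDFS API, and blocks are replicated across DataNodes, so a wipe would have to happen on each node's local disks. Even locally, journaling file systems and SSD wear leveling can keep old copies, so treat this as best effort.

```python
import os
import tempfile


def overwrite_in_place(path, passes=3, chunk_size=1 << 20):
    """Overwrite a local file's contents with random bytes, keeping its size."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))
                remaining -= n
            # Push this pass to the device before starting the next one.
            f.flush()
            os.fsync(f.fileno())


def wipe(path, passes=3):
    """Best-effort 'shred': overwrite the file, then remove it."""
    overwrite_in_place(path, passes=passes)
    os.remove(path)


if __name__ == "__main__":
    # Demo on a throwaway temporary file.
    fd, tmp_path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"sensitive intermediate data")
    wipe(tmp_path)
    print(os.path.exists(tmp_path))
```

Encrypting the temporary files in the first place avoids relying on the overwrite actually reaching every physical copy.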

Re: encryption

Posted by Mac Noland <mc...@yahoo.com>.
Right or wrong, we follow an architectural model in which data is encrypted before it reaches the storage system and decrypted after it is pulled off.

While it means some extra gear in the data center, it has given us the flexibility to use whatever storage system we want: HDFS, NAS, SAN, local disk, Oracle, MSSQL, MySQL.

We use this appliance: http://www.safenet-inc.com/products/data-protection/data-encryption-control/datasecure/

No, I don't work for SafeNet.


RE: encryption

Posted by Tim Broberg <Ti...@exar.com>.
There has been discussion of doing a "compression" codec that does encryption, but the key just gets stuffed in a config object.

Key management is the hard part if you want to "do it right."

    - Tim.
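
A rough sketch of the concern above about the key being stuffed in a config object. Everything here is hypothetical (the property names, the codec class name, and the environment-variable lookup are illustrative stand-ins, not Hadoop APIs). The point is only that a job configuration is typically serialized and shipped around with the job, so a key placed in it is readable by anyone who can read the configuration, whereas a key identifier is not sensitive by itself.

```python
import json
import os


def config_with_inline_key(key_hex):
    # Anti-pattern: the secret itself travels inside the job configuration.
    return {
        "example.codec.class": "example.CryptoCodec",  # hypothetical name
        "example.codec.key": key_hex,                  # hypothetical name
    }


def config_with_key_reference(key_id):
    # Only an identifier is stored; the key bytes live somewhere else.
    return {
        "example.codec.class": "example.CryptoCodec",  # hypothetical name
        "example.codec.key.id": key_id,                # hypothetical name
    }


def resolve_key(key_id, env=None):
    # Stand-in for a real key store. An environment mapping is used only to
    # keep the sketch self-contained; it is not a recommended key manager.
    env = os.environ if env is None else env
    value = env.get("EXAMPLE_KEY_" + key_id.upper())
    if value is None:
        raise KeyError("no key material available for id %r" % key_id)
    return bytes.fromhex(value)


if __name__ == "__main__":
    secret = "00112233445566778899aabbccddeeff"
    # Does the serialized configuration contain the secret?
    print(secret in json.dumps(config_with_inline_key(secret)))
    print(secret in json.dumps(config_with_key_reference("tmpfiles")))
```

Moving the key out of the configuration only relocates the problem: whatever resolves the identifier still has to authenticate the caller and protect the key, which is the hard part of key management.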
________________________________
From: Charles Earl [charlescearl@me.com]
Sent: Friday, January 20, 2012 1:14 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: encryption

Koert,
Slightly related, though not exactly what you asked for, is a Tahoe-LAFS plugin for Hadoop:
  http://code.google.com/p/hadoop-lafs
I believe there has been discussion on this list of an encryption codec.
C

________________________________
The information and any attached documents contained in this message
may be confidential and/or legally privileged. The message is
intended solely for the addressee(s). If you are not the intended
recipient, you are hereby notified that any use, dissemination, or
reproduction is strictly prohibited and may be unlawful. If you are
not the intended recipient, please contact the sender immediately by
return e-mail and destroy all copies of the original message.

Re: encryption

Posted by Charles Earl <ch...@me.com>.
Koert,
Slightly related, though not exactly what you asked for, is a Tahoe-LAFS plugin for Hadoop:
  http://code.google.com/p/hadoop-lafs
I believe there has been discussion on this list of an encryption codec.
C

Re: encryption

Posted by Ted Dunning <td...@maprtech.com>.
Or just people who find your disks at the second-hand shop.

http://www.wavy.com/dpp/news/military/tricare-beneficiaries'-data-stolen

On Fri, Jan 20, 2012 at 3:36 PM, Tim Broberg <Ti...@exar.com> wrote:

>  I guess the first question is the threat model: What kind of bad guy are
> you trying to keep out? Is it Ukrainian hackers? Local users, but the servers
> are locked up? Is it somebody who has physical access to the machines? Does
> the information have to be secure forever or just for a while?
>
> Once you know what you're trying to protect from, you can start thinking
> about how to protect yourself.
>
>     - Tim.

RE: encryption

Posted by Tim Broberg <Ti...@exar.com>.
Protecting against the guy who has physical access to the servers and all the time in the world is the nightmare case, because he has the keys in his possession.

That's where you start buying expensive FIPS 140 cryptomodules that keep the keys in a tight little box that self-destructs when opened.

So, you can:

 1. Ignore the problem.
 2. Just do something with the word "encryption" in it to make his job a little harder and dispel questions from above about security.
 3. Spend some very serious time and money trying to solve the problem in earnest.

I would recommend taking the lowest number you can get away with on the list.

I'm not a Hadoop expert yet, but what you found is the best approach I'm aware of to #2.

    - Tim.

________________________________
From: Koert Kuipers [koert@tresata.com]
Sent: Friday, January 20, 2012 3:55 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: encryption

Agreed.

Many forms of data must be encrypted to be stored on any system. I do not know the exact motivation(s) for that, but I do know we have to conform to it.

My assumption was that I want to protect against access to the data by someone stealing the hard drives or the servers, so physical access. It seems to me that there are better ways to protect against digital access (a firewall in front of the cluster). But then again, I do not know much about this at all, so I could be completely off.


Re: encryption

Posted by Koert Kuipers <ko...@tresata.com>.
Agreed.

Many forms of data must be encrypted to be stored on any system. I do not
know the exact motivation(s) for that, but I do know we have to conform to
it.

My assumption was that I want to protect against access to the data by
someone stealing the hard drives or the servers, so physical access. It
seems to me that there are better ways to protect against digital access
(a firewall in front of the cluster). But then again, I do not know much
about this at all, so I could be completely off.


RE: encryption

Posted by Tim Broberg <Ti...@exar.com>.
I guess the first question is the threat model: What kind of bad guy are you trying to keep out? Is it Ukrainian hackers? Local users, but the servers are locked up? Is it somebody who has physical access to the machines? Does the information have to be secure forever or just for a while?

Once you know what you're trying to protect from, you can start thinking about how to protect yourself.

    - Tim.

Re: encryption

Posted by Joshua Smith <jo...@rationalpi.com>.
Koert-

I saw a presentation at Hadoop World 2011 where some guys were co-opting
Hadoop's compression hooks to do encryption.

http://www.cloudera.com/videos/hadoop-world-2011-presentation-video-security-considerations-for-hadoop-deployments?form=complete

http://www.cloudera.com/resource/hadoop-world-2011-presentation-slides-security-considerations-for-hadoop-deployments

Josh