Posted to user@spark.apache.org by Matthew O'Reilly <mo...@qub.ac.uk> on 2015/07/31 10:17:27 UTC

Encryption on RDDs or in-memory/cache on Apache Spark

Hi, 

I am currently working on the latest version of Apache Spark (1.4.1), pre-built package for Hadoop 2.6+.

Is there any feature in Spark/Hadoop to encrypt RDDs or the in-memory cache (something similar to Altibase's HDB: http://altibase.com/in-memory-database-computing-solutions/security/) when running applications in Spark? Or is there an external library/framework that could be used to encrypt RDDs or the in-memory cache in Spark?

I discovered that it is possible to encrypt the data and then encapsulate it in an RDD. However, I feel this undermines Spark's fast data processing, because encrypting the data first and then encapsulating it in an RDD is slower; it becomes a two-step process. Encryption and storage of the data should be done in parallel.

Any help would be appreciated.

Many thanks,
Matthew


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Encryption on RDDs or in-memory/cache on Apache Spark

Posted by Jörn Franke <jo...@gmail.com>.
I think your use case can already be implemented with HDFS encryption and/or
SealedObject, if you are looking for something like Altibase.
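
As a rough illustration of the SealedObject route, here is a minimal sketch in plain Java using only javax.crypto. The class SealedRecords, the record type Reading, and the helpers newKey, seal and unseal are hypothetical names made up for this example; nothing here is a Spark API. It assumes AES in CBC mode with a provider-generated IV, and it leaves out how the key reaches the executors, which would have to be solved separately. In a Spark job, the idea would be to call something like seal on each record before caching, so that the cached RDD holds SealedObject instances rather than plaintext records.

```java
import java.io.IOException;
import java.io.Serializable;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SealedObject;
import javax.crypto.SecretKey;

// Hypothetical helper class for this sketch; not part of Spark or Hadoop.
final class SealedRecords {

    // Hypothetical record type standing in for one element of an RDD.
    static final class Reading implements Serializable {
        private static final long serialVersionUID = 1L;
        final String sensor;
        final double value;

        Reading(String sensor, double value) {
            this.sensor = sensor;
            this.value = value;
        }
    }

    // Generates a fresh 128-bit AES key (key distribution is out of scope here).
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES key generation failed", e);
        }
    }

    // Serializes and encrypts one record; the result is itself Serializable.
    static SealedObject seal(Serializable record, SecretKey key) {
        try {
            // Assumption: AES/CBC with PKCS5 padding; the provider picks a random IV,
            // and SealedObject stores the cipher parameters alongside the ciphertext.
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key);
            return new SealedObject(record, cipher);
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException("Sealing failed", e);
        }
    }

    // Decrypts and deserializes a sealed record with the same key.
    static Object unseal(SealedObject sealed, SecretKey key) {
        try {
            return sealed.getObject(key);
        } catch (GeneralSecurityException | IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Unsealing failed", e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        SealedObject sealed = seal(new Reading("sensor-1", 21.5), key);
        Reading restored = (Reading) unseal(sealed, key);
        System.out.println(restored.sensor + " " + restored.value);
    }
}
```

Note that this still encrypts record by record, so the cost of encryption is paid inside each task; it does not make encryption free, it only moves it into the parallel part of the job. CBC on its own also gives no integrity protection, so an authenticated mode may be preferable where the runtime supports it.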

If you create a JIRA, you may want to set the bar a little higher and
propose something like MIT CryptDB: https://css.csail.mit.edu/cryptdb/


Re: Encryption on RDDs or in-memory/cache on Apache Spark

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
Currently RDDs are not encrypted. I think you can go ahead and open a JIRA
to request this feature; it may be added in a future release.

Thanks
Best Regards
