Posted to dev@hive.apache.org by Sergio Pena <se...@cloudera.com> on 2015/05/07 23:23:17 UTC

Review Request 33956: HIVE-9614: Encrypt mapjoin tables

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33956/
-----------------------------------------------------------

Review request for hive, Brock Noland and cheng xu.


Bugs: HIVE-9614
    https://issues.apache.org/jira/browse/HIVE-9614


Repository: hive-git


Description
-------

The security issue here is that encrypted tables used in MAP-JOIN queries, and stored in the distributed cache, are first copied to the client's local filesystem in unencrypted form in order to be compressed there.

This patch avoids the local copy if the table is encrypted on HDFS. It keeps the hash table on HDFS, compresses the table in HDFS, and then adds it to the distributed cache.

Files that are copied to the datanodes by the distributed cache are still unencrypted. This is a limitation we have from HDFS.
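
As a rough illustration of the decision described above, the following is a minimal sketch, not the code in the patch. The class MapJoinStagingSketch, the enum StagingLocation, the method chooseStaging and the predicate standing in for the encryption-zone check are all hypothetical names, and paths are plain strings rather than Hadoop Path objects so that the example has no dependencies outside the JDK.

```java
import java.util.function.Predicate;

// Hypothetical sketch only: none of these names come from the Hive code base.
final class MapJoinStagingSketch {

    /** Where the hash-table files are archived before going to the distributed cache. */
    enum StagingLocation { LOCAL_FILESYSTEM, HDFS }

    /**
     * Picks the staging location for a map-join hash table.
     *
     * @param hashTablePath path of the hash-table directory (a plain string in this sketch)
     * @param isEncrypted   stand-in for an encryption-zone check on that path (assumption)
     */
    static StagingLocation chooseStaging(String hashTablePath, Predicate<String> isEncrypted) {
        // Encrypted data should not be written in clear text to the client's local disk,
        // so it is archived and compressed where it already lives.
        if (isEncrypted.test(hashTablePath)) {
            return StagingLocation.HDFS;
        }
        // Assumed here: unencrypted tables keep the earlier behaviour of a local copy.
        return StagingLocation.LOCAL_FILESYSTEM;
    }
}
```

A caller would pass whatever encryption check it has available as the predicate; for example, a lambda such as p -> p.startsWith("/secure/") can model a cluster where everything under /secure is inside an encryption zone.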


Diffs
-----

  common/src/java/org/apache/hadoop/hive/common/CompressionUtils.java 0e0d538c2faf1c52c4d8378df013294ae4efa41c 
  common/src/java/org/apache/hive/common/util/HdfsEncryptionUtilities.java PRE-CREATION 
  itests/src/test/resources/testconfiguration.properties 3eff7d010923a4e07d5024904f1531ca52473aa2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java ad5c8f8302de2a15b1703161799f71cd81a94475 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java d7a08ecf1c183fe56b5ca41c2c69d413874418bb 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 4d84f0f76ce17711077ceadf23e6b9ed12e6a414 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java c0a72b69df3871bbcc870af286774aee5269668b 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java cbc5466261f749fe7b84d7533dc0ff3274b6777f 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredLocalWork.java 82143a64db163da766dcc138231b4d4174603470 
  ql/src/test/queries/clientpositive/encryption_map_join_select.q PRE-CREATION 
  ql/src/test/results/clientpositive/encrypted/encryption_map_join_select.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/33956/diff/


Testing
-------


Thanks,

Sergio Pena


Re: Review Request 33956: HIVE-9614: Encrypt mapjoin tables

Posted by cheng xu <ch...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33956/#review84671
-----------------------------------------------------------


Thank you for this patch. I have some questions, and I will do another round of review once they are answered. Thank you!


common/src/java/org/apache/hive/common/util/HdfsEncryptionUtilities.java
<https://reviews.apache.org/r/33956/#comment136026>

    Why not use the isPathEncrypted from HdfsEncryptionShim directly?



common/src/java/org/apache/hive/common/util/HdfsEncryptionUtilities.java
<https://reviews.apache.org/r/33956/#comment136027>

    The same as above.



ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
<https://reviews.apache.org/r/33956/#comment136025>

    Is it possible to get the FsPermission from org.apache.hadoop.fs.FileContext?



ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java
<https://reviews.apache.org/r/33956/#comment136022>

    I am a little confused here. How can a local path be converted to an HDFS path? The original code creates a tar file from a local path and uploads it to HDFS with replication information. The new code path will lose the replication information. Also, the previous code path would only be executed for the local file or pfile scheme in tests.



ql/src/test/queries/clientpositive/encryption_map_join_select.q
<https://reviews.apache.org/r/33956/#comment136021>

    drop table encryptedTable PURGE;


- cheng xu

