You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by John Sichi <js...@fb.com> on 2010/12/24 01:03:56 UTC

Review Request: HIVE-1262

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/192/
-----------------------------------------------------------

Review request for hive.


Summary
-------

Review by JVS


This addresses bug HIVE-1262.
    https://issues.apache.org/jira/browse/HIVE-1262


Diffs
-----

  http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java PRE-CREATION 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 1038444 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java PRE-CREATION 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesEncrypt.java PRE-CREATION 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCrc32.java PRE-CREATION 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMD5.java PRE-CREATION 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSha.java PRE-CREATION 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_aes.q PRE-CREATION 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_crc32.q PRE-CREATION 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_md5.q PRE-CREATION 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_sha.q PRE-CREATION 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/show_functions.q.out 1038444 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_aes.q.out PRE-CREATION 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_crc32.q.out PRE-CREATION 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_md5.q.out PRE-CREATION 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_sha.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/192/diff


Testing
-------


Thanks,

John


Re: Review Request: HIVE-1262

Posted by John Sichi <js...@fb.com>.

> On 2010-12-23 16:04:03, John Sichi wrote:
> > http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java, line 16
> > <https://reviews.apache.org/r/192/diff/1/?file=9229#file9229line16>
> >
> >     Here we may be chopping off a UTF-8 multibyte sequence in the middle.  Won't that lead to an invalid UTF-8?
> >     
> >     Also, I don't think you can use StringBuffer.append(byte []) like this.  Won't that call the append(Object) method, which will dump the array address?
> >     
> >     Anyway, couldn't this just return byte [] since that's what we need later on anyway?
> >     
> >     (Aside:  anywhere you really do need StringBuffer, you should be using StringBuilder instead.)
> >
> 
> Edward  Capriolo wrote:
>     All other comments are taken. As for the case of cutting apart a UTF-8, that is ok (I think). We need 128 bits for the algorithm regardless of what encoding the source key is in we only need 128 bits of it.

Yes, as long as you stay with bytes (and don't try to go back to string) then arbitrary truncation should be fine.


> On 2010-12-23 16:04:03, John Sichi wrote:
> > http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java, line 37
> > <https://reviews.apache.org/r/192/diff/1/?file=9231#file9231line37>
> >
> >     Shouldn't this be deterministic=true?
> >
> 
> Edward  Capriolo wrote:
>     The AES algorithm can actually produce multiple intermediates that will decode to the same result. The decode might be deterministic but the encode is definitely not. I figured this was safe.

Oh, OK, in that case mark encrypt as deterministic=false, but all others as deterministic=true


> On 2010-12-23 16:04:03, John Sichi wrote:
> > http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java, line 85
> > <https://reviews.apache.org/r/192/diff/1/?file=9231#file9231line85>
> >
> >     Shouldn't we throw these fatals rather than just logging?
> 
> Edward  Capriolo wrote:
>     Agreed. What type of Exception should we throw in this case?

HiveException is fine.


- John


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/192/#review79
-----------------------------------------------------------


On 2010-12-23 16:03:56, John Sichi wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/192/
> -----------------------------------------------------------
> 
> (Updated 2010-12-23 16:03:56)
> 
> 
> Review request for hive.
> 
> 
> Summary
> -------
> 
> Review by JVS
> 
> 
> This addresses bug HIVE-1262.
>     https://issues.apache.org/jira/browse/HIVE-1262
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 1038444 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesEncrypt.java PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCrc32.java PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMD5.java PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSha.java PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_aes.q PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_crc32.q PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_md5.q PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_sha.q PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/show_functions.q.out 1038444 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_aes.q.out PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_crc32.q.out PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_md5.q.out PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_sha.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/192/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> John
> 
>


Re: Review Request: HIVE-1262

Posted by Edward Capriolo <ed...@gmail.com>.

> On 2010-12-23 16:04:03, John Sichi wrote:
> > http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java, line 37
> > <https://reviews.apache.org/r/192/diff/1/?file=9231#file9231line37>
> >
> >     Shouldn't this be deterministic=true?
> >

The AES algorithm can actually produce multiple intermediates that will decode to the same result. The decode might be deterministic but the encode is definitely not. I figured this was safe.


> On 2010-12-23 16:04:03, John Sichi wrote:
> > http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java, line 85
> > <https://reviews.apache.org/r/192/diff/1/?file=9231#file9231line85>
> >
> >     Shouldn't we throw these fatals rather than just logging?

Agreed. What type of Exception should we throw in this case?


> On 2010-12-23 16:04:03, John Sichi wrote:
> > http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java, line 16
> > <https://reviews.apache.org/r/192/diff/1/?file=9229#file9229line16>
> >
> >     Here we may be chopping off a UTF-8 multibyte sequence in the middle.  Won't that lead to an invalid UTF-8?
> >     
> >     Also, I don't think you can use StringBuffer.append(byte []) like this.  Won't that call the append(Object) method, which will dump the array address?
> >     
> >     Anyway, couldn't this just return byte [] since that's what we need later on anyway?
> >     
> >     (Aside:  anywhere you really do need StringBuffer, you should be using StringBuilder instead.)
> >

All other comments are taken. As for the case of cutting apart a UTF-8, that is ok (I think). We need 128 bits for the algorithm regardless of what encoding the source key is in we only need 128 bits of it.


- Edward


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/192/#review79
-----------------------------------------------------------


On 2010-12-23 16:03:56, John Sichi wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/192/
> -----------------------------------------------------------
> 
> (Updated 2010-12-23 16:03:56)
> 
> 
> Review request for hive.
> 
> 
> Summary
> -------
> 
> Review by JVS
> 
> 
> This addresses bug HIVE-1262.
>     https://issues.apache.org/jira/browse/HIVE-1262
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 1038444 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesEncrypt.java PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCrc32.java PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMD5.java PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSha.java PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_aes.q PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_crc32.q PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_md5.q PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_sha.q PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/show_functions.q.out 1038444 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_aes.q.out PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_crc32.q.out PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_md5.q.out PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_sha.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/192/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> John
> 
>


Re: Review Request: HIVE-1262

Posted by John Sichi <js...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/192/#review79
-----------------------------------------------------------



http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java
<https://reviews.apache.org/r/192/#comment120>

    You're missing the Apache header on this file.



http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java
<https://reviews.apache.org/r/192/#comment122>

    This method formats rather than generates, so it should be named accordingly.



http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java
<https://reviews.apache.org/r/192/#comment121>

    Do the call to key_string.getBytes only once rather than over and over in this method.



http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java
<https://reviews.apache.org/r/192/#comment123>

    You could use System.arraycopy here



http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java
<https://reviews.apache.org/r/192/#comment124>

    Here we may be chopping off a UTF-8 multibyte sequence in the middle.  Won't that lead to an invalid UTF-8?
    
    Also, I don't think you can use StringBuffer.append(byte []) like this.  Won't that call the append(Object) method, which will dump the array address?
    
    Anyway, couldn't this just return byte [] since that's what we need later on anyway?
    
    (Aside:  anywhere you really do need StringBuffer, you should be using StringBuilder instead.)
    



http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java
<https://reviews.apache.org/r/192/#comment129>

    "computer?"



http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java
<https://reviews.apache.org/r/192/#comment125>

    Shouldn't this be deterministic=true?
    



http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java
<https://reviews.apache.org/r/192/#comment126>

    whitespace
    



http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java
<https://reviews.apache.org/r/192/#comment127>

    Shouldn't we throw these fatals rather than just logging?



http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesEncrypt.java
<https://reviews.apache.org/r/192/#comment128>

    Same comments as decrypt



http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCrc32.java
<https://reviews.apache.org/r/192/#comment130>

    Some useful variants would be UDTF (row-wise) and UDAF (over an entire table or group).



http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMD5.java
<https://reviews.apache.org/r/192/#comment131>

    For performance, initialize this only once (in initialize method) and then reset per row.



http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSha.java
<https://reviews.apache.org/r/192/#comment132>

    See MD5 comment regarding initializing only once.



http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSha.java
<https://reviews.apache.org/r/192/#comment133>

    This code is all the same as MD5, so might as well create a common base class and share it.



http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_aes.q
<https://reviews.apache.org/r/192/#comment134>

    For test determinism, add ORDER BY on queries which return multiple rows.


- John


On 2010-12-23 16:03:56, John Sichi wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/192/
> -----------------------------------------------------------
> 
> (Updated 2010-12-23 16:03:56)
> 
> 
> Review request for hive.
> 
> 
> Summary
> -------
> 
> Review by JVS
> 
> 
> This addresses bug HIVE-1262.
>     https://issues.apache.org/jira/browse/HIVE-1262
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 1038444 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesEncrypt.java PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCrc32.java PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMD5.java PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSha.java PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_aes.q PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_crc32.q PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_md5.q PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_sha.q PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/show_functions.q.out 1038444 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_aes.q.out PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_crc32.q.out PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_md5.q.out PRE-CREATION 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_sha.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/192/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> John
> 
>