You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by John Sichi <js...@fb.com> on 2010/12/24 01:03:56 UTC
Review Request: HIVE-1262
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/192/
-----------------------------------------------------------
Review request for hive.
Summary
-------
Review by JVS
This addresses bug HIVE-1262.
https://issues.apache.org/jira/browse/HIVE-1262
Diffs
-----
http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 1038444
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesEncrypt.java PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCrc32.java PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMD5.java PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSha.java PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_aes.q PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_crc32.q PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_md5.q PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_sha.q PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/show_functions.q.out 1038444
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_aes.q.out PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_crc32.q.out PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_md5.q.out PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_sha.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/192/diff
Testing
-------
Thanks,
John
Re: Review Request: HIVE-1262
Posted by John Sichi <js...@fb.com>.
> On 2010-12-23 16:04:03, John Sichi wrote:
> > http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java, line 16
> > <https://reviews.apache.org/r/192/diff/1/?file=9229#file9229line16>
> >
> > Here we may be chopping off a UTF-8 multibyte sequence in the middle. Won't that lead to an invalid UTF-8?
> >
> > Also, I don't think you can use StringBuffer.append(byte []) like this. Won't that call the append(Object) method, which will dump the array address?
> >
> > Anyway, couldn't this just return byte [] since that's what we need later on anyway?
> >
> > (Aside: anywhere you really do need StringBuffer, you should be using StringBuilder instead.)
> >
>
> Edward Capriolo wrote:
> All other comments are taken. As for the case of cutting apart a UTF-8, that is ok (I think). We need 128 bits for the algorithm regardless of what encoding the source key is in we only need 128 bits of it.
Yes, as long as you stay with bytes (and don't try to go back to string) then arbitrary truncation should be fine.
> On 2010-12-23 16:04:03, John Sichi wrote:
> > http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java, line 37
> > <https://reviews.apache.org/r/192/diff/1/?file=9231#file9231line37>
> >
> > Shouldn't this be deterministic=true?
> >
>
> Edward Capriolo wrote:
> The AES algorithm can actually produce multiple intermediates that will decode to the same result. The decode might be deterministic but the encode is definitely not. I figured this was safe.
Oh, OK, in that case mark encrypt as deterministic=false, but all others as deterministic=true
> On 2010-12-23 16:04:03, John Sichi wrote:
> > http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java, line 85
> > <https://reviews.apache.org/r/192/diff/1/?file=9231#file9231line85>
> >
> > Shouldn't we throw these fatals rather than just logging?
>
> Edward Capriolo wrote:
> Agreed. What type of Exception should we throw in this case?
HiveException is fine.
- John
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/192/#review79
-----------------------------------------------------------
On 2010-12-23 16:03:56, John Sichi wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/192/
> -----------------------------------------------------------
>
> (Updated 2010-12-23 16:03:56)
>
>
> Review request for hive.
>
>
> Summary
> -------
>
> Review by JVS
>
>
> This addresses bug HIVE-1262.
> https://issues.apache.org/jira/browse/HIVE-1262
>
>
> Diffs
> -----
>
> http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 1038444
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesEncrypt.java PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCrc32.java PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMD5.java PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSha.java PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_aes.q PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_crc32.q PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_md5.q PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_sha.q PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/show_functions.q.out 1038444
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_aes.q.out PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_crc32.q.out PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_md5.q.out PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_sha.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/192/diff
>
>
> Testing
> -------
>
>
> Thanks,
>
> John
>
>
Re: Review Request: HIVE-1262
Posted by Edward Capriolo <ed...@gmail.com>.
> On 2010-12-23 16:04:03, John Sichi wrote:
> > http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java, line 37
> > <https://reviews.apache.org/r/192/diff/1/?file=9231#file9231line37>
> >
> > Shouldn't this be deterministic=true?
> >
The AES algorithm can actually produce multiple intermediates that will decode to the same result. The decode might be deterministic but the encode is definitely not. I figured this was safe.
> On 2010-12-23 16:04:03, John Sichi wrote:
> > http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java, line 85
> > <https://reviews.apache.org/r/192/diff/1/?file=9231#file9231line85>
> >
> > Shouldn't we throw these fatals rather than just logging?
Agreed. What type of Exception should we throw in this case?
> On 2010-12-23 16:04:03, John Sichi wrote:
> > http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java, line 16
> > <https://reviews.apache.org/r/192/diff/1/?file=9229#file9229line16>
> >
> > Here we may be chopping off a UTF-8 multibyte sequence in the middle. Won't that lead to an invalid UTF-8?
> >
> > Also, I don't think you can use StringBuffer.append(byte []) like this. Won't that call the append(Object) method, which will dump the array address?
> >
> > Anyway, couldn't this just return byte [] since that's what we need later on anyway?
> >
> > (Aside: anywhere you really do need StringBuffer, you should be using StringBuilder instead.)
> >
All other comments are taken. As for the case of cutting apart a UTF-8, that is ok (I think). We need 128 bits for the algorithm regardless of what encoding the source key is in we only need 128 bits of it.
- Edward
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/192/#review79
-----------------------------------------------------------
On 2010-12-23 16:03:56, John Sichi wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/192/
> -----------------------------------------------------------
>
> (Updated 2010-12-23 16:03:56)
>
>
> Review request for hive.
>
>
> Summary
> -------
>
> Review by JVS
>
>
> This addresses bug HIVE-1262.
> https://issues.apache.org/jira/browse/HIVE-1262
>
>
> Diffs
> -----
>
> http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 1038444
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesEncrypt.java PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCrc32.java PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMD5.java PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSha.java PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_aes.q PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_crc32.q PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_md5.q PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_sha.q PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/show_functions.q.out 1038444
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_aes.q.out PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_crc32.q.out PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_md5.q.out PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_sha.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/192/diff
>
>
> Testing
> -------
>
>
> Thanks,
>
> John
>
>
Re: Review Request: HIVE-1262
Posted by John Sichi <js...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/192/#review79
-----------------------------------------------------------
http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java
<https://reviews.apache.org/r/192/#comment120>
You're missing the Apache header on this file.
http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java
<https://reviews.apache.org/r/192/#comment122>
This method formats rather than generates, so it should be named accordingly.
http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java
<https://reviews.apache.org/r/192/#comment121>
Do the call to key_string.getBytes only once rather than over and over in this method.
http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java
<https://reviews.apache.org/r/192/#comment123>
You could use System.arraycopy here
http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java
<https://reviews.apache.org/r/192/#comment124>
Here we may be chopping off a UTF-8 multibyte sequence in the middle. Won't that lead to an invalid UTF-8?
Also, I don't think you can use StringBuffer.append(byte []) like this. Won't that call the append(Object) method, which will dump the array address?
Anyway, couldn't this just return byte [] since that's what we need later on anyway?
(Aside: anywhere you really do need StringBuffer, you should be using StringBuilder instead.)
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java
<https://reviews.apache.org/r/192/#comment129>
"computer?"
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java
<https://reviews.apache.org/r/192/#comment125>
Shouldn't this be deterministic=true?
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java
<https://reviews.apache.org/r/192/#comment126>
whitespace
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java
<https://reviews.apache.org/r/192/#comment127>
Shouldn't we throw these fatals rather than just logging?
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesEncrypt.java
<https://reviews.apache.org/r/192/#comment128>
Same comments as decrypt
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCrc32.java
<https://reviews.apache.org/r/192/#comment130>
Some useful variants would be UDTF (row-wise) and UDAF (over an entire table or group).
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMD5.java
<https://reviews.apache.org/r/192/#comment131>
For performance, initialize this only once (in initialize method) and then reset per row.
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSha.java
<https://reviews.apache.org/r/192/#comment132>
See MD5 comment regarding initializing only once.
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSha.java
<https://reviews.apache.org/r/192/#comment133>
This code is all the same as MD5, so might as well create a common base class and share it.
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_aes.q
<https://reviews.apache.org/r/192/#comment134>
For test determinism, add ORDER BY on queries which return multiple rows.
- John
On 2010-12-23 16:03:56, John Sichi wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/192/
> -----------------------------------------------------------
>
> (Updated 2010-12-23 16:03:56)
>
>
> Review request for hive.
>
>
> Summary
> -------
>
> Review by JVS
>
>
> This addresses bug HIVE-1262.
> https://issues.apache.org/jira/browse/HIVE-1262
>
>
> Diffs
> -----
>
> http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/common/AesUtils.java PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 1038444
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesDecrypt.java PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAesEncrypt.java PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCrc32.java PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMD5.java PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSha.java PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_aes.q PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_crc32.q PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_md5.q PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/udf_sha.q PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/show_functions.q.out 1038444
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_aes.q.out PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_crc32.q.out PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_md5.q.out PRE-CREATION
> http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/udf_sha.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/192/diff
>
>
> Testing
> -------
>
>
> Thanks,
>
> John
>
>