Posted to dev@hive.apache.org by "Edward Capriolo (JIRA)" <ji...@apache.org> on 2009/10/04 04:55:23 UTC

[jira] Created: (HIVE-867) Add add UDFs found in mysq

Add add UDFs found in mysq
--------------------------

                 Key: HIVE-867
                 URL: https://issues.apache.org/jira/browse/HIVE-867
             Project: Hadoop Hive
          Issue Type: New Feature
            Reporter: Edward Capriolo
            Assignee: Edward Capriolo


Some UDFs that MySQL has and Hive does not:
atan
aes_decrypt
aes_encrypt
bit_and
bit_count
bit_length
bit_or
bit_xor

char_length
char
character_length
collation
compress

crc32
encode
encrypt
format
greatest

in
inet_aton
inet_ntoa
match

md5
oct
ord
pi
radians
sha1, sha
sign
sleep
truncate

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-867) Add add UDFs found in mysq

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764707#action_12764707 ] 

Namit Jain commented on HIVE-867:
---------------------------------

Some minor comments:

1. Can you add a new follow-up JIRA that contains the list of remaining functions?
2. Can you add describe and describe extended for all new functions? You can do them in the individual new tests that
     you added or in show_functions.q.
3. When you run all tests, you should see some diffs for 'show_functions'; you should update the .out files.



[jira] Commented: (HIVE-867) Add add UDFs found in mysql

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774175#action_12774175 ] 

Edward Capriolo commented on HIVE-867:
--------------------------------------

@Zheng, 
I implemented UDFLeft and UDFRight the same way as substring. Wouldn't we want them to work the same way?



[jira] Updated: (HIVE-867) Add add UDFs found in mysq

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Capriolo updated HIVE-867:
---------------------------------

    Attachment: hive-867-7.diff

Please review



[jira] Commented: (HIVE-867) Add add UDFs found in mysql

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778448#action_12778448 ] 

Namit Jain commented on HIVE-867:
---------------------------------


@Edward, other than Zheng's comments, some more comments:
The tests are running right now, so I may have more comments based on the results.

1. udf_least.q.out is missing from the patch.
2. UDFRadians has the annotation in the wrong place, and the test output udf_radians.q.out is wrong for describe and describe extended.
3. System.out.println is present in UDFLeast and _aton.
4. _ntoa: remove the last comment.
5. Wrong comments at the beginning of udf_left.q and udf_left.q.out.
6. _aton: Instead of converting the text to a String and then splitting it, can we use some Text API to parse the IP address?
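
As an illustration of point 6, here is a minimal sketch of parsing a dotted-quad address straight from its UTF-8 bytes, with no intermediate String and no split(). It is not the code from the patch: the class name InetAtonSketch and the method parse are hypothetical, and the sketch only accepts the full a.b.c.d form (MySQL's inet_aton also accepts short forms such as '127.1'). In a real UDF the byte array and length would presumably come from Text.getBytes() and Text.getLength().

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the code from the patch: parses a dotted-quad
// IPv4 address directly from its UTF-8 bytes, with no String or split().
final class InetAtonSketch {

  // Returns the address as an unsigned 32-bit value held in a long, or -1
  // if bytes[0..length) is not of the form a.b.c.d with each part in 0..255.
  static long parse(byte[] bytes, int length) {
    long result = 0;
    int octet = 0;   // value of the part currently being read
    int digits = 0;  // digits seen in the current part
    int parts = 0;   // parts completed so far
    for (int i = 0; i < length; i++) {
      byte b = bytes[i];
      if (b >= '0' && b <= '9') {
        octet = octet * 10 + (b - '0');
        digits++;
        if (digits > 3 || octet > 255) {
          return -1;
        }
      } else if (b == '.') {
        if (digits == 0 || parts == 3) {
          return -1;
        }
        result = (result << 8) | octet;
        parts++;
        octet = 0;
        digits = 0;
      } else {
        return -1;
      }
    }
    if (parts != 3 || digits == 0) {
      return -1;
    }
    return (result << 8) | octet;
  }

  public static void main(String[] args) {
    byte[] ip = "10.0.5.9".getBytes(StandardCharsets.UTF_8);
    System.out.println(parse(ip, ip.length));
  }
}
```

Returning -1 for malformed input is a choice made for this sketch only; a UDF would more likely return null.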







[jira] Commented: (HIVE-867) Add add UDFs found in mysql

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778578#action_12778578 ] 

Namit Jain commented on HIVE-867:
---------------------------------

As a policy, we have refrained from committing our own patches - since it may lead to extra changes by mistake.
But, the steps you mentioned are correct - we can go over them offline in detail, but please pick someone else's patch for committing.



[jira] Commented: (HIVE-867) Add add UDFs found in mysql

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777087#action_12777087 ] 

Namit Jain commented on HIVE-867:
---------------------------------

@Edward, Zheng is on vacation - I will take a look at this soon



[jira] Updated: (HIVE-867) Add add UDFs found in mysq

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Capriolo updated HIVE-867:
---------------------------------

    Attachment: hive-867-3.diff

Still adding more functions. Now supporting:

{noformat}
A      ql/src/test/results/clientpositive/udf_left.q.out
A      ql/src/test/results/clientpositive/udf_crc32.q.out
A      ql/src/test/results/clientpositive/udf_sign.q.out
A      ql/src/test/results/clientpositive/udf_radians.q.out
A      ql/src/test/results/clientpositive/udf_right.q.out
A      ql/src/test/results/clientpositive/udf_degrees.q.out
A      ql/src/test/results/clientpositive/udf_PI.q.out
A      ql/src/test/results/clientpositive/udf_sha.q.out
A      ql/src/test/queries/clientpositive/udf_crc32.q
A      ql/src/test/queries/clientpositive/udf_degrees.q
A      ql/src/test/queries/clientpositive/udf_radians.q
A      ql/src/test/queries/clientpositive/udf_sign.q
A      ql/src/test/queries/clientpositive/udf_sha.q
A      ql/src/test/queries/clientpositive/udf_E.q
A      ql/src/test/queries/clientpositive/udf_right.q
A      ql/src/test/queries/clientpositive/udf_md5.q
A      ql/src/test/queries/clientpositive/udf_aes.q
A      ql/src/test/queries/clientpositive/udf_left.q
A      ql/src/test/queries/clientpositive/udf_PI.q
{noformat}



[jira] Commented: (HIVE-867) Add add UDFs found in mysq

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767678#action_12767678 ] 

Zheng Shao commented on HIVE-867:
---------------------------------

If you almost always need a String parameter, we can just use "String" as the type of the parameter in the UDF definition.
If you almost always need to return a String, we can also just return "String".

So for UDFLeft and UDFRight, we can do:
{code}
  public String evaluate(String s, IntWritable r);
{code}
instead of
{code}
  public Text evaluate(Text s, IntWritable r);
{code}

This will save a lot of conversions if a user does "left(right(col, 10), 3)".

This is the same for the SerDe - for example, RegexSerDe returns "String" instead of "Text", so "left(col, 3)" where col is from a RegexSerDe table does not need a conversion from "String" -> "Text" to pass to the Left function, and then "Text" -> "String" inside the left function.

Of course, the most efficient way is to do the char counting without UTF-8 encoding/decoding (then we would still prefer Text, because we don't need to create new objects), but I think we can do that later unless you want to do it now.
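
To make the String-based variant concrete, here is a small sketch written against plain java.lang.String so that it compiles without Hive on the classpath. The class LeftRightStringSketch and its methods are hypothetical stand-ins for UDFLeft and UDFRight, with Integer used in place of IntWritable. Note that String.substring counts UTF-16 code units, which is an assumption about the intended semantics rather than something stated in this thread.

```java
// Hypothetical stand-ins for UDFLeft/UDFRight, written against plain
// java.lang.String so they compile without Hive on the classpath.
final class LeftRightStringSketch {

  // Leftmost n characters of s; null in, null out.
  static String left(String s, Integer n) {
    if (s == null || n == null) {
      return null;
    }
    if (n <= 0) {
      return "";
    }
    return n >= s.length() ? s : s.substring(0, n);
  }

  // Rightmost n characters of s; null in, null out.
  static String right(String s, Integer n) {
    if (s == null || n == null) {
      return null;
    }
    if (n <= 0) {
      return "";
    }
    return n >= s.length() ? s : s.substring(s.length() - n);
  }

  public static void main(String[] args) {
    // The nested call from the comment above: no Text <-> String
    // conversion happens between the two functions.
    System.out.println(left(right("hello world", 5), 3));
  }
}
```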



[jira] Updated: (HIVE-867) Add add UDFs found in mysq

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Capriolo updated HIVE-867:
---------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (HIVE-867) Add add UDFs found in mysql

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829491#action_12829491 ] 

Zheng Shao commented on HIVE-867:
---------------------------------

Hi Edward, there are a bunch of small conflicts with trunk now. Can you regenerate the patch?



[jira] Commented: (HIVE-867) Add add UDFs found in mysql

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846982#action_12846982 ] 

Edward Capriolo commented on HIVE-867:
--------------------------------------

I am going to break these up into several issues: encryption, constants, string, etc. This should avoid problems with one UDF holding all the others back.



[jira] Commented: (HIVE-867) Add add UDFs found in mysq

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767819#action_12767819 ] 

Edward Capriolo commented on HIVE-867:
--------------------------------------

@Zheng,

Thank you for your comments. I would like to do them the optimal way on the first pass. There is no rush here, and reopening JIRA issues and re-running tests takes more time than a few rounds of reviews.

Let's do it the most efficient way. If I understand you correctly, it should be....

{noformat}
 public Text evaluate(Text s, IntWritable r) {
+
+    if (s == null || r == null) {
+      return null;
+    }
+    
+    ////String data = s.toString();  //<-get rid of this
+    
+    if (r.get()>=data.length()){
+     //// result.set(s); //<--get rid of this
       arrayCopyHere(data, result);  //or 
      return s;
 
+    } else {
+     /// result.set( data.substring(0, r.get()) );   
          arrayCopyHere();
+    }
+    
+    return result;
+  }
+}

{noformat}

Is that right? Is working directly with the byte array and the private member the most efficient mechanism?



[jira] Updated: (HIVE-867) Add add UDFs found in mysql

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-867:
----------------------------

    Status: Open  (was: Patch Available)



[jira] Commented: (HIVE-867) Add add UDFs found in mysq

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772907#action_12772907 ] 

Zheng Shao commented on HIVE-867:
---------------------------------

@Edward, sorry for the delay on this.
The most efficient way would be to reuse a private member of type Text, and count the UTF-8 characters for left and right.
For details on UTF-8 char counting, see http://en.wikipedia.org/wiki/UTF-8
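
A minimal sketch of the character counting described above, assuming that "character" means one UTF-8 encoded code point. The class Utf8LeftSketch and its methods are hypothetical and are not taken from the patch. In a real UDF the input would presumably come from Text.getBytes() and Text.getLength(), and the result would be written into a reused private Text with set(bytes, 0, len) instead of allocating a new array.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Hive: finds how many bytes of a UTF-8
// buffer hold the first n characters, without decoding to a String.
final class Utf8LeftSketch {

  // Returns the byte length of the first `chars` UTF-8 characters in
  // bytes[0..length). Continuation bytes have the bit pattern 10xxxxxx,
  // so every byte that does not match it starts a new character.
  static int prefixByteLength(byte[] bytes, int length, int chars) {
    if (chars <= 0) {
      return 0;
    }
    int seen = 0;
    for (int i = 0; i < length; i++) {
      boolean startsChar = (bytes[i] & 0xC0) != 0x80;
      if (startsChar) {
        if (seen == chars) {
          return i; // byte i begins the character after the prefix
        }
        seen++;
      }
    }
    return length; // fewer than `chars` characters: keep everything
  }

  // Copies the prefix; a UDF would write into a reused Text instead.
  static byte[] left(byte[] utf8, int chars) {
    return Arrays.copyOf(utf8, prefixByteLength(utf8, utf8.length, chars));
  }

  public static void main(String[] args) {
    // "h\u00e9llo" is 5 characters but 6 bytes, since U+00E9 takes 2 bytes.
    byte[] in = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
    System.out.println(prefixByteLength(in, in.length, 2));
  }
}
```

The same scan run from the end of the buffer would give the starting offset for right.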





[jira] Updated: (HIVE-867) Add add UDFs found in mysq

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Capriolo updated HIVE-867:
---------------------------------

    Attachment: hive-867-10.diff

{noformat}
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSign.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFPI.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFDegrees.java
M      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSubstr.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAtan.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRadians.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLeast.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSHA.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAes_encrypt.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAes_decrypt.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCrc32.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMD5.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/generic/AesUtils.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRight.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFInet_ntoa.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFTan.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFE.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSleep.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFInet_aton.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLeft.java
{noformat}



[jira] Updated: (HIVE-867) Add add UDFs found in mysq

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Capriolo updated HIVE-867:
---------------------------------

    Attachment: hive-867-1.diff

Not complete, but someone might want to take a look and make comments.



[jira] Commented: (HIVE-867) Add add UDFs found in mysql

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778560#action_12778560 ] 

Edward Capriolo commented on HIVE-867:
--------------------------------------

Namit,
@1-5. I will take a look at these things.
I will need to regenerate the patch again anyway, as other UDFs have been added since this one.
As to #6, so we are assuming encoding is not an issue here?

Namit, I just got my Apache id! So when we get to a +1 status I would like to do the commit on this one. Maybe I can contact you offline and we can walk through the steps and any specific things you guys do before a commit. It seems like: svn update, patch, ant test, update changes.txt, svn commit.



[jira] Commented: (HIVE-867) Add add UDFs found in mysq

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765106#action_12765106 ] 

Edward Capriolo commented on HIVE-867:
--------------------------------------

I can use some advice on something. I have been working on the aes_encrypt and the aes_decrypt functions.

{noformat}
import java.security.*;
import javax.crypto.*;
import javax.crypto.spec.*;
import java.io.*;

public class a {

  public static void main(String[] args) throws Exception {
    String data_string = "Hello";
    String key_string = "123";
    byte[] encrypted = null;
    byte[] decrypted = null;

    // Pad the key with NUL characters up to 16 bytes (a 128-bit AES key).
    StringBuffer buffer = new StringBuffer();
    buffer.append(key_string);
    for (int i = key_string.getBytes("UTF-8").length; i < 16; i++) {
      buffer.append('\0');
    }
    Cipher cipher = Cipher.getInstance("AES");
    SecretKeySpec skeySpec = new SecretKeySpec(buffer.toString().getBytes("UTF-8"), "AES");

    cipher.init(Cipher.ENCRYPT_MODE, skeySpec);
    encrypted = cipher.doFinal(data_string.getBytes());
    // Prints the array reference (e.g. [B@...), not the cipher bytes.
    System.out.println("en " + encrypted);

    cipher.init(Cipher.DECRYPT_MODE, skeySpec);
    decrypted = cipher.doFinal(encrypted);
    System.out.println("de " + decrypted);
    System.out.println("de2 " + new String(decrypted));
  }
}
{noformat}

I have this working in a standalone program, but I am having some issues getting it to work as a UDF.

{noformat}
select aes_decrypt(aes_encrypt('yo','123'), '123') FROM src LIMIT 1;
{noformat}

Should return 'yo'. I added some debug output:
{noformat}
  [junit] plan = /tmp/plan9309.xml
    [junit] en [B@1b22920
    [junit] en l16
    [junit] en [B@1aa2c23
    [junit] en l16
    [junit] Classclass java.lang.String
    [junit] en [B@1700391
    [junit] en l16
    [junit] len23
    [junit] data_string�        s�˴�v�-�
    [junit] key123
{noformat}

As you can see from my debug output, the string seems to be "changing length" between aes_encrypt and aes_decrypt: 16 -> 23. Is this a serialization conversion thing?

Can I return byte[] rather than String? Any ideas would be VERY appreciated!
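
One plausible explanation, which is an assumption and is not confirmed anywhere in this thread: if the 16 cipher bytes are wrapped in a String (or a UTF-8 Text) between the two calls, byte sequences that are not valid UTF-8 get replaced during decoding, and re-encoding then produces a different, usually longer, byte array. The sketch below shows that effect on a fixed byte array and contrasts it with a text-safe encoding. The class BinaryRoundTripSketch is hypothetical, and java.util.Base64 requires Java 8 or later; hex encoding would serve the same purpose.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demonstration, not Hive code: arbitrary bytes do not
// survive a round trip through a UTF-8 String.
final class BinaryRoundTripSketch {

  // Lossy: bytes that are not valid UTF-8 are replaced while decoding.
  static byte[] viaString(byte[] raw) {
    return new String(raw, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
  }

  // Lossless: Base64 text contains only ASCII characters.
  static byte[] viaBase64(byte[] raw) {
    String text = Base64.getEncoder().encodeToString(raw);
    return Base64.getDecoder().decode(text);
  }

  public static void main(String[] args) {
    // 0xFF and 0xFE can never appear in well-formed UTF-8.
    byte[] raw = {(byte) 0xFF, (byte) 0xFE, 0x41};
    System.out.println(raw.length + " -> " + viaString(raw).length);
    System.out.println(Arrays.equals(raw, viaBase64(raw)));
  }
}
```

If this is indeed the cause, either returning the raw bytes or returning an encoded (hex or Base64) string from aes_encrypt would avoid the length change.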




[jira] Updated: (HIVE-867) Add add UDFs found in mysql

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-867:
----------------------------

    Summary: Add add UDFs found in mysql  (was: Add add UDFs found in mysq)



[jira] Updated: (HIVE-867) Add add UDFs found in mysq

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Capriolo updated HIVE-867:
---------------------------------

    Attachment: hive-867-2.diff

Thus far added:

{noformat}
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSign.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFPI.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFDegrees.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRadians.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSHA.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCrc32.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMD5.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRight.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFTan.java
A      ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLeft.java
{noformat}
