Posted to dev@hive.apache.org by "Zheng Shao (JIRA)" <ji...@apache.org> on 2009/06/04 10:27:08 UTC
[jira] Created: (HIVE-541) Implement UDFs: INSTR and LOCATE
Implement UDFs: INSTR and LOCATE
--------------------------------
Key: HIVE-541
URL: https://issues.apache.org/jira/browse/HIVE-541
Project: Hadoop Hive
Issue Type: New Feature
Affects Versions: 0.4.0
Reporter: Zheng Shao
Assignee: Zheng Shao
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_instr
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_locate
These functions can be directly implemented with Text (instead of String). This will make the test of whether one string contains another string much faster.
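For reference, the behaviour described in the two MySQL pages linked above can be sketched on plain Java Strings. This is only an illustration of the expected results, not the Hive implementation: the class name StringPositions is hypothetical, NULL handling is left out, matching is case-sensitive, and positions count UTF-16 code units rather than characters. Returning 0 for a start position below 1 is an assumption of this sketch.

```java
final class StringPositions {
    private StringPositions() {}

    /** INSTR(str, substr): 1-based position of the first occurrence, 0 if absent. */
    static int instr(String str, String substr) {
        // indexOf returns -1 when substr is absent, so adding 1 yields 0.
        return str.indexOf(substr) + 1;
    }

    /** LOCATE(substr, str, pos): arguments swapped relative to INSTR; search starts at 1-based pos. */
    static int locate(String substr, String str, int pos) {
        if (pos < 1) {
            return 0; // assumption in this sketch: an invalid start position finds nothing
        }
        return str.indexOf(substr, pos - 1) + 1;
    }

    /** LOCATE(substr, str): two-argument form, searching from the first character. */
    static int locate(String substr, String str) {
        return locate(substr, str, 1);
    }
}
```

With these definitions, instr("foobarbar", "bar") gives 4 and locate("bar", "foobarbar", 5) gives 7, matching the examples in the MySQL manual.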
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-541) Implement UDFs: INSTR and LOCATE
Posted by "Min Zhou (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731858#action_12731858 ]
Min Zhou commented on HIVE-541:
-------------------------------
All test cases passed on my side. How about yours?
[jira] Commented: (HIVE-541) Implement UDFs: INSTR and LOCATE
Posted by "Min Zhou (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731764#action_12731764 ]
Min Zhou commented on HIVE-541:
-------------------------------
Hmm, that may be a good approach. I will try it soon.
[jira] Commented: (HIVE-541) Implement UDFs: INSTR and LOCATE
Posted by "Min Zhou (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731363#action_12731363 ]
Min Zhou commented on HIVE-541:
-------------------------------
Text.find(String) would not be faster: the String argument is encoded internally by Text, which costs about as much as Text.toString(), which decodes the Text.
[jira] Updated: (HIVE-541) Implement UDFs: INSTR and LOCATE
Posted by "Min Zhou (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Min Zhou updated HIVE-541:
--------------------------
Attachment: HIVE-541.2.patch
Added GenericUDFUtils.findText(), which avoids string encoding and decoding and should therefore execute faster.
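One detail that comes up when searching without decoding: a search over raw UTF-8 bytes yields a byte offset, while INSTR and LOCATE report a 1-based character position. The sketch below shows one way that conversion could be done without building a String. It is not the code from the attached patch; the class and method names are hypothetical, and the input is assumed to be valid UTF-8.

```java
final class Utf8Positions {
    private Utf8Positions() {}

    /**
     * Converts a byte offset into valid UTF-8 data to a 1-based character
     * (code point) position. A negative offset, meaning "not found", maps to 0.
     */
    static int toCharPosition(byte[] utf8, int byteOffset) {
        if (byteOffset < 0) {
            return 0;
        }
        int chars = 0;
        for (int i = 0; i < byteOffset; i++) {
            // Continuation bytes have the bit pattern 10xxxxxx; every other
            // byte starts a new character, so only those are counted.
            if ((utf8[i] & 0xC0) != 0x80) {
                chars++;
            }
        }
        return chars + 1;
    }
}
```

For pure ASCII data the result is simply byteOffset + 1. The loop only matters when multi-byte characters precede the match.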
[jira] Commented: (HIVE-541) Implement UDFs: INSTR and LOCATE
Posted by "Yuntao Jia (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731582#action_12731582 ]
Yuntao Jia commented on HIVE-541:
---------------------------------
The patch now uses String.indexOf(String) to find the position of one text inside another. What about writing our own function, such as
int find(Text text, Text subtext);
It would no longer require converting Texts to Strings. Would that be even faster?
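A minimal sketch of what such a function might look like, written against plain byte arrays so that it is self-contained. The names ByteFind, find and utf8 are hypothetical and this is not the code from any attached patch. It relies on the fact that UTF-8 is self-synchronizing: a valid UTF-8 needle can only match inside a valid UTF-8 haystack at a character boundary, so comparing raw bytes is safe.

```java
import java.nio.charset.StandardCharsets;

final class ByteFind {
    private ByteFind() {}

    /**
     * Returns the byte offset of the first occurrence of needle in haystack
     * at or after byte offset start, or -1 if there is none.
     */
    static int find(byte[] haystack, byte[] needle, int start) {
        if (start < 0) {
            start = 0;
        }
        int last = haystack.length - needle.length; // last offset where a match can begin
        for (int i = start; i <= last; i++) {
            int j = 0;
            while (j < needle.length && haystack[i + j] == needle[j]) {
                j++;
            }
            if (j == needle.length) {
                return i;
            }
        }
        return -1;
    }

    /** Helper for the examples: encodes a String as UTF-8 bytes. */
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

With Hadoop's Text, the inputs would presumably come from getBytes() bounded by getLength(), since the backing array can be longer than the valid data. The result is a byte offset, not a character position.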
[jira] Updated: (HIVE-541) Implement UDFs: INSTR and LOCATE
Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ashish Thusoo updated HIVE-541:
-------------------------------
Assignee: Min Zhou (was: Zheng Shao)
Assigning to Min as he has submitted the patch.
[jira] Updated: (HIVE-541) Implement UDFs: INSTR and LOCATE
Posted by "Min Zhou (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Min Zhou updated HIVE-541:
--------------------------
Attachment: HIVE-541.1.patch
patch
[jira] Resolved: (HIVE-541) Implement UDFs: INSTR and LOCATE
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain resolved HIVE-541.
-----------------------------
Resolution: Fixed
Hadoop Flags: [Reviewed]
Committed. Thanks, Min.
[jira] Commented: (HIVE-541) Implement UDFs: INSTR and LOCATE
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731812#action_12731812 ]
Namit Jain commented on HIVE-541:
---------------------------------
+1
The changes look good. Will merge if the tests pass.
[jira] Updated: (HIVE-541) Implement UDFs: INSTR and LOCATE
Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Carl Steinbach updated HIVE-541:
--------------------------------
Fix Version/s: 0.4.0
Affects Version/s: (was: 0.4.0)
Component/s: UDF