You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-dev@hadoop.apache.org by "Tom White (JIRA)" <ji...@apache.org> on 2008/09/04 18:41:44 UTC

[jira] Created: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

[Hive] Provide a mechanism for registering UDFs from the query language
-----------------------------------------------------------------------

                 Key: HADOOP-4070
                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
             Project: Hadoop Core
          Issue Type: New Feature
          Components: contrib/hive
            Reporter: Tom White


UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632308#action_12632308 ] 

Ashish Thusoo commented on HADOOP-4070:
---------------------------------------

+1

Looks good.

As Zheng points out, this will conflict with my changes, but we can work that out at commit time.


> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: hadoop-4070-v2.patch, hadoop-4070-v3.patch, hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Tom White (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630556#action_12630556 ] 

Tom White commented on HADOOP-4070:
-----------------------------------

Thanks for all your feedback. I'm working on a test case.

The reason I called it AdminWork was because I was following the terminology used in MySQL. "DDLWork" and "DDLFunctionsWork" don't sound right as function creation is not really a part of DDL. I can rename it something more specific - CreateFunctionWork?

I agree that it would be good to make the mapping persistent, and add drop and rename, but these can be done in later Jiras.

{quote}
The reason for that code was that user could specify the list of auxiliary jars in hive-default.xml (and also by setting the hive.aux.jars.path in hive command line client as a session variable). In which case GenericOptionsParser code doesn't work. I think appending 'file://' is new to 19.
{quote}

Not sure about that, but HADOOP-3743 certainly changed things in that area. The problem is that if you set the variable HIVE_AUX_JARS_PATH to say /foo then the session variable hive.aux.jars.path is set to /foo by the hive script, rather than file:///foo. So I've been overriding this by redefining the session variable at the moment. What's the best way of fixing this?

> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>         Attachments: hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Thusoo updated HADOOP-4070:
----------------------------------

    Status: Patch Available  (was: Open)

resolved conflicts with my explain plan transaction.

> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: hadoop-4070-v2.patch, hadoop-4070-v3.patch, hadoop-4070-v4.patch, hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632567#action_12632567 ] 

Hadoop QA commented on HADOOP-4070:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12390402/hadoop-4070-v4.patch
  against trunk revision 696846.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 12 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3315/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3315/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3315/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3315/console

This message is automatically generated.

> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: hadoop-4070-v2.patch, hadoop-4070-v3.patch, hadoop-4070-v4.patch, hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632057#action_12632057 ] 

Zheng Shao commented on HADOOP-4070:
------------------------------------

+1

This diff probably conflicts with Ashish's diff on explain (which merged UDFRegistry and UDAFRegistry). We should find a way to merge them.


> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: hadoop-4070-v2.patch, hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Prasad Chakka (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630595#action_12630595 ] 

Prasad Chakka commented on HADOOP-4070:
---------------------------------------

You are correct that we do need to preppend file:// to files specified in session parameter hive.aux.jars.path. But what I meant is that instead of removing the code snippet in ExecDriver, we should mimic what GenericCommandOptionParser does (possibly refactoring the code out). This would be proper if user keeps changing this variable between queries. Another option would be to mandate specification of file:// in the hive.aux.jars.path param itself. But i prefer the former.

> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>         Attachments: hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Tom White (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HADOOP-4070:
------------------------------

    Attachment: hadoop-4070-v2.patch

Here's an updated patch that addresses some of the comments.

* I've reverted the change to ExecDriver. (Set hive.aux.jars.path as a session variable to pick up jars.)
* AdminWork is now FunctionWork. Other descriptors can be added to this class as needed.
* The syntax is now 
{noformat}
CREATE TEMPORARY FUNCTION my_udf AS 'com.example.MyUdf';
{noformat}
* I've added a client unit test which uses a test UDF packaged in a jar.

> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>         Attachments: hadoop-4070-v2.patch, hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Tom White (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HADOOP-4070:
------------------------------

    Attachment: hadoop-4070.patch

This patch extends the grammar to support creating UDFs.

For example you can do
{noformat}
CREATE FUNCTION my_udf AS com.example.MyUdf;
{noformat}
then use the my_udf function just like any other function.

The class defining the UDF (com.example.MyUdf) should be packed in a jar file then made available to hive by setting the HIVE_AUX_JARS_PATH environment variable (which specifies the directory that the jar is in).

Note that there is a change in ExecDriver which removes some code that was setting the "tmpjars" property incorrectly. The problem was that the paths were not have the "file://" scheme prepended to them as GenericOptionsParser does, which caused the paths to be interpreted as HDFS paths. By removing the code GenericOptionsParser sets the variable correctly.

I have tested it manually on a simple UDF, the next step is to write some automated tests.

> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>         Attachments: hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Tom White (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HADOOP-4070:
------------------------------

    Attachment: hadoop-4070-v3.patch

New patch which includes Ashish's suggestion to reduce duplication in the build files.

> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: hadoop-4070-v2.patch, hadoop-4070-v3.patch, hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Thusoo updated HADOOP-4070:
----------------------------------

    Status: Open  (was: Patch Available)

Cancelling as this is going to not compile now since Dhrubha has already checked in my txn of explain plan.

> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: hadoop-4070-v2.patch, hadoop-4070-v3.patch, hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630631#action_12630631 ] 

Ashish Thusoo commented on HADOOP-4070:
---------------------------------------

Not exactly true about treating create/rename/alter functions as part of the DDL. Most DBs bin them in that bucket. Case to point is the mysql doc (link below) which clearly lists create function in the DDL section.

http://dev.mysql.com/doc/refman/5.0/en/sql-syntax-data-definition.html

I am not too hung up on this though. FunctionWork(instead of CreateFunctionWork)  is also fine with me - though I would like to have that bin all the different descriptors for manipulating functions rather than create a different work class for each manipulation (create/alter/rename/drop etc...)

We can start of this with being non persistent but then in the syntax it should be made explicit that this is non persistent.

One proposal could be

CREATE TEMPORARY FUNCTION blah AS blah

Also for temporary functions, reflection is also another way to structure this as Prasad pointed out but then that would not work if we allow non Java functions, so I think having syntax is useful, though we could optimize the java case to the point that all it means to create a new function is to drop in a jar which implements the udf interfaces (including something like a displayName() interface which would allow the language name to be different from the implementation name.



> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>         Attachments: hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630344#action_12630344 ] 

Ashish Thusoo commented on HADOOP-4070:
---------------------------------------

cool...
Actually, could this be made a part of DDLWork instead of AdminWork and then it can all be launched from the DDLTask. We have different DDL descriptors in DDL Work for things like create/drop/alter tables and the functions could also belong to that. Having a DDLFunctionsWork is also fine. I do think we should be more specific than admin.

Also there is this whole thing of making this mapping persistent(both Raghu and Prasad were mentioning that) so that they can be reused across sessions. Maybe we can have that discussion here as well.

> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>         Attachments: hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Tom White (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HADOOP-4070:
------------------------------

    Assignee: Tom White
      Status: Patch Available  (was: Open)

> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: hadoop-4070-v2.patch, hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632312#action_12632312 ] 

Ashish Thusoo commented on HADOOP-4070:
---------------------------------------

The conflict will be on line 45 in FunctionTask.java - instead of UDFRegistry we have FunctionRegistry now.

We can fix explain plan for this later...

> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: hadoop-4070-v2.patch, hadoop-4070-v3.patch, hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Thusoo updated HADOOP-4070:
----------------------------------

    Attachment: hadoop-4070-v4.patch

Resolved conflict that Tom's patch had with explain plan and added an explain plan test.

Tom I hope you don't mind me doing this as otherwise I think this may not make it for tomorrows 0.19 code freeze

> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: hadoop-4070-v2.patch, hadoop-4070-v3.patch, hadoop-4070-v4.patch, hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632050#action_12632050 ] 

Ashish Thusoo commented on HADOOP-4070:
---------------------------------------

+1

Looks good. Thanks for the contribution.

Only comment that I have is that in the patch you have duplicated the code for compile-test in ql/build.xml

Would be better if instead we introduced a new target for building the udf jar (on similar lines as gen-test) and have a dummy gen-test-jar in build-common.xml which gets overwritten in ql/build.xml...

Its upto you if you want to incorporate that... But this is good as is...

Thanks again for the contribution... keep them rolling :)

> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: hadoop-4070-v2.patch, hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Prasad Chakka (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630343#action_12630343 ] 

Prasad Chakka commented on HADOOP-4070:
---------------------------------------

Hi Tom,

The reason for that code was that user could specify the list of auxiliary jars in hive-default.xml (and also by setting the hive.aux.jars.path in hive command line client as a session variable). In which case GenericOptionsParser code doesn't work. I think appending 'file://' is new to 19.  

Can you persist this mapping in metastore? otherwise create needs to be specified everytime hive session is brought up. And also management functions such as drop, rename and some such needs to be provided.

Another alternative (as an initial implementation which is simple but sufficient for now) is to use reflection to load the function class. The function name and class name can be same and the package could be predefined (org.apache.hadoop.hive.udf) or configured and at runtime this class would be loaded.

> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>         Attachments: hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Tom White (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HADOOP-4070:
------------------------------

       Resolution: Fixed
    Fix Version/s: 0.19.0
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

I've just committed this.

The test failures are unrelated to this patch. Thanks Ashish for fixing the conflict with the explain plan patch.

> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>            Assignee: Tom White
>             Fix For: 0.19.0
>
>         Attachments: hadoop-4070-v2.patch, hadoop-4070-v3.patch, hadoop-4070-v4.patch, hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632096#action_12632096 ] 

Hadoop QA commented on HADOOP-4070:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12390274/hadoop-4070-v2.patch
  against trunk revision 696551.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 12 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3294/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3294/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3294/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3294/console

This message is automatically generated.

> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: hadoop-4070-v2.patch, hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630335#action_12630335 ] 

Zheng Shao commented on HADOOP-4070:
------------------------------------

Can you add a test case, Tom?

For adding a test case, please add a "my.q" file in hive/ql/src/test/queries/positive, and then run "ant test -Dtestcase=TestParse -Dqfile=my.q", collect the standard result by searching for the failed "diff" command in the test log.
And then do the same for hive/ql/src/test/queries/clientpositive, and "ant test -Dtestcase=TestCliDriver -Dqfile=my.q".

Another thing is that do you think AdminWork will be responsible for other works besides creating UDF? if not then let's call it a more specific name (like CreateUDFWork).


> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>         Attachments: hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4070) [Hive] Provide a mechanism for registering UDFs from the query language

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633327#action_12633327 ] 

Hudson commented on HADOOP-4070:
--------------------------------

Integrated in Hadoop-trunk #611 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/611/])

> [Hive] Provide a mechanism for registering UDFs from the query language
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4070
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/hive
>            Reporter: Tom White
>            Assignee: Tom White
>             Fix For: 0.19.0
>
>         Attachments: hadoop-4070-v2.patch, hadoop-4070-v3.patch, hadoop-4070-v4.patch, hadoop-4070.patch
>
>
> UDFs (user defined functions) in Hive are currently actually built-in functions. This issue is to develop a packaging and registration mechanism so users can add their own functions at runtime.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.