Posted to dev@hive.apache.org by "Eric Hanson (JIRA)" <ji...@apache.org> on 2013/08/24 02:13:51 UTC

[jira] [Commented] (HIVE-4961) Create bridge for custom UDFs to operate in vectorized mode

    [ https://issues.apache.org/jira/browse/HIVE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13749189#comment-13749189 ] 

Eric Hanson commented on HIVE-4961:
-----------------------------------

Completed a working version of the bridge that allows custom UDFs that are
subclasses of UDF to work in vectorized mode. It supports UDFs with evaluate()
methods that take and return boxed types (e.g. Long), Writable types
(e.g. LongWritable), and primitive types (e.g. long). Generic UDFs are not yet
supported; that will be the subject of a future patch.
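
As a rough illustration of the idea (not code from the patch), the sketch below
applies a row-mode evaluate() method to every row of a column of longs using
reflection. The class names BoxedAddOne, PrimitiveTimesTwo and RowUdfBridgeSketch
are hypothetical, and the example classes do not extend Hive's UDF class so that
the sketch compiles on its own. Null handling and Writable types are left out.

```java
// Hypothetical row-mode UDF whose evaluate() takes and returns a boxed type.
class BoxedAddOne {
    public Long evaluate(Long x) {
        return x + 1;
    }
}

// Hypothetical row-mode UDF whose evaluate() takes and returns a primitive type.
class PrimitiveTimesTwo {
    public long evaluate(long x) {
        return x * 2;
    }
}

// Illustrative adapter only: it is not Hive's implementation.
class RowUdfBridgeSketch {
    // Find a public single-argument method named "evaluate" on the UDF class.
    static java.lang.reflect.Method findEvaluate(Class<?> udfClass) {
        for (java.lang.reflect.Method m : udfClass.getMethods()) {
            if (m.getName().equals("evaluate") && m.getParameterCount() == 1) {
                return m;
            }
        }
        throw new IllegalArgumentException("no single-argument evaluate() found");
    }

    // Call evaluate() once per row and collect the results into an output column.
    static long[] applyToColumn(Object udf, long[] column, int size) {
        try {
            java.lang.reflect.Method evaluate = findEvaluate(udf.getClass());
            evaluate.setAccessible(true);
            long[] out = new long[size];
            for (int row = 0; row < size; row++) {
                // Reflection unwraps the boxed argument when the parameter is a
                // primitive long, and boxes a primitive return value.
                Object result = evaluate.invoke(udf, Long.valueOf(column[row]));
                out[row] = ((Number) result).longValue();
            }
            return out;
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }
}
```

The same applyToColumn call works for both example classes, which is the point
of a general-purpose bridge: the UDF author writes only the per-row evaluate()
and the adapter takes care of looping over the batch.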

I did manual testing for a large set of UDFs taking and returning the types supported
by vectorization: tinyint, smallint, int, bigint, float, double, boolean, string, timestamp.

UDFs with one argument and with multiple arguments were tested, using both
constant and variable (column) arguments.
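
To make the constant-versus-variable distinction concrete, here is a minimal
sketch, assuming a hypothetical two-argument UDF named AddUdfExample and a
hypothetical adapter named MultiArgBridgeSketch. Neither is code from the patch.
Each argument is passed either as a long[] (a column, read per row) or as a Long
(a constant, reused for every row).

```java
// Hypothetical two-argument row-mode UDF; it does not extend Hive's UDF class
// so that the sketch compiles on its own.
class AddUdfExample {
    public long evaluate(long a, long b) {
        return a + b;
    }
}

// Illustrative adapter only: it is not Hive's implementation.
class MultiArgBridgeSketch {
    // Each entry of args is either a long[] (column) or a Long (constant).
    static long[] apply(Object udf, int size, Object... args) {
        try {
            // This sketch assumes every evaluate() parameter is a primitive long.
            Class<?>[] types = new Class<?>[args.length];
            java.util.Arrays.fill(types, long.class);
            java.lang.reflect.Method evaluate = udf.getClass().getMethod("evaluate", types);
            evaluate.setAccessible(true);

            long[] out = new long[size];
            Object[] rowArgs = new Object[args.length];
            for (int row = 0; row < size; row++) {
                for (int a = 0; a < args.length; a++) {
                    if (args[a] instanceof long[]) {
                        // Variable argument: take this row's value from the column.
                        rowArgs[a] = Long.valueOf(((long[]) args[a])[row]);
                    } else {
                        // Constant argument: the same value for every row.
                        rowArgs[a] = args[a];
                    }
                }
                out[row] = (Long) evaluate.invoke(udf, rowArgs);
            }
            return out;
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }
}
```

For example, MultiArgBridgeSketch.apply(new AddUdfExample(), 3, new long[]{1, 2, 3}, Long.valueOf(10))
adds the constant 10 to each row of the column.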

Still to be done: including these tests with the patch, or submitting another patch with end-to-end tests.
                
> Create bridge for custom UDFs to operate in vectorized mode
> -----------------------------------------------------------
>
>                 Key: HIVE-4961
>                 URL: https://issues.apache.org/jira/browse/HIVE-4961
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Eric Hanson
>            Assignee: Eric Hanson
>         Attachments: vectorUDF.4.patch, vectorUDF.5.patch
>
>
> Suppose you have a custom UDF myUDF() that you've created to extend Hive. The goal of this JIRA is to create a facility so that a query that uses myUDF() in an expression will run in vectorized mode.
> This would be a general-purpose bridge for custom UDFs that users add to Hive. It would work with existing UDFs.
> I'm considering a separate JIRA for a new kind of custom UDF implementation that is vectorized from the beginning, to optimize performance. That is not covered by this JIRA.
