Posted to issues@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2020/01/01 02:51:00 UTC

[jira] [Created] (IMPALA-9271) Provide UDF framework corresponding to Hive's GenericUDF

Quanlong Huang created IMPALA-9271:
--------------------------------------

             Summary: Provide UDF framework corresponding to Hive's GenericUDF
                 Key: IMPALA-9271
                 URL: https://issues.apache.org/jira/browse/IMPALA-9271
             Project: IMPALA
          Issue Type: New Feature
            Reporter: Quanlong Huang


Hive GenericUDFs are superior to normal UDFs in the following ways:
 # They can accept arguments of complex types and return complex types.
 # They can accept a variable number of arguments.
 # They can support an unbounded number of function signatures. For example, it's easy to write a GenericUDF that accepts array<int>, array<array<int>> and so on (arbitrary levels of nesting).
 # They can do short-circuit evaluation using DeferredObject. Arguments can be of any type and can be evaluated lazily.
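
To illustrate the short-circuit point, here is a minimal C++ sketch of deferred arguments. {{DeferredArg}}, {{NullableString}} and {{FirstNonNull}} are hypothetical names made up for this example; they are not part of Hive's or Impala's API. The idea is only that an argument is computed when the function asks for it, so later arguments can be skipped.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical nullable string result, loosely modelled on the is_null
// convention used by UDF value types. Not an actual Impala or Hive type.
struct NullableString {
  bool is_null;
  std::string value;
};

// Hypothetical stand-in for a deferred argument: the wrapped expression is
// only evaluated when Get() is called.
class DeferredArg {
 public:
  explicit DeferredArg(std::function<NullableString()> producer)
      : producer_(std::move(producer)) {}

  // Evaluates the underlying expression on demand.
  NullableString Get() const { return producer_(); }

 private:
  std::function<NullableString()> producer_;
};

// Returns the first non-NULL argument. Arguments that come after it are
// never evaluated, which is the short-circuit behaviour described above.
NullableString FirstNonNull(const std::vector<DeferredArg>& args) {
  for (const DeferredArg& arg : args) {
    NullableString v = arg.Get();
    if (!v.is_null) return v;
  }
  return NullableString{true, ""};
}
```

With three deferred arguments where the second is the first non-NULL one, only the first two producers run; the third is never invoked.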

The masking functions added for Ranger column masking are important examples of GenericUDFs. For instance, there are hundreds of ways to call {{mask_show_first_n}}:
{code:java}
   mask_show_first_n(val)
   mask_show_first_n(val, 8)
   mask_show_first_n(val, 8, 'X', 'x', 'n')
   mask_show_first_n(val, 8, 'x', 'x', 'x', 'x', -1)
   mask_show_first_n(val, 8, 'x', -1, 'x', 'x', '9')
   ...{code}
Without a comparable framework, we would have to implement hundreds of overloads to cover all possible combinations.

Currently we don't support complex types in UDF arguments or return types, so we should at least provide a framework to support UDFs that:
 # Accept a variable number of arguments.
 # Accept arguments of any type, with their actual values extracted inside the UDF (lazy evaluation).

For 2, maybe just adding a field to {{impala_udf::AnyVal}} that reflects the actual type is enough.
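
A rough sketch of what a type-tagged value could look like, and how a single variadic function could then dispatch on the tag instead of needing one overload per signature. Everything here is an assumption for illustration: {{TaggedVal}}, {{ValType}} and {{MaskAfterFirstN}} are hypothetical, the current {{impala_udf::AnyVal}} has no such type field, and the masking logic is far simpler than the real {{mask_show_first_n}} (one mask character instead of separate upper/lower/digit characters; the default of 4 kept characters is also just chosen for the example).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical type tag carried by every value.
enum class ValType { INT, STRING };

// Hypothetical tagged value; only the field matching `type` is meaningful.
struct TaggedVal {
  bool is_null;
  ValType type;
  int64_t int_val;
  std::string string_val;

  static TaggedVal Null() { return TaggedVal{true, ValType::INT, 0, ""}; }
  static TaggedVal Int(int64_t v) {
    return TaggedVal{false, ValType::INT, v, ""};
  }
  static TaggedVal String(const std::string& s) {
    return TaggedVal{false, ValType::STRING, 0, s};
  }
};

// Simplified masking function taking a variable number of tagged arguments:
//   args[0]: STRING value to mask (required)
//   args[1]: INT number of leading characters to keep (optional, default 4)
//   args[2]: STRING whose first character is the mask character, or INT -1
//            to leave the remaining characters unmasked (optional, default 'x')
TaggedVal MaskAfterFirstN(const std::vector<TaggedVal>& args) {
  if (args.empty() || args[0].is_null || args[0].type != ValType::STRING) {
    return TaggedVal::Null();
  }
  std::size_t keep = 4;
  if (args.size() > 1 && !args[1].is_null && args[1].type == ValType::INT &&
      args[1].int_val >= 0) {
    keep = static_cast<std::size_t>(args[1].int_val);
  }
  bool mask = true;
  char mask_char = 'x';
  if (args.size() > 2 && !args[2].is_null) {
    // The same position accepts either type; the tag decides the meaning.
    if (args[2].type == ValType::INT && args[2].int_val == -1) {
      mask = false;
    } else if (args[2].type == ValType::STRING &&
               !args[2].string_val.empty()) {
      mask_char = args[2].string_val[0];
    }
  }
  std::string out = args[0].string_val;
  if (mask) {
    for (std::size_t i = keep; i < out.size(); ++i) out[i] = mask_char;
  }
  return TaggedVal::String(out);
}
```

Calling {{MaskAfterFirstN}} with one, two or three arguments, and with either a string or -1 in the third position, goes through the same entry point. That is the property the overload-based approach lacks.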


