Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2010/08/13 10:15:38 UTC

[Hadoop Wiki] Update of "Hive/LanguageManual/UDF" by Ning Zhang

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/LanguageManual/UDF" page has been changed by Ning Zhang.
http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF?action=diff&rev1=48&rev2=49

--------------------------------------------------

  ||array<struct `{'x','y'}`>|| histogram_numeric(col, b) || Computes a histogram of a numeric column in the group using b non-uniformly spaced bins. The output is an array of size b of double-valued (x,y) coordinates that represent the bin centers and heights ||
  
  == Built-in Table-Generating Functions (UDTF) ==
- <<Anchor(UDTF)>> Normal user-defined functions, such as concat(), take in a single input row and output a single output row. In contrast, table-generating functions transform a single input row to multiple output rows. Currently, the only table-generating function is explode(), which takes in an array as an input and outputs the elements of the array as separate rows. UDTF's can be used in the SELECT expression list and as a part of LATERAL VIEW.
+ <<Anchor(UDTF)>> Normal user-defined functions, such as concat(), take in a single input row and output a single output row. In contrast, table-generating functions transform a single input row to multiple output rows. 
+ 
+ === explode ===
+ 
+ explode() takes an array as input and outputs the elements of the array as separate rows. UDTFs can be used in the SELECT expression list and as part of a LATERAL VIEW.
  
  An example use of explode() in the SELECT expression list is as follows:
  
@@ -291, +295 @@

  ||<10%>'''Return Type''' ||<10%>'''Name(Signature)''' ||'''Description''' ||
  || myType ||explode(array<myType> a) <<Anchor(explode)>> ||For each element in a, explode() generates a row containing that element ||
  
+ === json_tuple ===
+ A new json_tuple() UDTF is introduced in Hive 0.7. It takes a JSON string and a set of names (keys) and returns a tuple of values in a single function call.
+ If you are using get_json_object() and want to replace it with json_tuple(), the only change is that your query uses json_tuple() in a LATERAL VIEW rather than multiple get_json_object() calls in the SELECT clause.
+ 
+ For example, 
+ {{{
+ select a.timestamp, get_json_object(a.appevents, '$.eventid'), get_json_object(a.appevents, '$.eventname') from log a;
+ }}}
+ should be changed to 
+ {{{
+ select a.timestamp, b.*
+ from log a lateral view json_tuple(a.appevents, 'eventid', 'eventname') b as f1, f2;
+ }}}
+
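The rewritten query above could also be sketched with descriptive column aliases and a filter on one of the extracted fields. This is only an illustration: the sample JSON value, the aliases event_id and event_name, and the WHERE condition are assumptions and do not appear on the wiki page. The table log and the column appevents are taken from the example above.

```sql
-- Hypothetical sample value of log.appevents (a JSON string):
--   {"eventid": "42", "eventname": "login"}
-- json_tuple() extracts both keys in one call; the aliases
-- event_id and event_name are illustrative names, not from the page.
SELECT a.timestamp, b.event_id, b.event_name
FROM log a
LATERAL VIEW json_tuple(a.appevents, 'eventid', 'eventname') b AS event_id, event_name
-- Columns produced by the lateral view can be used like ordinary columns.
WHERE b.event_name = 'login';
```

Naming the output columns in the AS clause, rather than using f1 and f2 with b.*, makes it clearer which key each column holds when more keys are added later.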