You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by Jason Michael <jm...@videoegg.com> on 2010/01/28 23:07:54 UTC

UDAF/UDTF question

Hello all,

What would be the best way to write a function that would perform aggregation computations on records in a table and return multiple rows (and possibly columns)?  For example, imagine a function called DECILES that computes all the deciles for a given measure and returns them as 10 rows with 2 columns, decile and value.  It seems like what I want is some sort of combination of a UDAF and a UDTF.  Does such an animal exist in the Hive world?

Jason

Re: UDAF/UDTF question

Posted by Zheng Shao <zs...@gmail.com>.
The easiest way to go is to write a UDAF to return the answer in
array<struct<decile:int, value:double>>.

Then you can do: (note that explode is a predefined UDTF)

SELECT
  tmp.key, tmp2.d.decile, tmp2.d.value
FROM
  (SELECT key, Decile(value) as deciles
   GROUP BY key) tmp
  LATERAL VIEW explode(tmp.deciles) tmp2 AS d


Zheng

On Thu, Jan 28, 2010 at 2:07 PM, Jason Michael <jm...@videoegg.com> wrote:
> Hello all,
>
> What would be the best way to write a function that would perform
> aggregation computations on records in a table and return multiple rows (and
> possibly columns)?  For example, imagine a function called DECILES that
> computes all the deciles for a given measure and returns them as 10 rows
> with 2 columns, decile and value.  It seems like what I want is some sort of
> combination of a UDAF and a UDTF.  Does such an animal exist in the Hive
> world?
>
> Jason



-- 
Yours,
Zheng