You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by Jason Michael <jm...@videoegg.com> on 2010/01/28 23:07:54 UTC
UDAF/UDTF question
Hello all,
What would be the best way to write a function that would perform aggregation computations on records in a table and return multiple rows (and possibly columns)? For example, imagine a function called DECILES that computes all the deciles for a given measure and returns them as 10 rows with 2 columns, decile and value. It seems like what I want is some sort of combination of a UDAF and a UDTF. Does such an animal exist in the Hive world?
Jason
Re: UDAF/UDTF question
Posted by Zheng Shao <zs...@gmail.com>.
The easiest way to go is to write a UDAF to return the answer in
array<struct<decile:int, value:double>>.
Then you can do: (note that explode is a predefined UDTF)
SELECT
tmp.key, tmp2.d.decile, tmp2.d.value
FROM
(SELECT key, Decile(value) as deciles
GROUP BY key) tmp
LATERAL VIEW explode(tmp.deciles) tmp2 AS d
Zheng
On Thu, Jan 28, 2010 at 2:07 PM, Jason Michael <jm...@videoegg.com> wrote:
> Hello all,
>
> What would be the best way to write a function that would perform
> aggregation computations on records in a table and return multiple rows (and
> possibly columns)? For example, imagine a function called DECILES that
> computes all the deciles for a given measure and returns them as 10 rows
> with 2 columns, decile and value. It seems like what I want is some sort of
> combination of a UDAF and a UDTF. Does such an animal exist in the Hive
> world?
>
> Jason
--
Yours,
Zheng