Posted to user@drill.apache.org by John Omernik <jo...@omernik.com> on 2020/06/08 17:18:43 UTC

Hello - and collect_set UDF?

Hey all,

I know it's been a while; I was in a job where there was no Drill to be
found, just a bunch of Hive. In that role, I got very used to using the
Hive collect_set function, which returns, in a single list, all the unique
values of a column within each group of an aggregate. Is there anything like
that in Drill? I remember something like that, and when I query
sys.functions there are collect_list and collect_to_list, neither of which
works in this context (collect_list gave me an array index out of bounds
error, and collect_to_list told me the column I was collecting wasn't in the
aggregate).
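
To make the behavior being asked about concrete, here is a minimal sketch
in plain Python (not Drill or Hive code). The function name, the sample
rows, and the column names "user_id" and "page" are made up for
illustration. It groups rows by a key and keeps each distinct value once
per group. Hive does not promise any ordering for collect_set; this sketch
keeps first-seen order only so the result is easy to read.

```python
from collections import defaultdict


def collect_set(rows, key, value):
    """Group rows by `key` and gather the distinct `value`s of each group.

    Illustrative only: mimics the semantics described above, keeping
    first-seen order (an assumption; Hive makes no ordering guarantee).
    """
    groups = defaultdict(list)  # key -> distinct values in first-seen order
    seen = defaultdict(set)     # key -> values already collected
    for row in rows:
        k, v = row[key], row[value]
        if v not in seen[k]:
            seen[k].add(v)
            groups[k].append(v)
    return dict(groups)


# Hypothetical sample data: one row per page view.
rows = [
    {"user_id": "a", "page": "home"},
    {"user_id": "a", "page": "home"},
    {"user_id": "a", "page": "cart"},
    {"user_id": "b", "page": "home"},
]

result = collect_set(rows, "user_id", "page")
print(result)  # {'a': ['home', 'cart'], 'b': ['home']}
```

The duplicate "home" row for user "a" is collapsed, which is the
difference from a plain collect_list style aggregate.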

Are there any UDFs that will achieve this? Even ones you've seen on GitHub
etc. (i.e. not built into Drill) would help.

Thanks!

John Omernik

Re: Hello - and collect_set UDF?

Posted by Charles Givre <cg...@gmail.com>.
Hey John, 
What exactly are you looking for this function to do?
-- C


