Posted to user@hive.apache.org by Shushant Arora <sh...@gmail.com> on 2014/09/07 19:39:50 UTC

collect_set does not remove duplicate

When I use collect_set on another column in a GROUP BY query, the
documentation says it returns an array of that column's values with
duplicates removed, but it is not deduplicating. Is this expected?

Re: collect_set does not remove duplicate

Posted by Viral Bajaria <vi...@gmail.com>.
It would help if you pasted some sample data so we can reproduce this. I
have used collect_set and it works as documented for me.
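
For reference, a minimal repro could look like the sketch below. The table
name (events) and its columns (user_id, page) are hypothetical stand-ins,
not taken from the original poster's data. Selecting collect_list next to
collect_set makes it easy to see whether duplicates are actually dropped.

```sql
-- Hypothetical table: events(user_id STRING, page STRING)
SELECT user_id,
       collect_set(page)  AS distinct_pages,  -- duplicates removed
       collect_list(page) AS all_pages        -- duplicates kept, for comparison
FROM events
GROUP BY user_id;
```

If distinct_pages still appears to contain repeats, it may be worth checking
whether the values differ in case or surrounding whitespace, since
collect_set treats such values as distinct.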

Thanks,
Viral



On Sun, Sep 7, 2014 at 10:39 AM, Shushant Arora <sh...@gmail.com>
wrote:

> While group by, if I do collect_set on some other column , documentation
> says it will return Array of that column after removing duplicates, but its
> not doing dedup?Is it  expected?
>