Posted to user@impala.apache.org by Fawze Abujaber <fa...@gmail.com> on 2018/07/30 08:21:28 UTC
How to replace collect_set Hive function in Impala?
Hi everyone!
Could anybody tell me how to replace Hive's collect_set function in Impala?
I have a query like this:
select
  col1,
  collect_set(distinct col2)
from dpi_parquet_gzip
group by
  col1
Thanks a lot!
--
Take Care
Fawze Abujaber
Re: How to replace collect_set Hive function in Impala?
Posted by Greg Rahn <gr...@gmail.com>.
If returning a delimited string instead of an array works for you, see group_concat():
https://impala.apache.org/docs/build/html/topics/impala_group_concat.html
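A minimal sketch of the group_concat() approach, applied to the table and columns from the original question. It assumes an Impala release whose group_concat() accepts DISTINCT, and the '|' separator is only an illustrative choice (the documented default is a comma followed by a space):

```sql
-- Impala: one row per col1, with the distinct col2 values joined into one STRING.
-- group_concat() takes string input; the CAST is a no-op if col2 is already a STRING.
SELECT
  col1,
  group_concat(DISTINCT CAST(col2 AS STRING), '|') AS col2_values
FROM dpi_parquet_gzip
GROUP BY col1;
```

The result is a plain string rather than an array, so pick a separator that cannot occur in the data, and do not rely on the order of the concatenated values.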
On Mon, Jul 30, 2018 at 3:18 PM Zoltan Borok-Nagy
<bo...@cloudera.com> wrote:
>
> Hi Fawze,
>
> In Impala, only scalar types are allowed in the select list because Impala always produces result sets with all scalar values, i.e. simple tables.
> The collect_set() function in Hive returns an array, and Impala cannot put an array into a single cell of an output table.
>
> If you want to write files that contain complex data I'm afraid you'll need Hive.
>
> BR,
> Zoltan
Re: How to replace collect_set Hive function in Impala?
Posted by Zoltan Borok-Nagy <bo...@cloudera.com>.
Hi Fawze,
In Impala, only scalar types are allowed in the select list because Impala
always produces result sets with all scalar values, i.e. simple tables.
The collect_set() function in Hive returns an array, and Impala cannot put
an array into a single cell of an output table.
If you want to write files that contain complex data I'm afraid you'll need
Hive.
BR,
Zoltan
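As a rough sketch of the Hive route mentioned above: the array can be built and written in Hive, then unnested back into scalar rows in Impala. The table name col2_sets is hypothetical, and the Impala part assumes a release with complex-type support for Parquet tables:

```sql
-- Hive: materialize the arrays into a Parquet table (col2_sets is a made-up name).
CREATE TABLE col2_sets STORED AS PARQUET AS
SELECT col1, collect_set(col2) AS col2_set
FROM dpi_parquet_gzip
GROUP BY col1;

-- Impala: pick up the table created outside Impala, then unnest the array.
-- Each array element comes back as its own row via the ITEM pseudocolumn,
-- so the select list still contains only scalar values.
INVALIDATE METADATA col2_sets;

SELECT t.col1, a.item AS col2_value
FROM col2_sets t, t.col2_set a;
```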