Posted to user@hive.apache.org by tim robertson <ti...@gmail.com> on 2009/07/03 23:11:56 UTC

many terms in group by

Hi all,

I have several MapReduce jobs that are basically doing counts with
group by on tab-delimited files.
Getting tired of writing the same thing over again for each report, I
am thinking of trying Hive for this.

Does Hive work OK with 9 or so terms in the group by?
(i.e. is it happy concatenating the fields to make the key to emit
from the map, so that it can do the count in a reduce and complete in
one MapReduce job?)

I mean the equivalent of:
  select a,b,c,d,e,f,g,h,i,count(*) from table x group by a,b,c,d,e,f,g,h,i;

Many thanks,
Tim
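
The single-job pattern described in the question (build one composite key from all the group-by fields in the map, then sum in the reduce) can be sketched in a few lines of Python. This is only an in-memory illustration of the idea, not how Hive implements it; the function names `map_phase`, `reduce_phase` and `group_by_count` are hypothetical, and the shuffle between map and reduce is simulated with a dictionary.

```python
from collections import defaultdict


def map_phase(lines, key_columns):
    """Emit (composite_key, 1) for each tab-delimited record."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # The composite key is the tuple of all group-by fields,
        # so 9 terms cost no more jobs than 1 term.
        yield tuple(fields[i] for i in key_columns), 1


def reduce_phase(pairs):
    """Sum the counts per composite key (a dict stands in for the shuffle)."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


def group_by_count(lines, key_columns):
    """Equivalent of: select <keys>, count(*) ... group by <keys>."""
    return reduce_phase(map_phase(lines, key_columns))


if __name__ == "__main__":
    # Three sample records, grouped on the first two columns.
    sample = ["x\ty\t1\n", "x\ty\t2\n", "x\tz\t3\n"]
    for key, count in sorted(group_by_count(sample, [0, 1]).items()):
        print("\t".join(key), count, sep="\t")
```

The number of group-by terms only changes the width of the key tuple; the job structure stays a single map followed by a single reduce.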

Re: many terms in group by

Posted by tim robertson <ti...@gmail.com>.

Thanks for the indication that it will work.  I'll set it up and try it.

Sorry for posting a repeat question... I just joined the list (again).

Cheers

Tim


On Sat, Jul 4, 2009 at 9:03 AM, Amr Awadallah<aa...@cloudera.com> wrote:
>> select a,b,c,d,e,f,g,h,i,count(*) from table x group by a,b,c,d,e,f,g,h,i;
>
> yes, should work, please try it and let us know.
>
> -- amr

Re: many terms in group by

Posted by Amr Awadallah <aa...@cloudera.com>.

> select a,b,c,d,e,f,g,h,i,count(*) from table x group by a,b,c,d,e,f,g,h,i;

yes, should work, please try it and let us know.

-- amr