You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by James Pirz <ja...@gmail.com> on 2015/07/30 02:03:17 UTC

Aggregated Expression not in GROUP BY key

Hi,

I am using Hive 1.2, and I am trying to run some queries based on TPCH
schema. My original query is:

SELECT N_NAME, AVERAGE(C_ACCTBAL)
FROM customer JOIN nation
on C_NATIONKEY=N_NATIONKEY
GROUP BY N_NAME;

for which I get:
FAILED: SemanticException [Error 10025]: Line 1:15 Expression not in GROUP
BY key 'C_ACCTBAL'

It does not really make sense, as I am running an aggregation on an
attribute which is not part of the group-by clause, which makes sure that
each group eventually gets one single value for the output. In Hive's
language manual we see that:
( https://cwiki.apache.org/confluence/display/Hive/LanguageManual+GroupBy )

" … When using group by clause, the select statement can only include
columns included in the group by clause. Of course, you can have as many
aggregation functions (e.g. count) in the select statement as well."

and the example there is similar to what I have.

I even simplified the query, and dropped the join, but it did not make a
difference:

SELECT C_NATIONKEY, AVERAGE(C_ACCTBAL)
FROM customer
GROUP BY C_NATIONKEY;

FAILED: SemanticException [Error 10025]: Line 1:20 Expression not in GROUP
BY key 'C_ACCTBAL'

Can you please let me know if I am missing something here and this behavior
is expected or not ?

In case you need it, the schema for the tables looks like:

hive> describe customer;
OK
c_custkey           int
c_name               string
c_address           string
c_phone             string
c_acctbal           double
c_mktsegment         string
c_comment           string
c_nationkey         int

hive> describe nation;
OK
n_nationkey         int
n_name               string
n_regionkey         int
n_comment           string

Thanks.

Re: Aggregated Expression not in GROUP BY key

Posted by James Pirz <ja...@gmail.com>.
Just a follow-up on the issue.
It was really happening because of using AVERAGE() instead of AVG().
Sorry, but the error was mis-leading (It did not tell me that function name
is invalid).
I had borrowed the query from a benchmark spec, and they had used AVERAGE
in their sql statements, and I failed to fix it for HiveQL.



On Wed, Jul 29, 2015 at 5:03 PM, James Pirz <ja...@gmail.com> wrote:

> Hi,
>
> I am using Hive 1.2, and I am trying to run some queries based on TPCH
> schema. My original query is:
>
> SELECT N_NAME, AVERAGE(C_ACCTBAL)
> FROM customer JOIN nation
> on C_NATIONKEY=N_NATIONKEY
> GROUP BY N_NAME;
>
> for which I get:
> FAILED: SemanticException [Error 10025]: Line 1:15 Expression not in GROUP
> BY key 'C_ACCTBAL'
>
> It does not really make sense, as I am running an aggregation on an
> attribute which is not part of the group-by clause, which makes sure that
> each group eventually gets one single value for the output. In Hive's
> language manual we see that:
> ( https://cwiki.apache.org/confluence/display/Hive/LanguageManual+GroupBy
> )
>
> " … When using group by clause, the select statement can only include
> columns included in the group by clause. Of course, you can have as many
> aggregation functions (e.g. count) in the select statement as well."
>
> and the example there is similar to what I have.
>
> I even simplified the query, and dropped the join, but it did not make a
> difference:
>
> SELECT C_NATIONKEY, AVERAGE(C_ACCTBAL)
> FROM customer
> GROUP BY C_NATIONKEY;
>
> FAILED: SemanticException [Error 10025]: Line 1:20 Expression not in GROUP
> BY key 'C_ACCTBAL'
>
> Can you please let me know if I am missing something here and this
> behavior is expected or not ?
>
> In case you need it, the schema for the tables looks like:
>
> hive> describe customer;
> OK
> c_custkey           int
> c_name               string
> c_address           string
> c_phone             string
> c_acctbal           double
> c_mktsegment         string
> c_comment           string
> c_nationkey         int
>
> hive> describe nation;
> OK
> n_nationkey         int
> n_name               string
> n_regionkey         int
> n_comment           string
>
> Thanks.
>