You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by Jan Stöcker <Ja...@q2web.de> on 2010/03/09 11:50:47 UTC

error in semantic analysis of join statement

Hi,

 

I am stuck with what is probably a beginner's mistake, but I simply don't know

what's wrong. I have two tables aux1 and aux2, with each two columns objname

(STRING) and no_null (INT).

I want to find all entries of objname appearing in both tables and gave hive the

following statement:

 

SELECT t1.objname, t1.no_null, t2.no_null, (t1.no_null + t2.no_null) AS null_sum FROM aux1 t1

JOIN aux2 t2 ON (t1.objname = t2.objname) GROUP BY t1.objname SORT BY null_sum LIMIT 30;

 

But I got the error message "Error in semantic analysis: line 1:19 Expression Not In Group By Key t1".

I don't really understand what that means. Anyone can help me?

 

Regards,

Jan

 

 


Re: error in semantic analysis of join statement

Posted by Scott Carey <sc...@richrelevance.com>.
Sure, but the error message is wrong.  It should say the following or something similar to avoid confusion:

"Expression must be in Group By Key or Aggregate function: t1.no_null"

not

"Expression not in Group By Key t1"


For example, the error message for this in PostgreSQL is:
"ERROR:  column "t1.no_null" must appear in the GROUP BY clause or be used in an aggregate function"



On Mar 9, 2010, at 3:05 AM, Sonal Goyal wrote:

Hi,

Let me explain this through an example:

Lets assume your table looks like:

aux1.objname  aux1.no_null
------------------  ----------------
AA                          1
AA                          2
AB                           1
AB                          3

When you do a select aux1.objname, aux1.no_null group by objname, you are grouping the AAs and the ABs together. However, you need an aggregate function over no_null so that you can get the value of no_null corresponding to the groups. When you use a group by over a column, other columns that you select either need to be grouped, or aggregated in some form.

This is what is missing in your query.
HTH.

Thanks and Regards,
Sonal


On Tue, Mar 9, 2010 at 4:20 PM, Jan Stöcker <Ja...@q2web.de>> wrote:
Hi,

I am stuck with what is probably a beginner’s mistake, but I simply don’t know
what’s wrong. I have two tables aux1 and aux2, with each two columns objname
(STRING) and no_null (INT).
I want to find all entries of objname appearing in both tables and gave hive the
following statement:

SELECT t1.objname, t1.no_null, t2.no_null, (t1.no_null + t2.no_null) AS null_sum FROM aux1 t1
JOIN aux2 t2 ON (t1.objname = t2.objname) GROUP BY t1.objname SORT BY null_sum LIMIT 30;

But I got the error message “Error in semantic analysis: line 1:19 Expression Not In Group By Key t1”.
I don’t really understand what that means. Anyone can help me?

Regards,
Jan





AW: error in semantic analysis of join statement

Posted by Jan Stöcker <Ja...@q2web.de>.
Hi,

 

o.k., that makes sense. Actually, now I see that my mistake was the GROUP BY clause, which was not necessary, as the objnames were already unique.

Thanks a lot,

Jan

 

Von: Sonal Goyal [mailto:sonalgoyal4@gmail.com] 
Gesendet: Dienstag, 9. März 2010 12:06
An: hive-user@hadoop.apache.org
Betreff: Re: error in semantic analysis of join statement

 

Hi,

Let me explain this through an example:

Lets assume your table looks like:

aux1.objname  aux1.no_null
------------------  ----------------
AA                          1
AA                          2
AB                           1
AB                          3

When you do a select aux1.objname, aux1.no_null group by objname, you are grouping the AAs and the ABs together. However, you need an aggregate function over no_null so that you can get the value of no_null corresponding to the groups. When you use a group by over a column, other columns that you select either need to be grouped, or aggregated in some form. 

This is what is missing in your query.  
HTH.

Thanks and Regards,
Sonal



On Tue, Mar 9, 2010 at 4:20 PM, Jan Stöcker <Ja...@q2web.de> wrote:

Hi,

 

I am stuck with what is probably a beginner's mistake, but I simply don't know

what's wrong. I have two tables aux1 and aux2, with each two columns objname

(STRING) and no_null (INT).

I want to find all entries of objname appearing in both tables and gave hive the

following statement:

 

SELECT t1.objname, t1.no_null, t2.no_null, (t1.no_null + t2.no_null) AS null_sum FROM aux1 t1

JOIN aux2 t2 ON (t1.objname = t2.objname) GROUP BY t1.objname SORT BY null_sum LIMIT 30;

 

But I got the error message "Error in semantic analysis: line 1:19 Expression Not In Group By Key t1".

I don't really understand what that means. Anyone can help me?

 

Regards,

Jan

 

 

 


Re: error in semantic analysis of join statement

Posted by Sonal Goyal <so...@gmail.com>.
Hi,

Let me explain this through an example:

Lets assume your table looks like:

aux1.objname  aux1.no_null
------------------  ----------------
AA                          1
AA                          2
AB                           1
AB                          3

When you do a select aux1.objname, aux1.no_null group by objname, you are
grouping the AAs and the ABs together. However, you need an aggregate
function over no_null so that you can get the value of no_null corresponding
to the groups. When you use a group by over a column, other columns that you
select either need to be grouped, or aggregated in some form.

This is what is missing in your query.
HTH.

Thanks and Regards,
Sonal


On Tue, Mar 9, 2010 at 4:20 PM, Jan Stöcker <Ja...@q2web.de> wrote:

>  Hi,
>
>
>
> I am stuck with what is probably a beginner’s mistake, but I simply don’t
> know
>
> what’s wrong. I have two tables aux1 and aux2, with each two columns
> objname
>
> (STRING) and no_null (INT).
>
> I want to find all entries of objname appearing in both tables and gave
> hive the
>
> following statement:
>
>
>
> SELECT t1.objname, t1.no_null, t2.no_null, (t1.no_null + t2.no_null) AS
> null_sum FROM aux1 t1
>
> JOIN aux2 t2 ON (t1.objname = t2.objname) GROUP BY t1.objname SORT BY
> null_sum LIMIT 30;
>
>
>
> But I got the error message “Error in semantic analysis: line 1:19
> Expression Not In Group By Key t1”.
>
> I don’t really understand what that means. Anyone can help me?
>
>
>
> Regards,
>
> Jan
>
>
>
>
>