Posted to dev@hive.apache.org by "John Sichi (JIRA)" <ji...@apache.org> on 2010/07/02 17:43:50 UTC

[jira] Commented: (HIVE-1395) Table aliases are ambiguous

    [ https://issues.apache.org/jira/browse/HIVE-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12884724#action_12884724 ] 

John Sichi commented on HIVE-1395:
----------------------------------

Actually, after thinking about it some more, it's not practical to prevent alias reuse, even in strict mode.  Here's why.

Suppose I have

CREATE VIEW V AS SELECT * FROM BLAH T1 JOIN FLUB T2 ON T1.J=T2.K;

SELECT * FROM V T1 WHERE T1.X=3;

When we expand the view reference in the query, we'll end up with

SELECT * FROM (
    SELECT * FROM BLAH T1 JOIN FLUB T2 ON T1.J=T2.K
) V T1 WHERE T1.X=3;

And now in the expansion, T1 is legitimately duplicated, even though the person querying the view didn't even know that T1 was used inside the view definition (and in general, the reuse could be buried very deep in nested views).

Expanding the view in this way is what allows us to do a lot of optimizations such as pushing predicates (e.g. T1.X=3) all the way down into the view. 
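
For anyone who wants to try the scoping argument above, here is a minimal sketch. It uses SQLite through Python's sqlite3 module purely as a stand-in, so it says nothing about how Hive itself resolves aliases. The column layout of BLAH and FLUB and the sample rows are made up for illustration, and the hand-expanded query gives the subquery the single alias T1. It checks that the view query and its expansion return the same rows even though T1 appears at two nesting levels.

```python
import sqlite3

# Hypothetical schema and rows; only the table, view, and alias names
# come from the example in this comment.
SETUP = """
CREATE TABLE blah (j INTEGER, x INTEGER);
CREATE TABLE flub (k INTEGER, y TEXT);
INSERT INTO blah VALUES (1, 3), (2, 4);
INSERT INTO flub VALUES (1, 'one'), (2, 'two');
CREATE VIEW v AS SELECT * FROM blah t1 JOIN flub t2 ON t1.j = t2.k;
"""

# The query as the user writes it: T1 names the view.
VIEW_QUERY = "SELECT * FROM v t1 WHERE t1.x = 3"

# The same query with the view body expanded by hand: T1 now appears
# both inside the subquery (for blah) and outside it (for the subquery).
EXPANDED_QUERY = """
SELECT * FROM (
    SELECT * FROM blah t1 JOIN flub t2 ON t1.j = t2.k
) t1 WHERE t1.x = 3
"""


def run_query(sql):
    """Run one query against a fresh in-memory database and return all rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_query(VIEW_QUERY))
    print(run_query(EXPANDED_QUERY))
```

In this sketch the outer T1 resolves to the subquery and the inner T1 to blah, which is the nested-scope behavior that makes the duplicated alias legitimate.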


> Table aliases are ambiguous
> ---------------------------
>
>                 Key: HIVE-1395
>                 URL: https://issues.apache.org/jira/browse/HIVE-1395
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.6.0
>            Reporter: Adam Kramer
>             Fix For: 0.6.0, 0.7.0
>
>
> Consider this query:
> SELECT a.num FROM (
>   SELECT a.num AS num, b.num AS num2
>   FROM foo a LEFT OUTER JOIN bar b ON a.num=b.num
> ) a
> WHERE a.num2 IS NULL;
> ...in this case, the table alias 'a' is ambiguous. It could be the outer table (i.e., the subquery result), or it could be the inner table (foo).
> In the above case, Hive silently parses the outer reference to a as the inner reference. The result, then, is akin to:
> SELECT foo.num FROM foo WHERE bar.num IS NULL. This is bad.
> The bigger problem, however, is that Hive even lets people use the same table alias at multiple points in the query. We should simply throw an exception during the parse stage if there is any ambiguity in which table is which, just like we do if the column names are ambiguous.
> Or, if for some reason we need people to be able to use 'a' to refer to multiple tables or subqueries, it would be excellent if the exact parsing structure were made clear and added to the wiki. In that case, I will file a separate bug JIRA to complain about how it should be different. :)
