You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by kumar avijit <av...@oracle.com> on 2012/08/04 00:16:58 UTC
Parsing nested joins in Hive
Hi Hive users,
According to the grammar for Joins at
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins,
Hive should be able to parse joins where the table reference on the
right-hand side is itself a join. For instance, one should be able to
specify (a join (b join c)) as:
a LEFT SEMI JOIN b LEFT SEMI JOIN c ON b.x = c.x ON a.y = b.y
However, I have been unsuccessful in issuing such queries to Hive. I am
using Hive-0.9.0:
hive> select * from orderdetails left semi join orders left semi join
sites on (sites.siteid = orders.siteid) on (orderdetails.orderid =
orders.orderid);
FAILED: Parse Error: line 1:104 mismatched input 'on' expecting EOF near ')'
Explicit parenthesization of the join leads to parse error too --
parenthesis supposedly signals start of a subquery.
hive> select * from orderdetails left semi join ( orders left semi join
sites on (sites.siteid = orders.siteid) ) on
(orderdetails.orderid=orders.orderid);
FAILED: Parse Error: line 1:44 cannot recognize input near 'orders'
'left' 'semi' in subquery source
1. It seems to me that Hive parser currently only supports joins
specified in a left-recursive manner, i.e, rhs limited to a table name.
Is this correct? The manual merely says that all Joins are assumed
left-associative, but I could not find a mention of whether
parenthesization is allowed for nesting.
2. Are there plans to support arbitrarily nested joins in future, if not
currently allowed?
Regards,
Avijit
Re: Parsing nested joins in Hive
Posted by "Tucker, Matt" <Ma...@disney.com>.
Hi,
Try the following statement instead:
Select *
from orderdetails
left semi join (
select orderid
from orders
left semi join sites on
orders.siteid = sites.siteid
) orders on
orderdetails.orderid = orders.orderid;
Matt
On Aug 3, 2012, at 6:17 PM, "kumar avijit" <av...@oracle.com> wrote:
> select * from orderdetails left semi join ( orders left semi join
> sites on (sites.siteid = orders.siteid) ) on
> (orderdetails.orderid=orders.orderid);