Posted to dev@hive.apache.org by Sukhendu Chakraborty <su...@gmail.com> on 2013/12/04 02:45:11 UTC

map join in subqueries

Hi,

Is there any way to make mapjoin work on a subquery (rather than on the
underlying table)? I have the following query:

select external_id, count(category_id)
from catalog_products_in_categories_orc pc
inner join (select * from catalog_products_orc where s_id=118) p
  on pc.product_id=p.id
group by external_id;


Now, even though catalog_products_orc is a big table, filtering on
s_id=118 leaves very few rows, so the join could easily be optimized to a
mapjoin (with catalog_products_in_categories_orc as the big table and the
subquery result as the small table). However, when I specify
/*+MAPJOIN(p)*/ to enforce this, it results in a mapjoin on the table
catalog_products_orc (and not on the subquery after filtering).

Any ideas to achieve mapjoin on a subquery (and not the underlying table)?


-Sukhendu

Re: map join in subqueries

Posted by Navis류승우 <na...@nexr.com>.
What version are you using? After 0.11.0, the mapjoin hint is ignored by
default.

Use

set hive.ignore.mapjoin.hint=false;

if you want the mapjoin hint to be applied.
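
As a rough sketch, the setting combined with the query from the original post might look like the following. The placement of the hint directly after SELECT is an assumption (the original post does not show where it was written), and whether the hint then applies to the filtered subquery p rather than to the base table is not confirmed in this thread.

```sql
-- Re-enable MAPJOIN hints for this session (they are ignored by default).
set hive.ignore.mapjoin.hint=false;

-- Query from the original post, with the hint naming the subquery alias p.
-- Hint placement right after SELECT is assumed, not taken from the post.
select /*+ MAPJOIN(p) */ external_id, count(category_id)
from catalog_products_in_categories_orc pc
inner join (select * from catalog_products_orc where s_id=118) p
  on pc.product_id = p.id
group by external_id;
```

Running EXPLAIN on the query before and after changing the setting should show whether a map join operator is actually used.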

