Posted to dev@hive.apache.org by "Rui Li (JIRA)" <ji...@apache.org> on 2014/10/20 10:26:34 UTC

[jira] [Created] (HIVE-8518) Compile time skew join optimization returns duplicated results

Rui Li created HIVE-8518:
----------------------------

             Summary: Compile time skew join optimization returns duplicated results
                 Key: HIVE-8518
                 URL: https://issues.apache.org/jira/browse/HIVE-8518
             Project: Hive
          Issue Type: Bug
            Reporter: Rui Li


Compile time skew join optimization clones the join operator tree and unions the results of the original and the cloned join.
The problem is that the predicate for the cloned join is not inserted properly (the code relies on an assert statement).
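
To illustrate why a missing predicate on the clone leads to duplicates, here is a toy model in Python. It is not Hive code: the function names, the sample rows, and the assumption that one branch handles the skewed keys while the clone is supposed to handle only the remaining keys are all made up for illustration.

```python
# Toy model of a compile-time skew join split. Not Hive code; all names
# and data here are hypothetical and only illustrate the duplication.
from collections import Counter

def join(left, right):
    """Inner equi-join on the first element (the key) of each row."""
    return [(lk, lv, rv) for lk, lv in left for rk, rv in right if lk == rk]

def split_join(left, right, skewed_keys, filter_clone=True):
    # Branch 1: join restricted to the skewed keys.
    skewed = join([r for r in left if r[0] in skewed_keys],
                  [r for r in right if r[0] in skewed_keys])
    # Branch 2 (the clone): assumed to be meant for the non-skewed keys only.
    if filter_clone:
        rest_left = [r for r in left if r[0] not in skewed_keys]
        rest_right = [r for r in right if r[0] not in skewed_keys]
    else:
        # Predicate missing: the clone sees every row, skewed keys included.
        rest_left, rest_right = left, right
    rest = join(rest_left, rest_right)
    return skewed + rest  # union of both branches, duplicates kept

tbl1 = [("a", 1), ("a", 2), ("b", 3)]   # key "a" plays the skewed key
tbl2 = [("a", 10), ("b", 20)]

expected = join(tbl1, tbl2)
correct = split_join(tbl1, tbl2, {"a"})
buggy = split_join(tbl1, tbl2, {"a"}, filter_clone=False)

print(len(expected), len(correct), len(buggy))  # prints: 3 3 5
```

In this sketch the rows for the skewed key come out of both branches when the clone is unfiltered, which matches the duplicated results described in this issue.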

To reproduce the issue, run the simple query:
{code}select * from tbl1 join tbl2 on tbl1.key=tbl2.key;{code}
Assume tbl1 has some skew (skew can be specified with a CREATE or ALTER statement).
With hive.optimize.skewjoin.compiletime=true, the query returns duplicated results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)