Posted to dev@hive.apache.org by "Gopal Vijayaraghavan (Jira)" <ji...@apache.org> on 2020/06/04 07:57:00 UTC

[jira] [Created] (HIVE-23609) SemiJoin: Relax big table size check for self-joins

Gopal Vijayaraghavan created HIVE-23609:
-------------------------------------------

             Summary: SemiJoin: Relax big table size check for self-joins
                 Key: HIVE-23609
                 URL: https://issues.apache.org/jira/browse/HIVE-23609
             Project: Hive
          Issue Type: Improvement
            Reporter: Gopal Vijayaraghavan


For self-joins, several of the heuristics normally applied to semijoins do not apply, because a difference in row counts between the two sides is likely to result in an actual reduction in the number of rows scanned.

This change results in slightly different Tez priorities for self-joins where one side is much more heavily filtered than the other. That helps ensure the scan of the smaller side completes before the scan of the bigger side starts consuming resources.
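
As a rough illustration of the idea (not the actual patch), the relaxed check could look something like the sketch below. Everything in it is an assumption made for illustration: the class name SemiJoinSizeCheck, the method keepSemiJoin, its parameters, and the threshold value are hypothetical and are not taken from the Hive code base. It only shows the decision described above: skip the big-table minimum-size test when both sides scan the same table, and apply it as usual otherwise.

```java
// Hypothetical sketch only: none of these names come from the Hive source.
final class SemiJoinSizeCheck {
    private SemiJoinSizeCheck() {}

    /**
     * Decide whether a semijoin reduction should be kept.
     *
     * @param filteredSideTable table scanned on the side that receives the runtime filter
     * @param sourceSideTable   table scanned on the side that produces the runtime filter
     * @param filteredSideRows  estimated row count on the receiving (big) side
     * @param minBigTableRows   assumed minimum size below which the reduction is normally dropped
     */
    static boolean keepSemiJoin(String filteredSideTable, String sourceSideTable,
                                long filteredSideRows, long minBigTableRows) {
        // A self-join scans the same table on both sides.
        boolean selfJoin = filteredSideTable.equalsIgnoreCase(sourceSideTable);
        if (selfJoin) {
            // Relaxed path: the big-table size check is skipped for self-joins.
            return true;
        }
        // Regular path: keep the reduction only if the big side is large enough.
        return filteredSideRows >= minBigTableRows;
    }
}
```

With a made-up threshold of 100,000,000 rows, a call such as keepSemiJoin("store_sales", "store_sales", 1000L, 100000000L) returns true under this sketch, while the same row count with two different table names returns false.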


