Posted to dev@hive.apache.org by "Szehon Ho (JIRA)" <ji...@apache.org> on 2014/10/28 23:45:34 UTC

[jira] [Created] (HIVE-8639) Convert SMBJoin to MapJoin [Spark Branch]

Szehon Ho created HIVE-8639:
-------------------------------

             Summary: Convert SMBJoin to MapJoin [Spark Branch]
                 Key: HIVE-8639
                 URL: https://issues.apache.org/jira/browse/HIVE-8639
             Project: Hive
          Issue Type: Sub-task
            Reporter: Szehon Ho


HIVE-8202 supports auto-conversion of SMB join. However, if the tables are partitioned, there could be a slowdown, as each mapper would need to read a very small chunk of a partition that has a single key. Thus, in some scenarios it is beneficial to convert the SMB join to a map join.

The task is to research and support the conversion from SMB join to map join for the Spark execution engine. See the MapReduce equivalent in SortMergeJoinResolver.
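
To illustrate the kind of decision involved, here is a minimal sketch in Python. It is not Hive code and is not how SortMergeJoinResolver is implemented; the names TableStats, choose_join and small_table_limit_bytes are hypothetical. It assumes the conversion is worthwhile only when every table except the largest one fits within a configured memory limit.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TableStats:
    """Hypothetical per-table statistics used only for this sketch."""
    name: str
    size_bytes: int


def choose_join(tables: List[TableStats], small_table_limit_bytes: int) -> str:
    """Pick a join strategy for tables already eligible for SMB join.

    Assumption for illustration: the largest table is streamed as the
    "big" table and all other tables must be loaded into memory for a
    map join. If their combined size fits under the limit, prefer the
    map join; otherwise keep the SMB join.
    """
    if len(tables) < 2:
        raise ValueError("a join needs at least two tables")

    # Treat the largest table as the streamed (big) table.
    big = max(tables, key=lambda t: t.size_bytes)
    small_total = sum(t.size_bytes for t in tables) - big.size_bytes

    return "MAPJOIN" if small_total <= small_table_limit_bytes else "SMBJOIN"
```

For example, with a 10 MB limit, joining a 1 GB table with a 1 MB table would select the map join in this sketch, while joining two 1 GB tables would keep the SMB join. A real implementation would also need to account for partitioning and bucketing metadata, which this sketch ignores.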



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)