Posted to dev@hive.apache.org by "Ravi Teja Chilukuri (JIRA)" <ji...@apache.org> on 2017/04/10 11:34:41 UTC

[jira] [Created] (HIVE-16414) [Hive on Tez] Union queries resources efficiency less on Tez than Mapreduce

Ravi Teja Chilukuri created HIVE-16414:
------------------------------------------

             Summary: [Hive on Tez] Union queries resources efficiency less on Tez than Mapreduce
                 Key: HIVE-16414
                 URL: https://issues.apache.org/jira/browse/HIVE-16414
             Project: Hive
          Issue Type: Bug
          Components: Tez
    Affects Versions: 2.1.0
            Reporter: Ravi Teja Chilukuri


When a Hive union query whose subqueries all read the same table is run on both MapReduce and Tez, MapReduce reads the table only once, no matter how many reads of that table the query contains,
but Tez reads the same table multiple times, as multiple vertices.

If a table is to be read by X mappers,
Tez runs kX map tasks, where k is the number of subqueries reading from the same table, and
MapReduce runs X mappers no matter how many subqueries are present.
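
To make the arithmetic concrete, here is a minimal sketch of the task counts described above. The function name and the sample values (X = 100, k = 4) are hypothetical and chosen only for illustration; they are not taken from the attached explain plans.

```python
def map_task_counts(mappers_per_scan: int, subqueries: int) -> dict:
    """Map task counts per engine, following the behaviour reported above.

    mappers_per_scan -- X, the mappers needed to read the table once
    subqueries       -- k, the union branches reading that same table
    """
    if mappers_per_scan < 0 or subqueries < 1:
        raise ValueError("expected X >= 0 and k >= 1")
    return {
        "mr": mappers_per_scan,                 # table scanned once: X mappers
        "tez": subqueries * mappers_per_scan,   # one scan per branch: k * X tasks
    }


# Hypothetical example: 100 mappers per scan, 4 subqueries on the same table.
counts = map_task_counts(mappers_per_scan=100, subqueries=4)
print(counts)  # {'mr': 100, 'tez': 400}
```

Under these assumed numbers, Tez would launch four times as many map tasks as MapReduce for the same input.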


For such union queries, we need to fall back to MR instead of Tez.


*Query:*
http://pastebin.com/t6n91u6a

*Tez explain plan:*
http://pastebin.com/aWwVxhii

*MR explain plan:*
http://pastebin.com/iDbWwtKR



