
Posted to dev@hive.apache.org by "Namit Jain (JIRA)" <ji...@apache.org> on 2010/01/13 07:35:54 UTC

[jira] Commented: (HIVE-1047) Merge tasks in GenMRUnion1

    [ https://issues.apache.org/jira/browse/HIVE-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799610#action_12799610 ] 

Namit Jain commented on HIVE-1047:
----------------------------------

Looks good - just wanted to make sure that there is a test case that does a
union followed by a reduce sink (maybe a group by or something like that).



> Merge tasks in GenMRUnion1
> --------------------------
>
>                 Key: HIVE-1047
>                 URL: https://issues.apache.org/jira/browse/HIVE-1047
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>             Fix For: 0.6.0
>
>         Attachments: HIVE-1047.patch
>
>
> In the following query:
> from (select * from src union all select * from src) s
> insert overwrite table src_multi1 select * where key < 10
> insert overwrite table src_multi2 select * where key > 10 and key < 20;
> There are two topOps (TableScanOperator) for the same MapRed task. In genTableScan1, each TableScanOperator creates a new task as currTask. GenMRUnion1 should merge the two tasks into one. Currently GenMRUnion1 does not merge currTask, which forces downstream operators like genFileSink1 to resort to hacks to effectively merge the two tasks. A cleaner way is to merge the tasks in GenMRUnion1, as is done for join operators etc.
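
For readers following the description above, here is a minimal, self-contained sketch of the merging idea. It is not the code in HIVE-1047.patch and does not use Hive's real classes: Task, UnionContext and visitUnion are hypothetical stand-ins, and operators are modelled as plain strings. The assumption is only that each union input arrives with its own current task and that the union step should hand back a single shared task.

```java
import java.util.ArrayList;
import java.util.List;

public class UnionTaskMergeSketch {

    /** Hypothetical stand-in for a map-reduce task: just a list of top operators. */
    static final class Task {
        final List<String> topOps = new ArrayList<>();

        Task(String topOp) {
            topOps.add(topOp);
        }
    }

    /** Hypothetical per-union bookkeeping that survives across visits to the same union. */
    static final class UnionContext {
        Task unionTask; // null until the first input branch reaches the union
    }

    /**
     * Called once per input branch of the union. The first branch's task
     * becomes the union's task; each later branch has its top operators
     * moved into that task. The merged task is returned as the new current task.
     */
    static Task visitUnion(UnionContext ctx, Task currTask) {
        if (ctx.unionTask == null) {
            ctx.unionTask = currTask;
        } else if (ctx.unionTask != currTask) {
            ctx.unionTask.topOps.addAll(currTask.topOps);
            currTask.topOps.clear(); // the absorbed task is left empty
        }
        return ctx.unionTask;
    }

    public static void main(String[] args) {
        UnionContext ctx = new UnionContext();
        // Each table scan starts out with its own task, as in the description.
        Task first = visitUnion(ctx, new Task("TS[src#1]"));
        Task second = visitUnion(ctx, new Task("TS[src#2]"));
        System.out.println("same task: " + (first == second));
        System.out.println("top operators: " + second.topOps);
    }
}
```

With the merge done at the union, anything downstream of it (the file sinks in the query above) sees one current task holding both table scans and needs no special handling of its own.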

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.