Posted to dev@hive.apache.org by "Ning Zhang (JIRA)" <ji...@apache.org> on 2011/02/09 23:13:57 UTC

[jira] Commented: (HIVE-1307) More generic and efficient merge method

    [ https://issues.apache.org/jira/browse/HIVE-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992741#comment-12992741 ] 

Ning Zhang commented on HIVE-1307:
----------------------------------

Sorry I missed your comment, Ted. Yes, this is a bug, and it was fixed in the current trunk (0.7-SNAPSHOT).

> More generic and efficient merge method
> ---------------------------------------
>
>                 Key: HIVE-1307
>                 URL: https://issues.apache.org/jira/browse/HIVE-1307
>             Project: Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>             Fix For: 0.6.0
>
>         Attachments: HIVE-1307.0.patch, HIVE-1307.2.patch, HIVE-1307.3.patch, HIVE-1307.3_java.patch, HIVE-1307.4.patch, HIVE-1307.5.patch, HIVE-1307.6.patch, HIVE-1307.7.patch, HIVE-1307.8.patch, HIVE-1307.9.patch, HIVE-1307.patch, HIVE-1307_2_branch_0.6.patch, HIVE-1307_branch_0.6.patch, HIVE-1307_java_only.patch
>
>
> Currently, if hive.merge.mapfiles or hive.merge.mapredfiles is set to true, a new MapReduce job is created to read the input files and write them to a single reducer for merging. This MR job is created at compile time, one MR job per partition. With dynamic partitions, multiple partitions can be created at execution time, so generating the merge MR job at compile time is impossible.
> We should generalize the merge framework to handle multiple partitions; most of the time a map-only job should be sufficient if we use CombineHiveInputFormat.
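
As a rough illustration of the idea in the description above (not Hive's actual implementation, and not the logic of CombineHiveInputFormat itself), the sketch below groups small output files by partition at run time and packs them into size-capped splits, each of which one map task could merge without a reducer. The function name plan_merge_splits, the tuple layout, and the max_split_bytes parameter are all hypothetical.

```python
from collections import defaultdict


def plan_merge_splits(files, max_split_bytes):
    """Group small files by partition, then pack them into splits.

    files: iterable of (partition, path, size_bytes) tuples, as they
    might be discovered after a dynamic-partition insert has finished.
    Returns a list of (partition, [paths]) splits; in this sketch each
    split stands for one map task that writes one merged file.
    """
    by_partition = defaultdict(list)
    for partition, path, size in files:
        by_partition[partition].append((path, size))

    splits = []
    for partition in sorted(by_partition):
        current, current_bytes = [], 0
        for path, size in by_partition[partition]:
            # Close the current split when this file would exceed the cap.
            if current and current_bytes + size > max_split_bytes:
                splits.append((partition, current))
                current, current_bytes = [], 0
            current.append(path)
            current_bytes += size
        if current:
            splits.append((partition, current))
    return splits


if __name__ == "__main__":
    discovered = [
        ("ds=1", "a", 40),
        ("ds=1", "b", 40),
        ("ds=1", "c", 40),
        ("ds=2", "d", 10),
    ]
    for partition, paths in plan_merge_splits(discovered, 100):
        print(partition, paths)
```

Because the partitions are only enumerated when the function is called, nothing in this plan has to be fixed at compile time, and files from different partitions never share a split.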

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira