Posted to dev@hive.apache.org by "Xuefu Zhang (JIRA)" <ji...@apache.org> on 2014/09/16 23:45:34 UTC

[jira] [Commented] (HIVE-8043) Support merging small files [Spark Branch]

    [ https://issues.apache.org/jira/browse/HIVE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136300#comment-14136300 ] 

Xuefu Zhang commented on HIVE-8043:
-----------------------------------

[~lirui] The current Hive on Spark code borrows Tez's code for merging small files, which essentially falls back to the MR approach; please refer to GenSparkUtils.processFileSinkOperators() for details. I think we can look at HIVE-7704 to see whether anything similar can be done here. Please do the research and write up your findings. We don't need to implement it right away, as it's not critical for our M1.

> Support merging small files [Spark Branch]
> ------------------------------------------
>
>                 Key: HIVE-8043
>                 URL: https://issues.apache.org/jira/browse/HIVE-8043
>             Project: Hive
>          Issue Type: Task
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Rui Li
>              Labels: Spark-M1
>
> Hive currently supports merging small files when MR is the execution engine. Options that control this include:
> {code}
> hive.merge.mapfiles
> hive.merge.mapredfiles
> {code}
> hive.merge.sparkfiles was already introduced in HIVE-7810. To make it work, we might need a little more research and design.
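>
> As a sketch only, these options could be set per session as shown here. The values are illustrative assumptions, not recommended defaults, and hive.merge.sparkfiles may not take effect on the Spark branch until this issue is resolved.
> {code}
> -- Illustrative session settings (values are assumptions)
> SET hive.merge.mapfiles=true;
> SET hive.merge.mapredfiles=true;
> SET hive.merge.sparkfiles=true;
> {code}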


