Posted to dev@hive.apache.org by "Ning Zhang (JIRA)" <ji...@apache.org> on 2010/08/24 00:00:21 UTC
[jira] Created: (HIVE-1585) Customizable merge output size
Customizable merge output size
------------------------------
Key: HIVE-1585
URL: https://issues.apache.org/jira/browse/HIVE-1585
Project: Hadoop Hive
Issue Type: Improvement
Reporter: Ning Zhang
Currently, if hive.merge.[mapfiles|mapredfiles] is true, the merged output file size is determined by the input split size, which in turn is determined by mapred.min.split.size, mapred.min.split.size.per.[node|rack] and mapred.max.split.size. Sometimes it is desirable to have a different output file size than the input split size.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1585) Customizable merge output size
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901638#action_12901638 ]
Namit Jain commented on HIVE-1585:
----------------------------------
<property>
<name>hive.merge.size.per.task</name>
<value>256000000</value>
<description>Size of merged files at the end of the job</description>
</property>
<property>
<name>hive.merge.size.smallfiles.avgsize</name>
<value>16000000</value>
<description>When the average output file size of a job is less than this number, Hive will start an additional map-reduce job to merge the output files into bigger files. This is only done for map-only jobs if hive.merge.mapfiles is true, and for map-reduce jobs if hive.merge.mapredfiles is true.</description>
</property>
Don't the above parameters meet your criteria?
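For illustration, the trigger condition described in the hive.merge.size.smallfiles.avgsize description above can be sketched roughly as follows. This is only a sketch of the documented behaviour, not Hive's actual code: the function names needs_merge_job and estimated_merged_files are hypothetical, and the merged-file count is a rough estimate that assumes each merged file is filled up to hive.merge.size.per.task.

```python
# Hypothetical sketch of the merge decision described in the property
# descriptions above. The defaults mirror the quoted values; the function
# names are illustrative and do not exist in Hive.

MERGE_SIZE_PER_TASK = 256_000_000      # hive.merge.size.per.task
SMALLFILES_AVG_SIZE = 16_000_000       # hive.merge.size.smallfiles.avgsize


def needs_merge_job(output_file_sizes, avg_size_threshold=SMALLFILES_AVG_SIZE,
                    merge_enabled=True):
    """Return True when the average output file size is below the threshold.

    merge_enabled stands in for hive.merge.mapfiles (map-only jobs) or
    hive.merge.mapredfiles (map-reduce jobs).
    """
    if not merge_enabled or not output_file_sizes:
        return False
    average = sum(output_file_sizes) / len(output_file_sizes)
    return average < avg_size_threshold


def estimated_merged_files(output_file_sizes, size_per_task=MERGE_SIZE_PER_TASK):
    """Rough estimate (an assumption): total bytes divided by the target size,
    rounded up with integer arithmetic."""
    total = sum(output_file_sizes)
    return (total + size_per_task - 1) // size_per_task


if __name__ == "__main__":
    sizes = [1_000_000] * 100          # one hundred 1 MB output files
    print(needs_merge_job(sizes))
    print(estimated_merged_files(sizes))
```

With one hundred 1 MB output files the average is well below the 16 MB threshold, so under these assumptions the extra merge job would be started.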