Posted to dev@hive.apache.org by "Namit Jain (JIRA)" <ji...@apache.org> on 2009/11/13 03:44:39 UTC
[jira] Commented: (HIVE-929) hive.map.mergefiles increases the size in some cases
[ https://issues.apache.org/jira/browse/HIVE-929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777348#action_12777348 ]
Namit Jain commented on HIVE-929:
---------------------------------
Currently, merging is controlled by a single size parameter,
"hive.merge.size.per.task",
whose default value is 256M.
We should add another parameter,
"hive.merge.smallfiles.avgsize",
whose default value can be much smaller, say 16M.
We will merge only if the current average file size is less than "hive.merge.smallfiles.avgsize".
This ensures that merging happens only when the output files are, on average, very small.
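
The proposed check can be sketched roughly as follows. This is only an illustration of the rule described above, not Hive code: the function name should_merge and the constant names are hypothetical, and reading "16M" as 16 * 1024 * 1024 bytes is an assumption.

```python
from typing import Sequence

MB = 1024 * 1024  # assumption: "M" in the comment is read as 2**20 bytes

# Proposed default for "hive.merge.smallfiles.avgsize" (16M, per the comment).
SMALLFILES_AVG_SIZE = 16 * MB


def should_merge(file_sizes: Sequence[int],
                 avg_size_threshold: int = SMALLFILES_AVG_SIZE) -> bool:
    """Hypothetical helper: merge only if the average file size is below the threshold."""
    if not file_sizes:
        return False  # no output files, so there is nothing to merge
    average = sum(file_sizes) / len(file_sizes)
    # Strict comparison, matching "average size of a file < avgsize".
    return average < avg_size_threshold


if __name__ == "__main__":
    print(should_merge([2 * MB] * 100))   # many small files
    print(should_merge([64 * MB] * 10))   # files already reasonably large
```

With this rule, a job that writes one hundred 2M files would trigger a merge, while a job that writes ten 64M files would be left alone, even though both are far below "hive.merge.size.per.task".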
> hive.map.mergefiles increases the size in some cases
> ----------------------------------------------------
>
> Key: HIVE-929
> URL: https://issues.apache.org/jira/browse/HIVE-929
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: Namit Jain
>
> Due to random clustering, the size is increased in some cases.