Posted to dev@hive.apache.org by "Ning Zhang (JIRA)" <ji...@apache.org> on 2010/08/19 20:57:16 UTC
[jira] Commented: (HIVE-1567) increase hive.mapjoin.maxsize to 10 million
[ https://issues.apache.org/jira/browse/HIVE-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900397#action_12900397 ]
Ning Zhang commented on HIVE-1567:
----------------------------------
The purpose of hive.mapjoin.maxsize is not speed but limiting memory consumption. We saw OOM exceptions quite often before this parameter was introduced. Rather than increasing it blindly, a better approach may be to estimate how many rows fit into memory, based on the row size and the available memory, and adjust this parameter automatically.
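To make the estimate concrete, here is a rough sketch of the arithmetic in Python. It is not Hive code: the function name, the safety fraction, and the per-row overhead are hypothetical values chosen for illustration, and a real implementation would have to measure row size and free heap inside the map task.

```python
def estimate_max_rows(available_bytes, avg_row_bytes,
                      safety_fraction=0.5, per_row_overhead_bytes=64):
    """Estimate how many rows fit in memory for a map join.

    All parameters are illustrative assumptions, not Hive settings:
    - available_bytes: memory the task could use for the in-memory table
    - avg_row_bytes: average serialized size of one row
    - safety_fraction: share of available memory we allow the table to use
    - per_row_overhead_bytes: guessed bookkeeping cost per row (hash table
      entry, object headers, and so on)
    """
    if available_bytes <= 0 or avg_row_bytes <= 0:
        raise ValueError("available_bytes and avg_row_bytes must be positive")
    if not 0 < safety_fraction <= 1:
        raise ValueError("safety_fraction must be in (0, 1]")
    if per_row_overhead_bytes < 0:
        raise ValueError("per_row_overhead_bytes must not be negative")

    # Only a fraction of the memory is budgeted, leaving headroom for
    # everything else the task needs.
    budget = int(available_bytes * safety_fraction)
    return budget // (avg_row_bytes + per_row_overhead_bytes)


if __name__ == "__main__":
    # Example: 1 GiB available, rows averaging 200 bytes.
    rows = estimate_max_rows(1024 ** 3, 200)
    print("suggested row limit:", rows)
```

With an estimate like this, a wide table gets a lower row limit and a narrow table a higher one, instead of one fixed value for every join.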
> increase hive.mapjoin.maxsize to 10 million
> -------------------------------------------
>
> Key: HIVE-1567
> URL: https://issues.apache.org/jira/browse/HIVE-1567
> Project: Hadoop Hive
> Issue Type: Improvement
> Reporter: He Yongqiang
>
> I saw that on a very wide table, Hive can process 1 million rows in less than one minute (selecting all columns).
> Setting hive.mapjoin.maxsize to 100k is too restrictive. Let's increase it to 10 million.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.