Posted to dev@hive.apache.org by "Shreepadma Venugopalan (JIRA)" <ji...@apache.org> on 2013/03/12 07:29:12 UTC
[jira] [Created] (HIVE-4153) Use number of distinct values to decide whether to perform map side aggregation
Shreepadma Venugopalan created HIVE-4153:
--------------------------------------------
Summary: Use number of distinct values to decide whether to perform map side aggregation
Key: HIVE-4153
URL: https://issues.apache.org/jira/browse/HIVE-4153
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.10.0, 0.9.0, 0.8.1, 0.8.0
Reporter: Shreepadma Venugopalan
Today, Hive performs map side aggregation by default. If the number of unique keys in the aggregation is small, map side aggregation is beneficial. However, if the number of keys is sufficiently large, it can lead to OOMEs (OutOfMemoryErrors). Upon encountering an OOME, hive.map.aggr has to be set to false to turn map side aggregation off. Instead, we can use the number of distinct values in the group by column, along with the number of rows in the table, to decide whether map side aggregation should be used.
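As a rough illustration of the proposal, the decision could be sketched as below. This is not Hive code: the function name, both thresholds (max_reduction_ratio and max_hash_entries) and their default values are assumptions made up for this example. It only shows how a distinct-value count and a row count might be combined into a yes/no decision.

```python
def should_use_map_side_aggregation(
    num_distinct_values,
    num_rows,
    max_reduction_ratio=0.5,     # assumed threshold, not a Hive default
    max_hash_entries=1_000_000,  # assumed cap on in-memory group-by keys
):
    """Hypothetical heuristic: decide whether map side aggregation is worthwhile.

    num_distinct_values: estimated distinct values in the group by column
    num_rows: estimated number of rows in the table
    """
    if num_rows <= 0:
        # No statistics or an empty table: keep the current default (enabled).
        return True
    if num_distinct_values > max_hash_entries:
        # Too many keys to hold in the mapper's hash table; this is the
        # situation the issue describes as leading to OOMEs.
        return False
    # Few distinct keys relative to rows means the mapper collapses many
    # rows into few partial aggregates, so map side aggregation pays off.
    return num_distinct_values / num_rows <= max_reduction_ratio
```

With these assumed thresholds, a table with 1,000,000 rows and 100 distinct keys would keep map side aggregation, while the same table with 900,000 distinct keys would skip it, since almost every row would produce its own hash table entry.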
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira