Posted to user@hive.apache.org by Jie Li <ji...@cs.duke.edu> on 2013/01/05 04:46:42 UTC
Map-only aggregation
Hi all,
Can Hive run an aggregation as a map-only job? The data may already be
partitioned or bucketed on the grouping key (via PARTITIONED BY or
CLUSTERED BY), in which case the reduce phase should not be needed to
repartition it.
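To make the idea concrete, here is a minimal Python sketch of what I have in mind. It is not Hive code, and the function names (bucket_rows, map_only_sum, aggregate) are made up for illustration. It assumes the rows are bucketed by a hash of the grouping key, so each key lives in exactly one bucket and each "mapper" can finish its own groups without a shuffle.

```python
from collections import defaultdict


def bucket_rows(rows, num_buckets):
    """Stand-in for bucketing on the grouping key (hypothetical helper)."""
    buckets = [[] for _ in range(num_buckets)]
    for key, value in rows:
        # All rows with the same key land in the same bucket.
        buckets[hash(key) % num_buckets].append((key, value))
    return buckets


def map_only_sum(bucket):
    """One 'mapper': computes the complete SUM per key for its own bucket."""
    totals = defaultdict(int)
    for key, value in bucket:
        totals[key] += value
    return dict(totals)


def aggregate(rows, num_buckets=4):
    """Union of per-bucket results; no merge of partial sums is needed."""
    result = {}
    for bucket in bucket_rows(rows, num_buckets):
        partial = map_only_sum(bucket)
        # Buckets never share a key, so the outputs are disjoint.
        assert result.keys().isdisjoint(partial.keys())
        result.update(partial)
    return result
```

If the data were not bucketed on the grouping key, the same key could show up in several mappers and a reduce step would be needed to combine the partial sums.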
The bucket join already seems to take advantage of buckets for joins, so I
wonder whether there is a similar optimization for aggregations.
Thanks,
Jie