Posted to dev@hive.apache.org by "Phabricator (JIRA)" <ji...@apache.org> on 2012/12/28 02:42:13 UTC
[jira] [Updated] (HIVE-3841) Sampling in previous MR for range partitioning of next RS
[ https://issues.apache.org/jira/browse/HIVE-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Phabricator updated HIVE-3841:
------------------------------
Attachment: HIVE-3841.D7671.1.patch
navis requested code review of "HIVE-3841 [jira] Sampling in previous MR for range partitioning of next RS".
Reviewers: JIRA
DPAL-1945 Sampling in previous MR for range partitioning of next RS
Currently Hive enforces a single reducer for an ORDER BY clause, which can be a performance bottleneck.
If the ordering key could be sampled in the previous MR stage, multiple reducers could be assigned to the ORDER BY through range partitioning.
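The idea above can be illustrated with a minimal sketch, which is not taken from the attached patch. It assumes integer ordering keys and an in-memory sample. The class and method names (RangePartitionSketch, cutPoints, partitionFor) are hypothetical and do not correspond to HiveTotalOrderPartitioner, PartitionSampler, or any other file in this revision. Sampled keys are sorted, evenly spaced cut points are chosen, and each key is routed to the reducer whose range contains it, so concatenating the reducer outputs in reducer order gives a total order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; none of these names come from the HIVE-3841 patch.
final class RangePartitionSketch {

    /** Picks numReducers - 1 cut points at evenly spaced ranks of the sorted sample. */
    static int[] cutPoints(List<Integer> sampledKeys, int numReducers) {
        if (numReducers < 1) {
            throw new IllegalArgumentException("numReducers must be at least 1");
        }
        if (sampledKeys.isEmpty() && numReducers > 1) {
            throw new IllegalArgumentException("a sample is needed to pick cut points");
        }
        List<Integer> sorted = new ArrayList<>(sampledKeys);
        Collections.sort(sorted);
        int[] cuts = new int[numReducers - 1];
        for (int i = 1; i < numReducers; i++) {
            // The rank i/numReducers of the way through the sample approximates that quantile.
            int idx = (int) ((long) i * sorted.size() / numReducers);
            cuts[i - 1] = sorted.get(idx);
        }
        return cuts;
    }

    /** Returns the reducer index for a key: keys below cuts[0] go to reducer 0, and so on. */
    static int partitionFor(int key, int[] cuts) {
        int pos = Arrays.binarySearch(cuts, key);
        // A key equal to a cut point starts the next range; otherwise use the insertion point.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    public static void main(String[] args) {
        // Keys that the previous stage is assumed to have sampled.
        List<Integer> sample = Arrays.asList(40, 10, 30, 20, 60, 50);
        int[] cuts = cutPoints(sample, 3);
        System.out.println("cut points: " + Arrays.toString(cuts));
        for (int key : new int[] {5, 30, 45, 50, 100}) {
            System.out.println("key " + key + " -> reducer " + partitionFor(key, cuts));
        }
    }
}
```

Because every key sent to reducer i is smaller than every key sent to reducer i + 1, each reducer can sort its own range independently. How evenly the work is split depends on how representative the sample is.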
TEST PLAN
EMPTY
REVISION DETAIL
https://reviews.facebook.net/D7671
AFFECTED FILES
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java
ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
ql/src/java/org/apache/hadoop/hive/ql/exec/HiveTotalOrderPartitioner.java
ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
ql/src/java/org/apache/hadoop/hive/ql/exec/PartitionSampler.java
ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java
ql/src/java/org/apache/hadoop/hive/ql/exec/SampleMerger.java
ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java
ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java
ql/src/java/org/apache/hadoop/hive/ql/optimizer/SamplingOptimizer.java
ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java
ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java
ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java
ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java
ql/src/java/org/apache/hadoop/hive/ql/plan/SamplingContext.java
To: JIRA, navis
> Sampling in previous MR for range partitioning of next RS
> ---------------------------------------------------------
>
> Key: HIVE-3841
> URL: https://issues.apache.org/jira/browse/HIVE-3841
> Project: Hive
> Issue Type: Improvement
> Components: Query Processor
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3841.D7671.1.patch
>
>
> Currently Hive enforces a single reducer for an ORDER BY clause, which can be a performance bottleneck.
> If the ordering key could be sampled in the previous MR stage, multiple reducers could be assigned to the ORDER BY through range partitioning.