Posted to dev@hive.apache.org by "Phabricator (JIRA)" <ji...@apache.org> on 2012/12/28 02:42:13 UTC

[jira] [Updated] (HIVE-3841) Sampling in previous MR for range partitioning of next RS

     [ https://issues.apache.org/jira/browse/HIVE-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-3841:
------------------------------

    Attachment: HIVE-3841.D7671.1.patch

navis requested code review of "HIVE-3841 [jira] Sampling in previous MR for range partitioning of next RS".
Reviewers: JIRA

  DPAL-1945 Sampling in previous MR for range partitioning of next RS

  Currently Hive enforces a single reducer for an ORDER BY clause, which can be a performance bottleneck.

  If the ordering key were sampled in the previous MR stage, the next RS could be range-partitioned on it, so multiple reducers could be assigned to the ORDER BY.
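
  The idea above can be sketched roughly as follows. This is only an illustration of sample-based range partitioning, not code from the attached patch: the class name RangePartitionSketch, the methods splitPoints and partition, and the use of plain int keys are all assumptions made for the example. It assumes a non-empty sample and at least one reducer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; names and types are not taken from the HIVE-3841 patch.
class RangePartitionSketch {

    // Pick (numReducers - 1) split points at evenly spaced positions of the sorted sample.
    // Assumes the sample is non-empty and numReducers >= 1.
    static int[] splitPoints(List<Integer> sample, int numReducers) {
        List<Integer> sorted = new ArrayList<>(sample);
        Collections.sort(sorted);
        int[] splits = new int[numReducers - 1];
        for (int i = 1; i < numReducers; i++) {
            splits[i - 1] = sorted.get(i * sorted.size() / numReducers);
        }
        return splits;
    }

    // Route a key to the reducer whose key range contains it.
    // Reducer r receives keys in [splits[r - 1], splits[r]), so reducer outputs
    // concatenated in reducer order are globally sorted.
    static int partition(int key, int[] splits) {
        int pos = Arrays.binarySearch(splits, key);
        // binarySearch returns (-(insertionPoint) - 1) when the key is absent.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    public static void main(String[] args) {
        // Keys sampled by the previous stage (made-up values).
        List<Integer> sample = Arrays.asList(42, 7, 19, 88, 3, 61, 25, 70, 14);
        int[] splits = splitPoints(sample, 3);
        System.out.println("splits = " + Arrays.toString(splits));
        for (int key : new int[] {1, 19, 50, 61, 99}) {
            System.out.println(key + " -> reducer " + partition(key, splits));
        }
    }
}
```

  With the made-up sample above and three reducers, the split points are 19 and 61, and the keys 1, 19, 50, 61, 99 go to reducers 0, 1, 1, 2, 2. Because routing is monotonic in the key, each reducer can sort its own range independently and the outputs still form a total order.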

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D7671

AFFECTED FILES
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/HiveTotalOrderPartitioner.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/PartitionSampler.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/SampleMerger.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/SamplingOptimizer.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/SamplingContext.java

To: JIRA, navis

                
> Sampling in previous MR for range partitioning of next RS
> ---------------------------------------------------------
>
>                 Key: HIVE-3841
>                 URL: https://issues.apache.org/jira/browse/HIVE-3841
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3841.D7671.1.patch
>
>
> Currently Hive enforces a single reducer for an ORDER BY clause, which can be a performance bottleneck.
> If the ordering key were sampled in the previous MR stage, the next RS could be range-partitioned on it, so multiple reducers could be assigned to the ORDER BY.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira