Posted to dev@hive.apache.org by Chao Sun <ch...@cloudera.com> on 2014/07/31 02:00:10 UTC
Review Request 24127: Research to use groupby transformation to replace Hive
existing partitionByKey and SparkCollector combination
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24127/
-----------------------------------------------------------
Review request for hive.
Repository: hive-git
Description
-------
An attempt to fix the last patch by moving the groupBy operation into ShuffleTran.
Also, since SparkTran::transform may now have input/output value types other than BytesWritable, we need to make it generic as well.
Also added a CompTran class, which is basically a composition of transformations. It offers better type compatibility than ChainedTran.
This is NOT the perfect solution, and may be subject to further change.
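
To illustrate the idea of a generic transformation plus a type-checked composition, here is a minimal sketch in plain Java. It is not the code from the patch: Tran, ComposedTran, GroupByKeyTran, CountTran and TranDemo are hypothetical names, and in-memory lists and maps stand in for Spark RDDs so that the example runs without Spark or Hadoop on the classpath.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a generic transformation:
// I is the input type, O is the output type.
interface Tran<I, O> {
    O transform(I input);
}

// Hypothetical composition of two transformations. The compiler checks that
// the output type of `first` (M) matches the input type of `second`.
final class ComposedTran<I, M, O> implements Tran<I, O> {
    private final Tran<I, M> first;
    private final Tran<M, O> second;

    ComposedTran(Tran<I, M> first, Tran<M, O> second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public O transform(I input) {
        return second.transform(first.transform(input));
    }
}

// Stand-in for a shuffle step: groups values by key, keeping first-seen key order.
final class GroupByKeyTran<K, V>
        implements Tran<List<Map.Entry<K, V>>, Map<K, List<V>>> {
    @Override
    public Map<K, List<V>> transform(List<Map.Entry<K, V>> input) {
        Map<K, List<V>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : input) {
            grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        return grouped;
    }
}

// Stand-in for a reduce step: counts the values collected for each key.
final class CountTran<K, V> implements Tran<Map<K, List<V>>, Map<K, Integer>> {
    @Override
    public Map<K, Integer> transform(Map<K, List<V>> input) {
        Map<K, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<K, List<V>> e : input.entrySet()) {
            counts.put(e.getKey(), e.getValue().size());
        }
        return counts;
    }
}

final class TranDemo {
    // Builds a two-step plan (group, then count) and runs it on three rows.
    static Map<String, Integer> run() {
        List<Map.Entry<String, String>> rows = new ArrayList<>();
        rows.add(new AbstractMap.SimpleEntry<>("a", "x"));
        rows.add(new AbstractMap.SimpleEntry<>("b", "y"));
        rows.add(new AbstractMap.SimpleEntry<>("a", "z"));

        Tran<List<Map.Entry<String, String>>, Map<String, Integer>> plan =
                new ComposedTran<>(new GroupByKeyTran<String, String>(),
                                   new CountTran<String, String>());
        return plan.transform(rows);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this sketch, composing two steps whose intermediate types do not line up fails at compile time, which is one way to read the "better type compatibility" mentioned above.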
Diffs
-----
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ChainedTran.java 4991568
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/CompTran.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunction.java 01a70e9
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunction.java 841db87
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/IdentityTran.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapTran.java 98d08e6
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ReduceTran.java d1af86d
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ShuffleTran.java 33e7d45
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java cf85af1
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 440dd93
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTran.java 6aa732f
Diff: https://reviews.apache.org/r/24127/diff/
Testing
-------
Thanks,
Chao Sun