Posted to dev@hive.apache.org by Chao Sun <ch...@cloudera.com> on 2014/07/31 02:00:10 UTC

Review Request 24127: Research to use groupby transformation to replace Hive existing partitionByKey and SparkCollector combination

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24127/
-----------------------------------------------------------

Review request for hive.


Repository: hive-git


Description
-------

An attempt to fix the last patch by moving the groupBy operation into ShuffleTran.
Also, since SparkTran::transform may now have input/output value types other than BytesWritable, it needs to be made generic as well.
This patch also adds a CompTran class, which is basically a composition of transformations. It offers better type compatibility than ChainedTran.
This is NOT a perfect solution, and it may be subject to further change.
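
To illustrate the idea of a generic transformation and a type-checked composition, here is a minimal sketch. It is not the code in this patch: the names Tran and Comp are hypothetical stand-ins for SparkTran and CompTran, and plain java.util.List is used in place of Spark RDDs so that the example runs without a Spark dependency.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class TranSketch {

    /** Hypothetical stand-in for a generic SparkTran: turns a list of I into a list of O. */
    interface Tran<I, O> {
        List<O> transform(List<I> input);
    }

    /**
     * Hypothetical stand-in for CompTran: composes two transformations.
     * The intermediate type M must match on both sides, so a mismatch
     * is rejected by the compiler rather than failing at run time.
     */
    static final class Comp<I, M, O> implements Tran<I, O> {
        private final Tran<I, M> first;
        private final Tran<M, O> second;

        Comp(Tran<I, M> first, Tran<M, O> second) {
            this.first = first;
            this.second = second;
        }

        @Override
        public List<O> transform(List<I> input) {
            // Feed the output of the first transformation into the second.
            return second.transform(first.transform(input));
        }
    }

    /** Maps each word to its length, then doubles each length. */
    static List<Integer> run(List<String> words) {
        Tran<String, Integer> lengths =
            in -> in.stream().map(String::length).collect(Collectors.toList());
        Tran<Integer, Integer> doubled =
            in -> in.stream().map(n -> n * 2).collect(Collectors.toList());
        return new Comp<>(lengths, doubled).transform(words);
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("hive", "spark"))); // prints [8, 10]
    }
}
```

With the intermediate type carried as a type parameter, composing a transformation whose output type does not match the next one's input type fails to compile, which is the kind of type compatibility a non-generic chain cannot check.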


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ChainedTran.java 4991568 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/CompTran.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunction.java 01a70e9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunction.java 841db87 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/IdentityTran.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapTran.java 98d08e6 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ReduceTran.java d1af86d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ShuffleTran.java 33e7d45 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java cf85af1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 440dd93 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTran.java 6aa732f 

Diff: https://reviews.apache.org/r/24127/diff/


Testing
-------


Thanks,

Chao Sun