Posted to dev@hive.apache.org by Dong Chen <do...@intel.com> on 2015/03/03 09:28:35 UTC
Review Request 31671: HIVE-8128: Improve Parquet Vectorization
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31671/
-----------------------------------------------------------
Review request for hive, Brock Noland, cheng xu, and Sergio Pena.
Repository: hive-git
Description
-------
This is a POC based on the new vectorized Parquet API at https://github.com/zhenxiao/incubator-parquet-mr/pull/1
I checked out the Parquet API code, made a small change, and then added the Hive changes. The vectorized read works locally, and I added a test to verify it.
This patch contains only the basic work. A list of TODOs is given in comments in the code.
Any feedback is welcome!
Diffs
-----
data/files/testParquetFile PRE-CREATION
pom.xml 75a41a4
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorColumnAssign.java 6a44c27
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorColumnAssignFactory.java c915f72
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormat.java 0391229
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/VectorizedParquetInputFormat.java d7edd52
ql/src/test/queries/clientpositive/vectorized_parquet_data_types.q PRE-CREATION
ql/src/test/results/clientpositive/vectorized_parquet_data_types.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/31671/diff/
Testing
-------
Added one test; the unit tests pass locally.
Thanks,
Dong Chen