You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Matt McCline (JIRA)" <ji...@apache.org> on 2018/03/21 23:27:00 UTC
[jira] [Updated] (HIVE-19015) Vectorization and Parquet: When
vectorized, parquet_map_of_arrays_of_ints.q gets a ClassCastException
[ https://issues.apache.org/jira/browse/HIVE-19015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matt McCline updated HIVE-19015:
--------------------------------
Description:
Adding "SET hive.vectorized.execution.enabled=true;" to parquet_map_of_arrays_of_ints.q triggers this call stack:
{noformat}
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo cannot be cast to org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo
at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.readBatch(VectorizedListColumnReader.java:67) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedMapColumnReader.readBatch(VectorizedMapColumnReader.java:57) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:410) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
{noformat}
FYI: [~vihangk1]
Adding parquet_map_of_maps.q, too. Stack trace seems related.
{noformat}
Caused by: java.lang.ClassCastException: optional group value (MAP) {
repeated group key_value {
optional binary key (UTF8);
required int32 value;
}
} is not primitive
at org.apache.parquet.schema.Type.asPrimitiveType(Type.java:213) ~[parquet-hadoop-bundle-1.9.0.jar:1.9.0]
at org.apache.hadoop.hive.ql.io.parquet.vector.BaseVectorizedColumnReader.<init>(BaseVectorizedColumnReader.java:130) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.<init>(VectorizedListColumnReader.java:52) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:568) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:440) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:401) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
{noformat}
was:
Adding "SET hive.vectorized.execution.enabled=true;" to parquet_map_of_arrays_of_ints.q triggers this call stack:
{noformat}
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo cannot be cast to org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo
at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.readBatch(VectorizedListColumnReader.java:67) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedMapColumnReader.readBatch(VectorizedMapColumnReader.java:57) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:410) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
{noformat}
FYI: [~vihangk1]
> Vectorization and Parquet: When vectorized, parquet_map_of_arrays_of_ints.q gets a ClassCastException
> -----------------------------------------------------------------------------------------------------
>
> Key: HIVE-19015
> URL: https://issues.apache.org/jira/browse/HIVE-19015
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 3.0.0
> Reporter: Matt McCline
> Priority: Critical
>
> Adding "SET hive.vectorized.execution.enabled=true;" to parquet_map_of_arrays_of_ints.q triggers this call stack:
> {noformat}
> Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo cannot be cast to org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo
> at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.readBatch(VectorizedListColumnReader.java:67) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedMapColumnReader.readBatch(VectorizedMapColumnReader.java:57) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:410) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> {noformat}
> FYI: [~vihangk1]
> Adding parquet_map_of_maps.q, too. Stack trace seems related.
> {noformat}
> Caused by: java.lang.ClassCastException: optional group value (MAP) {
> repeated group key_value {
> optional binary key (UTF8);
> required int32 value;
> }
> } is not primitive
> at org.apache.parquet.schema.Type.asPrimitiveType(Type.java:213) ~[parquet-hadoop-bundle-1.9.0.jar:1.9.0]
> at org.apache.hadoop.hive.ql.io.parquet.vector.BaseVectorizedColumnReader.<init>(BaseVectorizedColumnReader.java:130) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.<init>(VectorizedListColumnReader.java:52) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:568) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:440) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:401) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)