You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Thomas Friedrich (JIRA)" <ji...@apache.org> on 2015/07/21 09:15:04 UTC

[jira] [Created] (HIVE-11326) Parquet table: where clause with partition column fails

Thomas Friedrich created HIVE-11326:
---------------------------------------

             Summary: Parquet table: where clause with partition column fails
                 Key: HIVE-11326
                 URL: https://issues.apache.org/jira/browse/HIVE-11326
             Project: Hive
          Issue Type: Bug
          Components: Query Planning
    Affects Versions: 1.2.0, 1.2.1
            Reporter: Thomas Friedrich


Steps:
create table t1 (c1 int) partitioned by (part string) stored as parquet;
insert into table t1 partition (part='p1') values (1);
select * from t1 where part='p1';

Error message:
Caused by: java.lang.IllegalArgumentException: Column [part] was not found in schema!
        at parquet.Preconditions.checkArgument(Preconditions.java:55)
        at parquet.filter2.predicate.SchemaCompatibilityValidator.getColumnDescriptor(SchemaCompatibilityValidator.java:190)
        at parquet.filter2.predicate.SchemaCompatibilityValidator.validateColumn(SchemaCompatibilityValidator.java:178)
        at parquet.filter2.predicate.SchemaCompatibilityValidator.validateColumnFilterPredicate(SchemaCompatibilityValidator.java:160)
        at parquet.filter2.predicate.SchemaCompatibilityValidator.visit(SchemaCompatibilityValidator.java:94)
        at parquet.filter2.predicate.SchemaCompatibilityValidator.visit(SchemaCompatibilityValidator.java:59)
        at parquet.filter2.predicate.Operators$Eq.accept(Operators.java:180)
        at parquet.filter2.predicate.SchemaCompatibilityValidator.validate(SchemaCompatibilityValidator.java:64)
        at parquet.filter2.compat.RowGroupFilter.visit(RowGroupFilter.java:59)
        at parquet.filter2.compat.RowGroupFilter.visit(RowGroupFilter.java:40)
        at parquet.filter2.compat.FilterCompat$FilterPredicateCompat.accept(FilterCompat.java:126)
        at parquet.filter2.compat.RowGroupFilter.filterRowGroups(RowGroupFilter.java:46)
        at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:275)
        at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:99)
        at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:85)
        at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:72)
        at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)

Seems that problem was introduced with HIVE-10252 ([~dongc]). Filter can't contain any partition columns in case of Parquet table. 
While searching for an existing JIRA, I found a similar problem reported for Spark - SPARK-6554

I think the setFilter method should remove all predicates that reference partition columns before building the FilterPredicate object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)