Posted to dev@hive.apache.org by "Namit Jain (JIRA)" <ji...@apache.org> on 2012/12/24 07:06:12 UTC

[jira] [Commented] (HIVE-3833) object inspectors should be initialized based on partition metadata

    [ https://issues.apache.org/jira/browse/HIVE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13539201#comment-13539201 ] 

Namit Jain commented on HIVE-3833:
----------------------------------

Consider the following test:

set hive.input.format = org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;

create table partition_test_partitioned(key string, value string) partitioned by (dt string) stored as rcfile;

alter table partition_test_partitioned set serde 'org.apache.hadoop.hive.serde2.columnar.LazyBinaryColumnarSerDe';
insert overwrite table partition_test_partitioned partition(dt='1') select * from src where key = 238;

alter table partition_test_partitioned change key key int; 


The query:
select * from partition_test_partitioned where dt is not null;

returns:

50	val_238	1
50	val_238	1

This happens because the key column was serialized as a string column and is now being read as an integer.
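
One plausible reading of the value 50 in the output, offered as a sketch rather than a trace of Hive's actual code path: if the stored bytes of the string "238" are handed to an integer reader that uses a Hadoop-style variable-length int encoding, the first byte ('2', ASCII 0x32) decodes on its own as the value 50. The class and method names below are hypothetical and only model the single-byte case of that encoding; that the binary serde reads ints this way is an assumption here.

```java
import java.nio.charset.StandardCharsets;

public class StringReadAsInt {
    // Simplified single-byte case of a Hadoop-style variable-length int:
    // a first byte in the range [-112, 127] is itself the decoded value.
    // Multi-byte values are out of scope for this sketch.
    static int decodeSingleByteVInt(byte first) {
        if (first < -112) {
            throw new IllegalArgumentException("multi-byte vint not handled in this sketch");
        }
        return first;
    }

    // Hypothetical helper: take the bytes written for a string column
    // and read them as if the column were an int.
    static int misreadStringAsInt(String stored) {
        byte[] raw = stored.getBytes(StandardCharsets.UTF_8);
        return decodeSingleByteVInt(raw[0]);
    }

    public static void main(String[] args) {
        // "238" is stored as the bytes 0x32 0x33 0x38; only the first is consumed.
        System.out.println(misreadStringAsInt("238")); // prints 50
    }
}
```

Under this assumption the remaining bytes of the string are simply never interpreted, so any key starting with '2' would come back as 50.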
                
> object inspectors should be initialized based on partition metadata
> -------------------------------------------------------------------
>
>                 Key: HIVE-3833
>                 URL: https://issues.apache.org/jira/browse/HIVE-3833
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>
> Currently, different partitions can be picked up for the same input split based on the
> serdes, etc. Also, we don't allow the schema to be changed for LazyBinaryColumnarSerDe.
> Instead, different partitions should be part of the same split only if the
> partition schemas match exactly. The operator tree object inspectors should be based
> on the partition schema. That would give greater flexibility and also help in using a binary serde with rcfile.
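
The grouping rule proposed in the description, that partitions share a split only when their schemas match exactly, can be sketched as follows. This is an illustration under stated assumptions, not Hive's implementation: `PartitionInfo`, `schemaKey`, and `groupBySchema` are hypothetical names, and "schema" is modeled here as the serde class name plus the ordered column types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SplitGrouping {
    // Hypothetical stand-in for partition metadata; not a Hive class.
    static final class PartitionInfo {
        final String name;
        final String serde;
        final List<String> columnTypes;

        PartitionInfo(String name, String serde, List<String> columnTypes) {
            this.name = name;
            this.serde = serde;
            this.columnTypes = columnTypes;
        }

        // Two partitions are combinable only if this key is identical.
        String schemaKey() {
            return serde + "|" + String.join(",", columnTypes);
        }
    }

    // Groups partition names so that each group holds exactly one schema.
    static List<List<String>> groupBySchema(List<PartitionInfo> partitions) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (PartitionInfo p : partitions) {
            groups.computeIfAbsent(p.schemaKey(), k -> new ArrayList<>()).add(p.name);
        }
        return new ArrayList<>(groups.values());
    }

    public static void main(String[] args) {
        String serde = "org.apache.hadoop.hive.serde2.columnar.LazyBinaryColumnarSerDe";
        List<PartitionInfo> parts = Arrays.asList(
            new PartitionInfo("dt=1", serde, Arrays.asList("string", "string")),
            new PartitionInfo("dt=2", serde, Arrays.asList("int", "string")),
            new PartitionInfo("dt=3", serde, Arrays.asList("string", "string")));
        // dt=1 and dt=3 may share a split; dt=2 must be read separately.
        System.out.println(groupBySchema(parts));
    }
}
```

With groups formed this way, each split carries a single partition schema, so the object inspectors for that split can be built from that schema rather than from the current table schema.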
