Posted to dev@hive.apache.org by "Barna Zsombor Klara (JIRA)" <ji...@apache.org> on 2017/04/28 10:07:04 UTC
[jira] [Created] (HIVE-16559) Parquet schema evolution for partitioned tables may break if table and partition serdes differ
Barna Zsombor Klara created HIVE-16559:
------------------------------------------
Summary: Parquet schema evolution for partitioned tables may break if table and partition serdes differ
Key: HIVE-16559
URL: https://issues.apache.org/jira/browse/HIVE-16559
Project: Hive
Issue Type: Bug
Reporter: Barna Zsombor Klara
Assignee: Barna Zsombor Klara
Parquet schema evolution should make it possible to have partitions/tables backed by files with different schemas. Hive should match the table columns with file columns based on the column name if possible.
However, if the serde for a table is missing columns that are present in the serde of a partition, Hive fails to match the columns together.
Steps to reproduce:
{code}
CREATE TABLE myparquettable_parted
(
  name string,
  favnumber int,
  favcolor string,
  age int,
  favpet string
)
PARTITIONED BY (day string)
STORED AS PARQUET;

INSERT OVERWRITE TABLE myparquettable_parted
PARTITION(day='2017-04-04')
SELECT
  'mary' AS name,
  5 AS favnumber,
  'blue' AS favcolor,
  35 AS age,
  'dog' AS favpet;

ALTER TABLE myparquettable_parted
REPLACE COLUMNS
(
  favnumber int,
  age int
); -- No CASCADE option, so the partition will not be altered.
{code}
{{SELECT * FROM myparquettable_parted where day='2017-04-04';}}
will fail with:
{{java.lang.UnsupportedOperationException: Cannot inspect org.apache.hadoop.io.IntWritable}}
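To confirm that the table and partition metadata have diverged after the steps above, the two column lists can be compared. This is only a sketch: it assumes the table and partition from the reproduction steps exist, and it uses the standard DESCRIBE statement with a partition spec.

```sql
-- Columns as recorded on the table
-- (expected: favnumber and age, plus the partition column day).
DESCRIBE myparquettable_parted;

-- Columns as recorded on the partition, which was not altered
-- (expected: the original five columns, plus day).
DESCRIBE myparquettable_parted PARTITION (day='2017-04-04');
```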
Hive should either match the columns together or prevent the user from dropping columns from the table.
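As a possible workaround until this is fixed, the column change could be cascaded to the existing partitions so that table and partition metadata stay in sync. This is an untested assumption, not something verified in this report; CASCADE is the existing option of ALTER TABLE ... REPLACE COLUMNS.

```sql
-- Assumption: cascading the change to the partition metadata avoids the mismatch.
ALTER TABLE myparquettable_parted
REPLACE COLUMNS
(
  favnumber int,
  age int
) CASCADE;
```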