Posted to issues@hive.apache.org by "Barna Zsombor Klara (JIRA)" <ji...@apache.org> on 2017/06/21 10:09:00 UTC
[jira] [Comment Edited] (HIVE-16559) Parquet schema evolution for
partitioned tables may break if table and partition serdes differ
[ https://issues.apache.org/jira/browse/HIVE-16559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16057286#comment-16057286 ]
Barna Zsombor Klara edited comment on HIVE-16559 at 6/21/17 10:08 AM:
----------------------------------------------------------------------
Failures are unrelated:
- HIVE-16908 - for HCat failures
- HIVE-16785 - is taking care of replication failures
- HIVE-15776 - for vector_if_expr
- HIVE-16931 - created for the failing PerfTests as they have been failing for close to 100 runs.
orc_ppd_basic and explainanalyze_2 are passing locally.
was (Author: zsombor.klara):
Failures are unrelated:
- HIVE-16908 - for HCat failures
- HIVE-16785 - is taking care of replication failures
- HIVE-15776 - for vector_if_expr
- HIVE-16931 - created for the failing PerfTests as they have been failing for close to 100 runs.
> Parquet schema evolution for partitioned tables may break if table and partition serdes differ
> ----------------------------------------------------------------------------------------------
>
> Key: HIVE-16559
> URL: https://issues.apache.org/jira/browse/HIVE-16559
> Project: Hive
> Issue Type: Bug
> Components: Serializers/Deserializers
> Reporter: Barna Zsombor Klara
> Assignee: Barna Zsombor Klara
> Fix For: 3.0.0
>
> Attachments: HIVE-16559.01.patch, HIVE-16559.02.patch, HIVE-16559.03.patch, HIVE-16559.04.patch, HIVE-16559.05.patch
>
>
> Parquet schema evolution should make it possible to have partitions/tables backed by files with different schemas. Hive should match the table columns with file columns based on the column name if possible.
> However, if the serde for a table is missing columns that are present in the serde of a partition, Hive fails to match the columns.
> Steps to reproduce:
> {code}
> CREATE TABLE myparquettable_parted
> (
> name string,
> favnumber int,
> favcolor string,
> age int,
> favpet string
> )
> PARTITIONED BY (day string)
> STORED AS PARQUET;
> INSERT OVERWRITE TABLE myparquettable_parted
> PARTITION(day='2017-04-04')
> SELECT
> 'mary' as name,
> 5 AS favnumber,
> 'blue' AS favcolor,
> 35 AS age,
> 'dog' AS favpet;
> ALTER TABLE myparquettable_parted
> REPLACE COLUMNS
> (
> favnumber int,
> age int
> ); -- No CASCADE option, so the partition will not be altered.
> {code}
> {{SELECT * FROM myparquettable_parted where day='2017-04-04';}}
> will fail with:
> {{java.lang.UnsupportedOperationException: Cannot inspect org.apache.hadoop.io.IntWritable}}
> Hive should either match the columns together or prevent the user from dropping columns from the table.
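
The name-based matching requested in the description can be sketched roughly as follows. This is an illustration in Python, not Hive's actual Java implementation: the function names `match_columns_by_name` and `match_columns_by_position` and the list-based row representation are hypothetical, and the positional variant is shown only for contrast (the issue does not say which strategy Hive falls back to).

```python
def match_columns_by_name(table_columns, file_columns, file_row):
    """Project a file row onto the table schema by column name.

    table_columns: column names in the (possibly altered) table schema
    file_columns:  column names as written in the data file
    file_row:      values in file column order
    Columns that the file does not contain come back as None.
    """
    by_name = dict(zip(file_columns, file_row))
    return [by_name.get(name) for name in table_columns]


def match_columns_by_position(table_columns, file_row):
    """Naive positional projection, for contrast only."""
    return list(file_row[:len(table_columns)])


# Schema and row from the reproduction steps in the issue.
file_columns = ["name", "favnumber", "favcolor", "age", "favpet"]
file_row = ["mary", 5, "blue", 35, "dog"]
table_columns = ["favnumber", "age"]  # table schema after REPLACE COLUMNS

print(match_columns_by_name(table_columns, file_columns, file_row))
print(match_columns_by_position(table_columns, file_row))
```

With the reduced table schema, the name-based projection picks `favnumber` and `age` out of the five-column file row, while the positional one returns the first two file values regardless of their names or types.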