Posted to user@hive.apache.org by Patrick Duin <pa...@gmail.com> on 2018/07/26 09:22:42 UTC

Parquet schema evolution, column conversion not supported

I'm encountering errors in Hive 2.3.2 when reading sets of Parquet files,
where the schema has evolved.

The error I'm seeing is:
Failed with exception java.io.IOException:java.lang.RuntimeException: Hive
internal error: conversion of string to array<string> not supported yet.

My schema has a top-level column of struct type that has changed from:

myColumn struct<c1:string, c2:string, c3:string>

To

myColumn struct<c1:string, c2:string, new_column:array<string>, c3:string>

I've updated my table with the new column type using the DDL below, but I
then see the aforementioned error when selecting the data.

I've tried to force column lookup by name rather than by index using the
setting:

parquet.column.index.access=false

But I see the same error. Is this kind of schema evolution (nested column
insertion) supported? What are my options for resolving this issue?

Many thanks,

Patrick.

Re: Parquet schema evolution, column conversion not supported

Posted by Patrick Duin <pa...@gmail.com>.
Replying to myself, as I found my issue: I had only updated the table schema
and hadn't updated the schema of my partitions. The error went away once I
updated the partitions as well. All data, both old and newly landed, was
then queryable.
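
For anyone hitting the same error, here is a rough sketch of how the
partition metadata might be brought in line with the table schema. The
thread does not say which statements were actually run, so treat this as
an assumption: the table name my_table and the partition column dt (with
its value) are hypothetical placeholders, and only the column definition
is taken from the message above.

```sql
-- Hypothetical names: my_table, dt and '2018-07-25' are placeholders.

-- Option 1: change the column on the table and propagate the change to
-- the metadata of all existing partitions (CASCADE, Hive 1.1.0 and later).
ALTER TABLE my_table
  CHANGE COLUMN myColumn myColumn
  struct<c1:string, c2:string, new_column:array<string>, c3:string>
  CASCADE;

-- Option 2: update the metadata of a single existing partition.
ALTER TABLE my_table PARTITION (dt='2018-07-25')
  CHANGE COLUMN myColumn myColumn
  struct<c1:string, c2:string, new_column:array<string>, c3:string>;

-- Inspect the column types recorded for that partition.
DESCRIBE FORMATTED my_table PARTITION (dt='2018-07-25');
```

Without CASCADE (the default is RESTRICT), a column change on the table
only touches the table-level metadata, so existing partitions keep the old
struct definition. That would be consistent with the behaviour described
above.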


