Posted to dev@hive.apache.org by "Grisha Trubetskoy (JIRA)" <ji...@apache.org> on 2014/09/22 02:53:34 UTC

[jira] [Commented] (HIVE-5823) Support for DECIMAL primitive type in AvroSerDe

    [ https://issues.apache.org/jira/browse/HIVE-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142801#comment-14142801 ] 

Grisha Trubetskoy commented on HIVE-5823:
-----------------------------------------

Just FYI, this patch breaks schema evolution. See this line: https://github.com/apache/hive/blob/2bb8ae7f352694f4becc9ff67e667620b2ee7fe9/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java#L269

{code}
return worker(datum, fileSchema == null ? null : fileSchema.getTypes().get(tag), schema,
    SchemaToTypeInfo.generateTypeInfo(schema));
{code}

The problem is that if fileSchema has evolved from a primitive type to a union (quite common when you add a default, e.g. what used to be {{"type":"string"}} becomes {{"type":["null","string"], "default":null}}), then the {{getTypes()}} call above throws a "not a union" exception. The correct logic should expect fileSchema to be null, a primitive type, or a union.

This appears to be fixed in the latest trunk (as part of the HIVE-6806 patch, which isn't related to the DECIMAL type in any way): https://github.com/apache/hive/blob/trunk/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java#L269

{code}
Schema currentFileSchema = null;
if (fileSchema != null) {
  currentFileSchema =
      fileSchema.getType() == Type.UNION ? fileSchema.getTypes().get(tag) : fileSchema;
}
return worker(datum, currentFileSchema, schema, SchemaToTypeInfo.generateTypeInfo(schema));
{code} 
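
To make the difference between the two snippets concrete, here is a minimal, self-contained sketch. It does not use Avro or Hive: {{MiniSchema}} is a hypothetical stand-in for {{org.apache.avro.Schema}} that copies only the one behavior that matters here (calling {{getTypes()}} on a non-union fails), and {{resolveUnguarded}} / {{resolveGuarded}} are illustrative names, not methods in AvroDeserializer. It assumes the case where the data file was written with the older {{"type":"string"}} schema while the table schema is already the {{["null","string"]}} union.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SchemaEvolutionSketch {

    enum Type { STRING, NULL, UNION }

    /** Hypothetical stand-in for org.apache.avro.Schema; only what this sketch needs. */
    static final class MiniSchema {
        private final Type type;
        private final List<MiniSchema> branches;

        private MiniSchema(Type type, List<MiniSchema> branches) {
            this.type = type;
            this.branches = branches;
        }

        static MiniSchema primitive(Type type) {
            return new MiniSchema(type, Collections.<MiniSchema>emptyList());
        }

        static MiniSchema union(MiniSchema... branches) {
            return new MiniSchema(Type.UNION, Arrays.asList(branches));
        }

        Type getType() {
            return type;
        }

        // Mirrors the behavior described in the comment: a non-union has no branches to list.
        List<MiniSchema> getTypes() {
            if (type != Type.UNION) {
                throw new IllegalStateException("Not a union: " + type);
            }
            return branches;
        }
    }

    /** Shape of the logic in the patch: a non-null file schema is assumed to be a union. */
    static MiniSchema resolveUnguarded(MiniSchema fileSchema, int tag) {
        return fileSchema == null ? null : fileSchema.getTypes().get(tag);
    }

    /** Shape of the logic on trunk: a non-union file schema is passed through unchanged. */
    static MiniSchema resolveGuarded(MiniSchema fileSchema, int tag) {
        if (fileSchema == null) {
            return null;
        }
        return fileSchema.getType() == Type.UNION ? fileSchema.getTypes().get(tag) : fileSchema;
    }

    public static void main(String[] args) {
        // Assumption: the file was written before the field became nullable.
        MiniSchema oldFileSchema = MiniSchema.primitive(Type.STRING);
        // Index of the "string" branch in the evolved ["null","string"] union.
        int tag = 1;

        try {
            resolveUnguarded(oldFileSchema, tag);
        } catch (IllegalStateException e) {
            System.out.println("unguarded lookup failed: " + e.getMessage());
        }
        System.out.println("guarded lookup returned: " + resolveGuarded(oldFileSchema, tag).getType());
    }
}
```

The guarded version covers the three cases listed above: null stays null, a union is indexed by tag, and anything else is handed on as-is.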

> Support for DECIMAL primitive type in AvroSerDe
> -----------------------------------------------
>
>                 Key: HIVE-5823
>                 URL: https://issues.apache.org/jira/browse/HIVE-5823
>             Project: Hive
>          Issue Type: New Feature
>          Components: Serializers/Deserializers
>    Affects Versions: 0.12.0
>            Reporter: Mariano Dominguez
>            Assignee: Xuefu Zhang
>              Labels: TODOC14, avro, serde
>             Fix For: 0.14.0
>
>         Attachments: HIVE-5823.1.patch, HIVE-5823.2.patch, HIVE-5823.3.patch, HIVE-5823.4.patch, HIVE-5823.5.patch, HIVE-5823.6.patch, HIVE-5823.7.patch, HIVE-5823.patch, dec.avro
>
>
> This new feature request would be tied to AVRO-1402.
> Adding DECIMAL support would be particularly interesting when converting types from Avro to Hive, since DECIMAL is already a supported data type in Hive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)