Posted to dev@hive.apache.org by "Vitalii Diravka (JIRA)" <ji...@apache.org> on 2018/03/28 14:28:00 UTC

[jira] [Created] (HIVE-19069) Hive can't read int32 and int64 Parquet decimal values

Vitalii Diravka created HIVE-19069:
--------------------------------------

             Summary: Hive can't read int32 and int64 Parquet decimal values
                 Key: HIVE-19069
                 URL: https://issues.apache.org/jira/browse/HIVE-19069
             Project: Hive
          Issue Type: Improvement
          Components: Types
    Affects Versions: 2.3.2
            Reporter: Vitalii Diravka
         Attachments: 0_0_0.parquet

Parquet supports several physical types for the Decimal logical data type:
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal

But Hive supports only "fixed_len_byte_array":
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java#L423

Creating an external Parquet table and querying it via Hive fails:
{code}
hive> select * from decimal_parquet;
OK
Failed with exception java.io.IOException:org.apache.parquet.io.ParquetDecodingException: Can not read value at 1 in block 0 in file maprfs:///tmp/decimal_parquet/0_0_0.parquet
{code}

A sample Parquet file with int32 decimal values is attached to this JIRA:
{code}
vitalii@vitalii-pc:~$ java -jar parquet-tools/parquet-mr/parquet-tools/target/parquet-tools-1.6.0rc3-SNAPSHOT.jar schema /tmp/decimal_parquet/0_0_0.parquet 
message root {
  optional binary a (UTF8);
  optional int32 b (DECIMAL(7,2));
}

vitalii@vitalii-pc:~$ java -jar parquet-tools/parquet-mr/parquet-tools/target/parquet-tools-1.6.0rc3-SNAPSHOT.jar cat /tmp/md4107_par/0_0_0.parquet 
a = a
b = 100
{code}
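
For reference, the Parquet format specification defines an int32- or int64-backed DECIMAL as an unscaled integer, with the logical value equal to unscaled * 10^(-scale). Below is a minimal sketch of that interpretation, assuming the 100 printed by parquet-tools is the raw stored int32. The class and method names are hypothetical and are not part of Hive or parquet-mr:

```java
import java.math.BigDecimal;

// Hypothetical helper for illustration only; not Hive or parquet-mr code.
final class ParquetDecimalSketch {

    // int32-backed DECIMAL: the stored int is the unscaled value.
    static BigDecimal fromInt32(int unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }

    // int64-backed DECIMAL: the stored long is the unscaled value.
    static BigDecimal fromInt64(long unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }

    public static void main(String[] args) {
        // Column b in the sample file: int32, DECIMAL(7,2), raw value 100.
        System.out.println(fromInt32(100, 2));
    }
}
```

Under that assumption, column b of the sample row would be expected to read back as the decimal 1.00 once Hive handles the int32 representation.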


