Posted to dev@hive.apache.org by "H. Vetinari (JIRA)" <ji...@apache.org> on 2019/07/17 10:28:00 UTC
[jira] [Created] (HIVE-22005) Handle complete parquet specification
H. Vetinari created HIVE-22005:
----------------------------------
Summary: Handle complete parquet specification
Key: HIVE-22005
URL: https://issues.apache.org/jira/browse/HIVE-22005
Project: Hive
Issue Type: Improvement
Affects Versions: All Versions
Reporter: H. Vetinari
Hive cannot read Parquet files written by Spark (with default settings) after 1.4, which uses a different internal representation that nevertheless stays faithful to the Parquet specification (see SPARK-20297).
Hive should be able to read such data written by Spark, and ideally also data from other Parquet writers (Arrow, etc.?) that follow the spec.
Quote from SPARK-20297:
> The standard doesn't say that smaller decimals *have* to be stored in int32/int64; it is just an option for a subset of decimal types. int32 and int64 are valid representations for a subset of decimal types. fixed_len_byte_array and binary are a valid representation of any decimal type.
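To make the quoted point concrete: the Parquet spec stores a DECIMAL as its unscaled integer value (value * 10^scale), which may be encoded as int32 (precision <= 9), int64 (precision <= 18), or as a big-endian two's-complement byte string in fixed_len_byte_array/binary for any precision. A minimal Python sketch of those encodings (illustrative helper names, not Hive or Spark code):

```python
from decimal import Decimal

def decimal_to_int(value: Decimal, scale: int) -> int:
    # int32/int64 representation: the unscaled value, i.e. value * 10^scale.
    # E.g. Decimal("123.45") with scale=2 is stored as the integer 12345.
    return int(value.scaleb(scale))

def decimal_to_bytes(value: Decimal, scale: int) -> bytes:
    # binary / fixed_len_byte_array representation: the unscaled value as a
    # big-endian two's-complement byte string (valid for any precision).
    unscaled = int(value.scaleb(scale))
    # Smallest byte length that fits the value including its sign bit.
    length = max(1, (unscaled.bit_length() + 8) // 8)
    return unscaled.to_bytes(length, byteorder="big", signed=True)
```

A reader that handles only one of these encodings (as Hive's Parquet decimal reader did) will reject files from writers that legitimately chose another.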
Arguably, this is a subtask of HIVE-12398.
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)