Posted to issues@spark.apache.org by "Kent Yao (Jira)" <ji...@apache.org> on 2021/03/26 05:37:00 UTC
[jira] [Assigned] (SPARK-34816) Support for Parquet unsigned LogicalTypes
[ https://issues.apache.org/jira/browse/SPARK-34816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kent Yao reassigned SPARK-34816:
--------------------------------
Assignee: Kent Yao
> Support for Parquet unsigned LogicalTypes
> -----------------------------------------
>
> Key: SPARK-34816
> URL: https://issues.apache.org/jira/browse/SPARK-34816
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: Kent Yao
> Assignee: Kent Yao
> Priority: Major
>
> Parquet supports some unsigned data types. Here is the relevant definition from parquet.thrift:
> {code:java}
> /**
> * Common types used by frameworks(e.g. hive, pig) using parquet. This helps map
> * between types in those frameworks to the base types in parquet. This is only
> * metadata and not needed to read or write the data.
> */
> /**
> * An unsigned integer value.
> *
> * The number describes the maximum number of meaningful data bits in
> * the stored value. 8, 16 and 32 bit values are stored using the
> * INT32 physical type. 64 bit values are stored using the INT64
> * physical type.
> *
> */
> UINT_8 = 11;
> UINT_16 = 12;
> UINT_32 = 13;
> UINT_64 = 14;
> {code}
> Spark does not support unsigned data types. Since SPARK-10113, it throws an exception with a clear message when it encounters them. The value ranges are:
> UInt8-[0:255]
> UInt16-[0:65535]
> UInt32-[0:4294967295]
> UInt64-[0:18446744073709551615]
> Unsigned types may be used to produce smaller in-memory representations of the data. If the stored value is larger than the maximum allowed by int32 or int64, then the behavior is undefined.
> In this ticket, we try to read them as a higher-precision signed type.
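As an illustration of what "reading as a higher precision signed type" can mean, here is a minimal Java sketch. It is not the Spark implementation: the class name, the method names, and the choice of target types (short, int, long, BigInteger) are assumptions made for this example. It only shows that each unsigned range listed above fits losslessly into the next wider signed representation.

```java
import java.math.BigInteger;

// Hypothetical helper, not part of Spark or Parquet.
// Each method takes the raw value as stored in the Parquet physical type
// (INT32 for 8/16/32-bit values, INT64 for 64-bit values) and reinterprets
// its bits as unsigned, returning a wider signed type that can hold the result.
class UnsignedWidening {

    // UINT_8: keep the low 8 bits; 0..255 fits in a signed 16-bit short.
    static short readUInt8(int stored) {
        return (short) (stored & 0xFF);
    }

    // UINT_16: keep the low 16 bits; 0..65535 fits in a signed 32-bit int.
    static int readUInt16(int stored) {
        return stored & 0xFFFF;
    }

    // UINT_32: all 32 bits are data; 0..4294967295 fits in a signed 64-bit long.
    static long readUInt32(int stored) {
        return Integer.toUnsignedLong(stored);
    }

    // UINT_64: all 64 bits are data; no primitive signed type is wide enough,
    // so an arbitrary-precision integer is used here.
    static BigInteger readUInt64(long stored) {
        return new BigInteger(Long.toUnsignedString(stored));
    }

    public static void main(String[] args) {
        // -1 has all bits set, so it corresponds to the maximum unsigned value.
        System.out.println(readUInt8(-1));   // 255
        System.out.println(readUInt16(-1));  // 65535
        System.out.println(readUInt32(-1));  // 4294967295
        System.out.println(readUInt64(-1L)); // 18446744073709551615
    }
}
```

The bit patterns are never modified, only reinterpreted, so values that already fit in the signed range of the physical type come back unchanged.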
--
This message was sent by Atlassian Jira
(v8.3.4#803005)