Posted to issues@hive.apache.org by "Zoltan Ivanfi (JIRA)" <ji...@apache.org> on 2018/02/14 15:35:00 UTC

[jira] [Comment Edited] (HIVE-17843) UINT32 Parquet columns are handled as signed INT32-s, silently reading incorrect data

    [ https://issues.apache.org/jira/browse/HIVE-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364282#comment-16364282 ] 

Zoltan Ivanfi edited comment on HIVE-17843 at 2/14/18 3:34 PM:
---------------------------------------------------------------

Sorry for the late answer. The simplest query suffices, e.g., a SELECT * on a table that contains a single column and a single row. However, the Parquet file has to contain an unsigned integer, and Hive does not write unsigned ints. [~gszadovszky], could you provide an example Parquet file with an unsigned int that has its first bit set? Thanks!


was (Author: zi):
Sorry for the late answer. The simplest query suffices, e.g., a SELECT * on a table that contains a single column and a single row. But the parquet file has to have an unsigned integer in it and Hive does not write unsigned ints. [~gszadovszky] could you supply an example parquet file with an unsigned int that has its first bit set? Thanks!

> UINT32 Parquet columns are handled as signed INT32-s, silently reading incorrect data
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-17843
>                 URL: https://issues.apache.org/jira/browse/HIVE-17843
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Zoltan Ivanfi
>            Assignee: Janaki Lahorani
>            Priority: Major
>
> An unsigned 32-bit Parquet column, such as
> {noformat}
> optional int32 uint_32_col (UINT_32)
> {noformat}
> is read by Hive as if it were signed, leading to incorrect results.
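
To illustrate why only values with the first (most significant) bit set expose the problem, here is a minimal Java sketch. It is not Hive's reader code or the eventual fix; the class name Uint32Demo and the helper asUnsigned are hypothetical and exist only for this illustration. It shows that the same 32 bits yield a negative number when read as a signed int, and the intended value when widened to a long as unsigned.

```java
public class Uint32Demo {
    // Hypothetical helper: reinterpret the 32 bits of a Java int as an
    // unsigned value and widen it to long (equivalent to raw & 0xFFFFFFFFL).
    static long asUnsigned(int raw) {
        return Integer.toUnsignedLong(raw);
    }

    public static void main(String[] args) {
        // Bit pattern of the UINT_32 value 4294967295 (all 32 bits set).
        int raw = 0xFFFFFFFF;

        // Signed interpretation, which is what the issue describes.
        System.out.println("signed:   " + raw);

        // Unsigned interpretation, which is what the column annotation asks for.
        System.out.println("unsigned: " + asUnsigned(raw));

        // A value whose first bit is clear reads the same either way,
        // so it cannot be used to reproduce the issue.
        int small = 42;
        System.out.println("small:    " + small + " / " + asUnsigned(small));
    }
}
```

Any test file for this issue therefore needs at least one UINT_32 value of 2147483648 or greater; smaller values are read identically under both interpretations.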


