Posted to issues@orc.apache.org by "Matt McCline (JIRA)" <ji...@apache.org> on 2017/08/03 05:56:00 UTC

[jira] [Comment Edited] (ORC-209) Add Decimal64 Serialization/Deserialization

    [ https://issues.apache.org/jira/browse/ORC-209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16112102#comment-16112102 ] 

Matt McCline edited comment on ORC-209 at 8/3/17 5:55 AM:
----------------------------------------------------------

OK, added a Review Board (RB) request for the HIVE-17235 storage-api changes (https://issues.apache.org/jira/browse/HIVE-17235): https://reviews.apache.org/r/61398/

There is no ORC RB repository, so I guess I need to look at how to do pull requests for ORC...


was (Author: mmccline):
OK, added a Review Board (RB) request for the HIVE-17235 storage-api changes (https://issues.apache.org/jira/browse/HIVE-17235).

There is no ORC RB repository, so I guess I need to look at how to do pull requests for ORC...

> Add Decimal64 Serialization/Deserialization
> -------------------------------------------
>
>                 Key: ORC-209
>                 URL: https://issues.apache.org/jira/browse/ORC-209
>             Project: ORC
>          Issue Type: Bug
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>         Attachments: ORC-209.01.wip.patch, ORC-209.02.wip.patch, ORC-209.03.patch, storage-api.01.wip.patch, storage-api.02.wip.patch
>
>
> Currently, HiveDecimal is serialized in ORC in a special binary format as the "value" stream, plus a secondary stream holding the scale of each decimal.  Trailing zeroes are removed from each decimal, so the scale can vary from value to value.  This format is inefficient in both CPU and storage space (i.e. compression).
> The decimal type has a fixed precision and scale.  Gopal/Prasanth/Owen have suggested storing the decimals with their trailing zeroes (so the scale is a constant value for the file, taken from the metadata) and storing them as an integer stream that can benefit from run-length encoding compression, etc.
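
As a rough illustration of the proposal in the description above (not the code in the attached patches), the sketch below rescales each decimal to one fixed scale and keeps only the unscaled value as a 64-bit integer, which is the kind of value an integer stream could then run-length encode. The class name Decimal64Sketch and the helpers toScaledLong and fromScaledLong are hypothetical, and the sketch assumes every value fits in a long at the chosen scale (roughly precision 18 or less).

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical sketch, not the ORC-209 implementation.
class Decimal64Sketch {

    // Encode a decimal as an unscaled long at a fixed scale.
    // RoundingMode.UNNECESSARY makes setScale throw ArithmeticException if the
    // value has more fractional digits than the scale allows; longValueExact
    // throws ArithmeticException if the unscaled value does not fit in 64 bits.
    static long toScaledLong(BigDecimal value, int scale) {
        return value.setScale(scale, RoundingMode.UNNECESSARY)
                    .unscaledValue()
                    .longValueExact();
    }

    // Decode: the scale comes from file-level metadata, not from a per-value stream.
    static BigDecimal fromScaledLong(long unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }

    public static void main(String[] args) {
        int scale = 2; // assumed column type, e.g. decimal(10,2)
        String[] inputs = {"12.50", "12.5", "100", "0.07"};
        for (String s : inputs) {
            long encoded = toScaledLong(new BigDecimal(s), scale);
            System.out.println(s + " -> " + encoded + " -> " + fromScaledLong(encoded, scale));
        }
    }
}
```

Because trailing zeroes are kept, "12.50" and "12.5" encode to the same long and no per-value scale needs to be stored; only the single scale from the metadata is needed to decode.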



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)