Posted to issues@orc.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/11/08 22:30:01 UTC

[jira] [Commented] (ORC-210) Add new ORC 2.0 encoding for Double, Float.

    [ https://issues.apache.org/jira/browse/ORC-210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244857#comment-16244857 ] 

ASF GitHub Bot commented on ORC-210:
------------------------------------

GitHub user omalley opened a pull request:

    https://github.com/apache/orc/pull/189

    ORC-210 Add new encodings and benchmarks for new double encoding.

    This extends Teddy's pull request with the following changes:
    
        * Extended the write benchmark.
        * Added a read benchmark.
        * Added new encodings:
            + plainV2, which keeps the same data format but has a faster implementation
            + fpcV2, which is the standard FPC implementation
            + flip, which rotates the bytes in an 8x8 matrix
            + split, which creates RLE streams for sign, exponent, and mantissa
        * Added new datasets.
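
To make the flip and split ideas above concrete, here is a minimal Java sketch. It is not the code from this pull request: the class and method names are hypothetical, the byte order used by flip (most significant byte first, output row r holding byte r of each of the 8 values) is an assumption, and the RLE stage that split feeds into is left out entirely. It only shows how a double might be decomposed into IEEE 754 sign, exponent, and mantissa fields, and how the bytes of 8 doubles might be transposed as an 8x8 matrix.

```java
// Hypothetical illustration only; not the implementation in apache/orc pull request 189.
final class DoubleEncodingSketch {

    // Low 52 bits of an IEEE 754 double hold the mantissa (fraction).
    private static final long MANTISSA_MASK = (1L << 52) - 1;

    /** Splits a double into {sign (1 bit), biased exponent (11 bits), mantissa (52 bits)}. */
    static long[] split(double value) {
        long bits = Double.doubleToRawLongBits(value);
        long sign = bits >>> 63;
        long exponent = (bits >>> 52) & 0x7FFL;
        long mantissa = bits & MANTISSA_MASK;
        return new long[] {sign, exponent, mantissa};
    }

    /** Reassembles a double from the three fields produced by split. */
    static double join(long sign, long exponent, long mantissa) {
        return Double.longBitsToDouble((sign << 63) | (exponent << 52) | mantissa);
    }

    /**
     * Treats 8 doubles as an 8x8 byte matrix and transposes it, so that
     * output bytes [8*r, 8*r + 7] hold byte r (most significant first,
     * an assumed layout) of each of the 8 input values.
     */
    static byte[] flip(double[] values) {
        if (values.length != 8) {
            throw new IllegalArgumentException("expected exactly 8 values");
        }
        byte[] out = new byte[64];
        for (int i = 0; i < 8; i++) {
            long bits = Double.doubleToRawLongBits(values[i]);
            for (int b = 0; b < 8; b++) {
                out[b * 8 + i] = (byte) (bits >>> (56 - 8 * b));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long[] parts = split(-1.5);
        System.out.println("sign=" + parts[0] + " exponent=" + parts[1]
            + " mantissa=" + Long.toHexString(parts[2]));
        System.out.println("round trip: " + join(parts[0], parts[1], parts[2]));
    }
}
```

The intuition in both cases is that neighbouring values in a column often share sign and exponent bits, so grouping those bits together should give a run-length or general-purpose compressor longer runs to work with. Whether that pays off on real data is what the benchmarks in this pull request are meant to measure.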

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/omalley/orc orc-210

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/orc/pull/189.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #189
    
----
commit ca6021511d38a3db87ac3a6d5b086155c05abf70
Author: Teddy Choi <pu...@gmail.com>
Date:   2017-08-22T17:27:35Z

    ORC-210: Add encoding for Double, Float.

commit 7a93e7f54e978ac3efb369d538a869775468a1bc
Author: Owen O'Malley <om...@apache.org>
Date:   2017-10-26T23:41:43Z

    ORC-210. Adding new double encoding.
    
    * Moved implementations out of core until we pick one.
    * Extended write benchmark.
    * Added read benchmark.
    * Added new encodings:
        + plainV2, which is the same data format but a faster implementation
        + fpcV2, which is the standard FPC implementation
        + flip, which rotates the bytes in an 8x8 matrix
        + split, which creates RLE streams for sign, exponent, and mantissa
    * Added new datasets.

----


> Add new ORC 2.0 encoding for Double, Float.
> -------------------------------------------
>
>                 Key: ORC-210
>                 URL: https://issues.apache.org/jira/browse/ORC-210
>             Project: ORC
>          Issue Type: Improvement
>          Components: encoding, Java
>    Affects Versions: 2.0.0
>            Reporter: Dapeng Sun
>            Assignee: Teddy Choi
>         Attachments: ORC-210.1.patch, ORC-210.2.patch, patch.txt
>
>
> Currently, Double and Float use PLAIN encoding. It would be better to support encodings such as Dictionary or BitPacking to reduce storage cost.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)