Posted to issues@orc.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/11/08 22:30:01 UTC
[jira] [Commented] (ORC-210) Add new ORC 2.0 encoding for Double, Float.
[ https://issues.apache.org/jira/browse/ORC-210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244857#comment-16244857 ]
ASF GitHub Bot commented on ORC-210:
------------------------------------
GitHub user omalley opened a pull request:
https://github.com/apache/orc/pull/189
ORC-210 Add new encodings and benchmarks for new double encoding.
This extends Teddy's pull request by adding:
* Extended write benchmark.
* Added read benchmark.
* Added new encodings:
  + plainV2, which uses the same data format but a faster implementation
  + fpcV2, which is the standard FPC implementation
  + flip, which rotates the bytes in an 8x8 matrix
  + split, which creates RLE streams for sign, exponent, and mantissa
* Added new datasets.
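For readers unfamiliar with the ideas behind "split" and "flip", here is a rough Python sketch. It is not the code from this pull request. It assumes that "split" separates the IEEE 754 sign, exponent, and mantissa fields of each double, and that "flip" transposes the 8x8 byte matrix formed by eight consecutive big-endian doubles. The function names are hypothetical, and the RLE step and the actual on-disk layout are not shown.

```python
import struct


def split_double(value):
    """Return the (sign, exponent, mantissa) bit fields of an IEEE 754 double."""
    bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    sign = bits >> 63                  # 1 bit
    exponent = (bits >> 52) & 0x7FF    # 11 bits, biased by 1023
    mantissa = bits & ((1 << 52) - 1)  # 52 bits
    return sign, exponent, mantissa


def join_double(sign, exponent, mantissa):
    """Inverse of split_double: reassemble the three fields into a double."""
    bits = (sign << 63) | (exponent << 52) | mantissa
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


def flip_block(values):
    """Transpose the 8x8 byte matrix made of eight big-endian doubles.

    Assumption: output byte (c * 8 + r) is byte c of value r, so the
    same-position bytes of all eight values end up next to each other.
    """
    if len(values) != 8:
        raise ValueError("flip_block expects exactly eight values")
    rows = [struct.pack(">d", v) for v in values]
    return bytes(rows[r][c] for c in range(8) for r in range(8))


def unflip_block(data):
    """Inverse of flip_block: recover the eight doubles from 64 bytes."""
    if len(data) != 64:
        raise ValueError("unflip_block expects exactly 64 bytes")
    rows = [bytes(data[c * 8 + r] for c in range(8)) for r in range(8)]
    return [struct.unpack(">d", row)[0] for row in rows]
```

In both cases the intent is to group bytes or fields that tend to repeat across neighbouring values (signs, exponents, high-order bytes) so that a later run-length or general-purpose compression pass has longer runs to work with.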
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/omalley/orc orc-210
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/orc/pull/189.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #189
----
commit ca6021511d38a3db87ac3a6d5b086155c05abf70
Author: Teddy Choi <pu...@gmail.com>
Date: 2017-08-22T17:27:35Z
ORC-210: Add encoding for Double, Float.
commit 7a93e7f54e978ac3efb369d538a869775468a1bc
Author: Owen O'Malley <om...@apache.org>
Date: 2017-10-26T23:41:43Z
ORC-210. Adding new double encoding.
* Moved implementations out of core until we pick one.
* Extended write benchmark.
* Added read benchmark.
* Added new encodings:
  + plainV2, which uses the same data format but a faster implementation
  + fpcV2, which is the standard FPC implementation
  + flip, which rotates the bytes in an 8x8 matrix
  + split, which creates RLE streams for sign, exponent, and mantissa
* Added new datasets.
----
> Add new ORC 2.0 encoding for Double, Float.
> -------------------------------------------
>
> Key: ORC-210
> URL: https://issues.apache.org/jira/browse/ORC-210
> Project: ORC
> Issue Type: Improvement
> Components: encoding, Java
> Affects Versions: 2.0.0
> Reporter: Dapeng Sun
> Assignee: Teddy Choi
> Attachments: ORC-210.1.patch, ORC-210.2.patch, patch.txt
>
>
> Currently, Double and Float use PLAIN encoding. It would be better to support encodings such as Dictionary or BitPacking to reduce storage cost.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)