Posted to commits@orc.apache.org by om...@apache.org on 2020/03/27 22:30:16 UTC

[orc] branch master updated: ORC-519: Correct encoding for Decimal columns in specification

This is an automated email from the ASF dual-hosted git repository.

omalley pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/orc.git


The following commit(s) were added to refs/heads/master by this push:
     new a9ec6a2  ORC-519: Correct encoding for Decimal columns in specification
a9ec6a2 is described below

commit a9ec6a2e39ed71ef8a2d874df14700956aa847be
Author: Norbert Luksa <no...@cloudera.com>
AuthorDate: Fri Feb 21 15:53:16 2020 +0100

    ORC-519: Correct encoding for Decimal columns in specification
    
    Currently the ORC specifications state that the secondary stream of
    decimal columns uses unsigned RLE, but according to the code (both
    Java and C++), signed RLE is used.
    
    Fixes #484
    
    Signed-off-by: Owen O'Malley <om...@apache.org>
---
 site/specification/ORCv0.md | 4 ++--
 site/specification/ORCv1.md | 6 +++---
 site/specification/ORCv2.md | 2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)
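
Why the signed/unsigned distinction matters for readers can be sketched
as follows. This is a minimal illustration, assuming that signed values
are zigzag encoded before being written as base 128 varints; it models
only the encoding of a single value, not the RLE run headers, and the
function names are hypothetical rather than taken from the ORC code base.

```python
def zigzag_encode(n: int) -> int:
    """Map a signed integer to an unsigned one: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return (n << 1) if n >= 0 else ((-n << 1) - 1)


def zigzag_decode(u: int) -> int:
    """Inverse of zigzag_encode."""
    return (u >> 1) ^ -(u & 1)


def varint_encode(u: int) -> bytes:
    """Encode an unsigned integer as a little-endian base 128 varint."""
    out = bytearray()
    while True:
        low7 = u & 0x7F
        u >>= 7
        if u:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)         # last byte has the high bit clear
            return bytes(out)


def varint_decode(data: bytes) -> int:
    """Decode a single base 128 varint from the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return result


# A writer that treats the scale as signed stores zigzag(scale).
scale = 2
stored = varint_encode(zigzag_encode(scale))

# A reader that follows the corrected text (signed) recovers the scale.
read_as_signed = zigzag_decode(varint_decode(stored))

# A reader that follows the old text (unsigned) skips the zigzag step
# and sees a different number.
read_as_unsigned = varint_decode(stored)
```

Under these assumptions a scale of 2 is stored as the single byte 0x04,
so a reader built from the old wording would see 4 instead of 2.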

diff --git a/site/specification/ORCv0.md b/site/specification/ORCv0.md
index 2ca58b6..861b819 100644
--- a/site/specification/ORCv0.md
+++ b/site/specification/ORCv0.md
@@ -664,13 +664,13 @@ the precision to a maximum of 38 digits, which conveniently uses 127
 bits plus a sign bit. The current encoding of decimal columns stores
 the integer representation of the value as an unbounded length zigzag
 encoded base 128 varint. The scale is stored in the SECONDARY stream
-as an signed integer.
+as a signed integer.
 
 Encoding      | Stream Kind     | Optional | Contents
 :------------ | :-------------- | :------- | :-------
 DIRECT        | PRESENT         | Yes      | Boolean RLE
               | DATA            | No       | Unbounded base 128 varints
-              | SECONDARY       | No       | Unsigned Integer RLE v1
+              | SECONDARY       | No       | Signed Integer RLE v1
 
 ## Date Columns
 
diff --git a/site/specification/ORCv1.md b/site/specification/ORCv1.md
index db63356..812c962 100644
--- a/site/specification/ORCv1.md
+++ b/site/specification/ORCv1.md
@@ -1097,16 +1097,16 @@ the precision to a maximum of 38 digits, which conveniently uses 127
 bits plus a sign bit. The current encoding of decimal columns stores
 the integer representation of the value as an unbounded length zigzag
 encoded base 128 varint. The scale is stored in the SECONDARY stream
-as an signed integer.
+as a signed integer.
 
 Encoding      | Stream Kind     | Optional | Contents
 :------------ | :-------------- | :------- | :-------
 DIRECT        | PRESENT         | Yes      | Boolean RLE
               | DATA            | No       | Unbounded base 128 varints
-              | SECONDARY       | No       | Unsigned Integer RLE v1
+              | SECONDARY       | No       | Signed Integer RLE v1
 DIRECT_V2     | PRESENT         | Yes      | Boolean RLE
               | DATA            | No       | Unbounded base 128 varints
-              | SECONDARY       | No       | Unsigned Integer RLE v2
+              | SECONDARY       | No       | Signed Integer RLE v2
 
 ## Date Columns
 
diff --git a/site/specification/ORCv2.md b/site/specification/ORCv2.md
index b84dac9..74eb567 100644
--- a/site/specification/ORCv2.md
+++ b/site/specification/ORCv2.md
@@ -1121,7 +1121,7 @@ DIRECT        | PRESENT         | Yes      | Boolean RLE
               | DATA            | No       | Signed Integer RLE v2
 DIRECT_V2     | PRESENT         | Yes      | Boolean RLE
               | DATA            | No       | Unbounded base 128 varints
-              | SECONDARY       | No       | Unsigned Integer RLE v2
+              | SECONDARY       | No       | Signed Integer RLE v2
 
 
 ## Date Columns