Posted to commits@orc.apache.org by do...@apache.org on 2021/09/21 20:38:12 UTC

[orc] branch main updated: ORC-992: Reached max repeat length, we can directly decide to use DELTA encoding (#907)

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/orc.git


The following commit(s) were added to refs/heads/main by this push:
     new b745fdd  ORC-992: Reached max repeat length, we can directly decide to use DELTA encoding (#907)
b745fdd is described below

commit b745fddb7fa83bbd0ac74f69706c4f2aa86d0751
Author: Guiyanakaung <gu...@gmail.com>
AuthorDate: Wed Sep 22 04:38:08 2021 +0800

    ORC-992: Reached max repeat length, we can directly decide to use DELTA encoding (#907)
    
    ### What changes were proposed in this pull request?
    
    When a fixed run reaches the max repeat length, we can directly decide to use DELTA encoding.
    The computeZigZagLiterals call, the zzBits100p computation, and the isFixedDelta determination inside the determineEncoding method are all redundant in this case.
    
    The existing code already sets the encoding directly in a similar way
    (RunLengthIntegerWriterV2.java, lines 767-775):
    ```java
              if (fixedRunLength >= MIN_REPEAT) {
                if (fixedRunLength <= MAX_SHORT_REPEAT_LENGTH) {
                  encoding = EncodingType.SHORT_REPEAT;
                } else {
                  encoding = EncodingType.DELTA;
                  isFixedDelta = true;
                }
                writeValues();
              }
    ```
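    
    The reasoning behind this change can be illustrated with a minimal sketch. It is a
    simplified model, not the ORC implementation: the class FixedRunSketch, the Decision
    holder, the fullScan and shortcut methods, and the run length of 512 are all
    assumptions made for illustration. It shows that a scan over a run of identical
    values always ends in the same fixed-delta DELTA decision that the shortcut sets directly.
    
    ```java
    import java.util.Arrays;

    // Simplified model for illustration only; none of these names come from ORC.
    final class FixedRunSketch {
      enum Encoding { SHORT_REPEAT, DIRECT, PATCHED_BASE, DELTA }

      // Hypothetical holder for the chosen encoding and the fixed-delta flag.
      static final class Decision {
        final Encoding encoding;
        final boolean isFixedDelta;

        Decision(Encoding encoding, boolean isFixedDelta) {
          this.encoding = encoding;
          this.isFixedDelta = isFixedDelta;
        }

        @Override
        public boolean equals(Object o) {
          if (!(o instanceof Decision)) {
            return false;
          }
          Decision other = (Decision) o;
          return encoding == other.encoding && isFixedDelta == other.isFixedDelta;
        }

        @Override
        public int hashCode() {
          return 31 * encoding.hashCode() + (isFixedDelta ? 1 : 0);
        }
      }

      // Stand-in for a full analysis pass: walks every adjacent pair of values.
      // The non-fixed branch returns DIRECT as a placeholder; a real writer
      // would choose among several encodings there.
      static Decision fullScan(long[] literals) {
        boolean fixed = true;
        long firstDelta = literals.length > 1 ? literals[1] - literals[0] : 0;
        for (int i = 1; i < literals.length; i++) {
          if (literals[i] - literals[i - 1] != firstDelta) {
            fixed = false;
            break;
          }
        }
        return fixed
            ? new Decision(Encoding.DELTA, true)
            : new Decision(Encoding.DIRECT, false);
      }

      // The shortcut: a run of identical values needs no scan at all.
      static Decision shortcut() {
        return new Decision(Encoding.DELTA, true);
      }

      public static void main(String[] args) {
        long[] run = new long[512]; // assumed run length, for illustration
        Arrays.fill(run, 7L);
        System.out.println(fullScan(run).equals(shortcut()));
      }
    }
    ```
    
    In this model both paths reach the same decision for a run of identical values, so
    the scan only adds work, which is the cost this patch avoids.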
    
    ### Why are the changes needed?
    
    This optimizes the case where a fixed run reaches the max repeat length by skipping the redundant work in determineEncoding.
    
    ### How was this patch tested?
    
    Passes the CIs.
---
 c++/src/RleEncoderV2.cc                                              | 3 ++-
 java/core/src/java/org/apache/orc/impl/RunLengthIntegerWriterV2.java | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/c++/src/RleEncoderV2.cc b/c++/src/RleEncoderV2.cc
index bdae97a..2c6a826 100644
--- a/c++/src/RleEncoderV2.cc
+++ b/c++/src/RleEncoderV2.cc
@@ -124,7 +124,8 @@ void RleEncoderV2::write(int64_t val) {
         }
 
         if (fixedRunLength == MAX_LITERAL_SIZE) {
-            determineEncoding(option);
+            option.encoding = DELTA;
+            option.isFixedDelta = true;
             writeValues(option);
         }
         return;
diff --git a/java/core/src/java/org/apache/orc/impl/RunLengthIntegerWriterV2.java b/java/core/src/java/org/apache/orc/impl/RunLengthIntegerWriterV2.java
index 5ddc8ec..5a6dd51 100644
--- a/java/core/src/java/org/apache/orc/impl/RunLengthIntegerWriterV2.java
+++ b/java/core/src/java/org/apache/orc/impl/RunLengthIntegerWriterV2.java
@@ -764,7 +764,8 @@ public class RunLengthIntegerWriterV2 implements IntegerWriter {
 
           // if fixed runs reached max repeat length then write values
           if (fixedRunLength == MAX_SCOPE) {
-            determineEncoding();
+            encoding = EncodingType.DELTA;
+            isFixedDelta = true;
             writeValues();
           }
         } else {