Posted to dev@orc.apache.org by "Yiqun Zhang (Jira)" <ji...@apache.org> on 2021/09/14 05:39:00 UTC

[jira] [Created] (ORC-992) Reached max repeat length, we can directly decide to use DELTA encoding

Yiqun Zhang created ORC-992:
-------------------------------

             Summary: Reached max repeat length, we can directly decide to use DELTA encoding
                 Key: ORC-992
                 URL: https://issues.apache.org/jira/browse/ORC-992
             Project: ORC
          Issue Type: Improvement
          Components: Java
    Affects Versions: 1.7.0
            Reporter: Yiqun Zhang
             Fix For: 1.7.0


When a fixed run reaches the max repeat length, we can choose DELTA encoding directly, without calling determineEncoding().
RunLengthIntegerWriterV2.java, lines 756-760:
{code:java}
          // if fixed runs reached max repeat length then write values
          if (fixedRunLength == MAX_SCOPE) {
            determineEncoding();
            writeValues();
          }
{code}
If the fixed run has reached the max repeat length, we already know that DELTA encoding applies and that fixedDelta is zero.

The computeZigZagLiterals call, the zzBits100p computation, and the isFixedDelta check inside the determineEncoding method are therefore all redundant in this case.

The same shortcut is already used elsewhere in the class.
RunLengthIntegerWriterV2.java, lines 767-775:

{code:java}
          if (fixedRunLength >= MIN_REPEAT) {
            if (fixedRunLength <= MAX_SHORT_REPEAT_LENGTH) {
              encoding = EncodingType.SHORT_REPEAT;
            } else {
              encoding = EncodingType.DELTA;
              isFixedDelta = true;
            }
            writeValues();
          }
{code}
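
As a rough illustration of the proposal, the MAX_SCOPE branch could follow the same pattern as the snippet above. The sketch below is a standalone model, not the ORC source: the class name FixedRunShortcutSketch, the method flushIfFixedRunIsFull, the flushes counter, and the value given to MAX_SCOPE are assumptions made for illustration, and writeValues is a stub that only counts flushes.

```java
// Standalone sketch; field names mirror the snippets above, but this is not the ORC source.
final class FixedRunShortcutSketch {
  // Assumed value, chosen for illustration only.
  static final int MAX_SCOPE = 512;

  enum EncodingType { SHORT_REPEAT, DIRECT, PATCHED_BASE, DELTA }

  EncodingType encoding;
  boolean isFixedDelta;
  int fixedRunLength;
  int flushes; // hypothetical counter standing in for real output

  // Stub for the real writeValues(): records the flush and resets the run length.
  void writeValues() {
    flushes++;
    fixedRunLength = 0;
  }

  // Possible shape of the MAX_SCOPE branch: pick DELTA directly
  // instead of calling determineEncoding().
  void flushIfFixedRunIsFull() {
    if (fixedRunLength == MAX_SCOPE) {
      encoding = EncodingType.DELTA;
      isFixedDelta = true; // all values in a fixed run are equal, so the delta is zero
      writeValues();
    }
  }
}
```

With this shape, a full fixed run is flushed without scanning the buffered literals, and a shorter run is left untouched by this branch.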

--
This message was sent by Atlassian Jira
(v8.3.4#803005)