You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@orc.apache.org by "Dongjoon Hyun (Jira)" <ji...@apache.org> on 2021/08/10 18:27:00 UTC

[jira] [Resolved] (ORC-931) Optimize RunLengthIntegerWriterV2 code for better readability

     [ https://issues.apache.org/jira/browse/ORC-931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun resolved ORC-931.
-------------------------------
    Fix Version/s: 1.8.0
                   1.7.0
         Assignee: Yiqun Zhang
       Resolution: Fixed

This is resolved via https://github.com/apache/orc/pull/845

> Optimize RunLengthIntegerWriterV2 code for better readability
> -------------------------------------------------------------
>
>                 Key: ORC-931
>                 URL: https://issues.apache.org/jira/browse/ORC-931
>             Project: ORC
>          Issue Type: Improvement
>            Reporter: Yiqun Zhang
>            Assignee: Yiqun Zhang
>            Priority: Trivial
>             Fix For: 1.7.0, 1.8.0
>
>
> RunLengthIntegerWriterV2.java
> 512-546 line
> {code:java}
>   if (diffBitsLH > 1) {
>       for (int i = 0; i < numLiterals; i++) {
>         baseRedLiterals[i] = literals[i] - min;
>       }
>       brBits95p = utils.percentileBits(baseRedLiterals, 0, numLiterals, 0.95);
>       brBits100p = utils.percentileBits(baseRedLiterals, 0, numLiterals, 1.0);
>       if ((brBits100p - brBits95p) != 0 && Math.abs(min) < BASE_VALUE_LIMIT) {
>         encoding = EncodingType.PATCHED_BASE;
>         preparePatchedBlob();
>         return;
>       } else {
>         encoding = EncodingType.DIRECT;
>         return;
>       }
>     } else {
>       // if difference in bits between 95th percentile and 100th percentile is
>       // 0, then patch length will become 0. Hence we will fallback to direct
>       encoding = EncodingType.DIRECT;
>       return;
>     }
> {code}
> All three conditional branch logics have been completed and the return statement is redundant.
> 691-704 line
> {code:java}
>       if (fixedRunLength < MIN_REPEAT) {
>           variableRunLength = fixedRunLength;
>           fixedRunLength = 0;
>           determineEncoding();
>           writeValues();
>         } else if (fixedRunLength >= MIN_REPEAT
>             && fixedRunLength <= MAX_SHORT_REPEAT_LENGTH) {
>           encoding = EncodingType.SHORT_REPEAT;
>           writeValues();
>         } else {
>           encoding = EncodingType.DELTA;
>           isFixedDelta = true;
>           writeValues();
>         }
> {code}
> fixedRunLength >= MIN_REPEAT is redundant, the previous condition already ensures this.  Extract the writeValues() method to the end. It seems better for conditional judgements to deal only with encoding and state.
> 772-781 line
> {code:java}
>           if (fixedRunLength >= MIN_REPEAT) {
>             if (fixedRunLength <= MAX_SHORT_REPEAT_LENGTH) {
>               encoding = EncodingType.SHORT_REPEAT;
>               writeValues();
>             } else {
>               encoding = EncodingType.DELTA;
>               isFixedDelta = true;
>               writeValues();
>             }
>           }
> {code}
> Ditto



--
This message was sent by Atlassian Jira
(v8.3.4#803005)