You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@orc.apache.org by "Dongjoon Hyun (Jira)" <ji...@apache.org> on 2021/08/10 18:27:00 UTC
[jira] [Updated] (ORC-931) Optimize RunLengthIntegerWriterV2 code
for better readability
[ https://issues.apache.org/jira/browse/ORC-931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun updated ORC-931:
------------------------------
Affects Version/s: 1.8.0
1.7.0
> Optimize RunLengthIntegerWriterV2 code for better readability
> -------------------------------------------------------------
>
> Key: ORC-931
> URL: https://issues.apache.org/jira/browse/ORC-931
> Project: ORC
> Issue Type: Improvement
> Affects Versions: 1.7.0, 1.8.0
> Reporter: Yiqun Zhang
> Assignee: Yiqun Zhang
> Priority: Trivial
> Fix For: 1.7.0, 1.8.0
>
>
> RunLengthIntegerWriterV2.java
> 512-546 line
> {code:java}
> if (diffBitsLH > 1) {
> for (int i = 0; i < numLiterals; i++) {
> baseRedLiterals[i] = literals[i] - min;
> }
> brBits95p = utils.percentileBits(baseRedLiterals, 0, numLiterals, 0.95);
> brBits100p = utils.percentileBits(baseRedLiterals, 0, numLiterals, 1.0);
> if ((brBits100p - brBits95p) != 0 && Math.abs(min) < BASE_VALUE_LIMIT) {
> encoding = EncodingType.PATCHED_BASE;
> preparePatchedBlob();
> return;
> } else {
> encoding = EncodingType.DIRECT;
> return;
> }
> } else {
> // if difference in bits between 95th percentile and 100th percentile is
> // 0, then patch length will become 0. Hence we will fallback to direct
> encoding = EncodingType.DIRECT;
> return;
> }
> {code}
> All three conditional branch logics have been completed and the return statement is redundant.
> 691-704 line
> {code:java}
> if (fixedRunLength < MIN_REPEAT) {
> variableRunLength = fixedRunLength;
> fixedRunLength = 0;
> determineEncoding();
> writeValues();
> } else if (fixedRunLength >= MIN_REPEAT
> && fixedRunLength <= MAX_SHORT_REPEAT_LENGTH) {
> encoding = EncodingType.SHORT_REPEAT;
> writeValues();
> } else {
> encoding = EncodingType.DELTA;
> isFixedDelta = true;
> writeValues();
> }
> {code}
> fixedRunLength >= MIN_REPEAT is redundant, the previous condition already ensures this. Extract the writeValues() method to the end. It seems better for conditional judgements to deal only with encoding and state.
> 772-781 line
> {code:java}
> if (fixedRunLength >= MIN_REPEAT) {
> if (fixedRunLength <= MAX_SHORT_REPEAT_LENGTH) {
> encoding = EncodingType.SHORT_REPEAT;
> writeValues();
> } else {
> encoding = EncodingType.DELTA;
> isFixedDelta = true;
> writeValues();
> }
> }
> {code}
> Ditto
--
This message was sent by Atlassian Jira
(v8.3.4#803005)