You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@orc.apache.org by "Yiqun Zhang (Jira)" <ji...@apache.org> on 2021/08/10 09:24:00 UTC
[jira] [Created] (ORC-931) Optimize RunLengthIntegerWriterV2 code
for better readability
Yiqun Zhang created ORC-931:
-------------------------------
Summary: Optimize RunLengthIntegerWriterV2 code for better readability
Key: ORC-931
URL: https://issues.apache.org/jira/browse/ORC-931
Project: ORC
Issue Type: Improvement
Reporter: Yiqun Zhang
RunLengthIntegerWriterV2.java
512-546 line
{code:java}
if (diffBitsLH > 1) {
for (int i = 0; i < numLiterals; i++) {
baseRedLiterals[i] = literals[i] - min;
}
brBits95p = utils.percentileBits(baseRedLiterals, 0, numLiterals, 0.95);
brBits100p = utils.percentileBits(baseRedLiterals, 0, numLiterals, 1.0);
if ((brBits100p - brBits95p) != 0 && Math.abs(min) < BASE_VALUE_LIMIT) {
encoding = EncodingType.PATCHED_BASE;
preparePatchedBlob();
return;
} else {
encoding = EncodingType.DIRECT;
return;
}
} else {
// if difference in bits between 95th percentile and 100th percentile is
// 0, then patch length will become 0. Hence we will fallback to direct
encoding = EncodingType.DIRECT;
return;
}
{code}
All three conditional branch logics have been completed and the return statement is redundant.
691-704 line
{code:java}
if (fixedRunLength < MIN_REPEAT) {
variableRunLength = fixedRunLength;
fixedRunLength = 0;
determineEncoding();
writeValues();
} else if (fixedRunLength >= MIN_REPEAT
&& fixedRunLength <= MAX_SHORT_REPEAT_LENGTH) {
encoding = EncodingType.SHORT_REPEAT;
writeValues();
} else {
encoding = EncodingType.DELTA;
isFixedDelta = true;
writeValues();
}
{code}
fixedRunLength >= MIN_REPEAT is redundant, the previous condition already ensures this. Extract the writeValues() method to the end. It seems better for conditional judgements to deal only with encoding and state.
772-781 line
{code:java}
if (fixedRunLength >= MIN_REPEAT) {
if (fixedRunLength <= MAX_SHORT_REPEAT_LENGTH) {
encoding = EncodingType.SHORT_REPEAT;
writeValues();
} else {
encoding = EncodingType.DELTA;
isFixedDelta = true;
writeValues();
}
}
{code}
Ditto
--
This message was sent by Atlassian Jira
(v8.3.4#803005)