Posted to issues@orc.apache.org by "David Mollitor (Jira)" <ji...@apache.org> on 2021/12/23 19:48:00 UTC
[jira] [Assigned] (ORC-1063) Avoid ORC Reader Max Length Confusion
[ https://issues.apache.org/jira/browse/ORC-1063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Mollitor reassigned ORC-1063:
-----------------------------------
> Avoid ORC Reader Max Length Confusion
> -------------------------------------
>
> Key: ORC-1063
> URL: https://issues.apache.org/jira/browse/ORC-1063
> Project: ORC
> Issue Type: Improvement
> Components: Java
> Affects Versions: 1.7.0
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Minor
>
> I just came across this confusion in the wild (i.e., in a production system).
> {code:java|title=ReaderImpl.java}
> @Override
> public String toString() {
>   StringBuilder buffer = new StringBuilder();
>   buffer.append("ORC Reader(");
>   buffer.append(path);
>   if (maxLength != -1) {
>     buffer.append(", ");
>     buffer.append(maxLength);
>   }
>   buffer.append(")");
>   return buffer.toString();
> }
> {code}
> {code:java|title=OrcConf.java}
> MAX_FILE_LENGTH("orc.max.file.length", "orc.max.file.length", Long.MAX_VALUE,
>     "The maximum size of the file to read for finding the file tail. This\n" +
>     "is primarily used for streaming ingest to read intermediate\n" +
>     "footers while the file is still open"),
> {code}
> https://github.com/apache/orc/blob/883aae8757257a8314c0ece07e5ef0238600717c/java/core/src/java/org/apache/orc/impl/ReaderImpl.java#L1107-L1109
> There seems to be some confusion here about how to express "there is no maximum length." The configuration defaults to {{Long.MAX_VALUE}} to mean "no maximum," but the {{toString()}} code expects "no maximum" to be -1. I came across this because some logging indicated that I had a file of length ~9000PB, which is just {{Long.MAX_VALUE}} bytes printed as if it were a real limit. This did not make any sense and was confusing.
> I suggest changing this so that any value less than 0 denotes "no maximum," and using a Java {{Optional}} to avoid this confusion in the future.
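One way the suggestion above might look in practice is sketched below. This is only an illustration under stated assumptions, not the actual ORC change: the class MaxLengthSketch and its methods normalize and describe are hypothetical names, OptionalLong stands in for the proposed Optional, and treating Long.MAX_VALUE as "no maximum" alongside negative values is an added assumption meant to keep the existing default working.

```java
import java.util.OptionalLong;

/** Hypothetical helper for illustration; not part of the ORC code base. */
final class MaxLengthSketch {
  private MaxLengthSketch() {}

  /**
   * Normalizes a raw configured limit. Any negative value means "no maximum".
   * Assumption: Long.MAX_VALUE (the current default of orc.max.file.length)
   * is also treated as "no maximum" so existing configurations behave the same.
   */
  static OptionalLong normalize(long rawMaxLength) {
    if (rawMaxLength < 0 || rawMaxLength == Long.MAX_VALUE) {
      return OptionalLong.empty();
    }
    return OptionalLong.of(rawMaxLength);
  }

  /** Same output shape as the toString() quoted above, but the limit is printed only when one is set. */
  static String describe(String path, OptionalLong maxLength) {
    StringBuilder buffer = new StringBuilder("ORC Reader(").append(path);
    maxLength.ifPresent(len -> buffer.append(", ").append(len));
    return buffer.append(")").toString();
  }
}
```

With this shape, both sentinel values collapse to a single empty state at the point where the configuration is read, so code such as toString() never has to compare against -1 or Long.MAX_VALUE directly.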
--
This message was sent by Atlassian Jira
(v8.20.1#820001)