Posted to dev@orc.apache.org by "Norbert Luksa (Jira)" <ji...@apache.org> on 2020/02/14 14:19:00 UTC
[jira] [Created] (ORC-601) Add more debug info to error messages in the scanner
Norbert Luksa created ORC-601:
---------------------------------
Summary: Add more debug info to error messages in the scanner
Key: ORC-601
URL: https://issues.apache.org/jira/browse/ORC-601
Project: ORC
Issue Type: Improvement
Reporter: Norbert Luksa
There are some exceptions which would be easier to debug if we had more debug info at hand. For instance, one frequently encountered error when Impala has stale metadata of an ORC file is:
{code:java}
Invalid ORC postscript length
{code}
It'd be better to also print the postscript length we read and the file size, so users can tell whether the file is corrupt (and the data needs to be regenerated) or the metadata is stale (and needs a refresh).
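A minimal sketch of what such a message could look like. Everything here is hypothetical and only for illustration: the ParseError class is a stand-in so the snippet compiles on its own, describeBadPostscript and checkPostscriptLength are not existing ORC functions, and the exact wording of the message is an assumption. The check assumes the postscript plus the one-byte length field at the end of the file must fit in the file.

```cpp
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>

// Stand-in for the library's ParseError so this sketch is self-contained.
class ParseError : public std::runtime_error {
 public:
  explicit ParseError(const std::string& what) : std::runtime_error(what) {}
};

// Hypothetical helper: builds a message that carries both numbers the
// user needs in order to tell a corrupt file from stale metadata.
std::string describeBadPostscript(uint64_t postscriptLength, uint64_t fileSize) {
  std::ostringstream msg;
  msg << "Invalid ORC postscript length: " << postscriptLength
      << ", file length = " << fileSize;
  return msg.str();
}

// Hypothetical check: the postscript and the trailing one-byte length
// field must both fit inside the file.
void checkPostscriptLength(uint64_t postscriptLength, uint64_t fileSize) {
  if (postscriptLength + 1 > fileSize) {
    throw ParseError(describeBadPostscript(postscriptLength, fileSize));
  }
}
```

With this, a postscript length of 300 read from a file the reader believes to be 120 bytes long would be reported together with both values, which hints at stale metadata rather than a corrupt file.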
Also, there are some cases where the same kind of exception results in different messages, e.g. in ColumnReader.cc, [DoubleColumnReader::readByte|https://github.com/apache/orc/blob/master/c%2B%2B/src/ColumnReader.cc#L417] throws {code:c++}ParseError("bad read in DoubleColumnReader::next()");{code} on failing to read from the stream, while [Decimal64ColumnReader::readBuffer|https://github.com/apache/orc/blob/master/c%2B%2B/src/ColumnReader.cc#L1401] throws {code:c++}ParseError("Read past end of stream in Decimal64ColumnReader " + valueStream->getName());{code}
It would be nice to unify these.
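One possible way to unify them, sketched under assumptions: a single helper builds the message, so every reader reports the same failure in the same format. The helper name readPastEndOfStream is hypothetical, the ParseError class is a stand-in so the snippet compiles on its own, and the message format simply follows the second of the two messages quoted above.

```cpp
#include <stdexcept>
#include <string>

// Stand-in for the library's ParseError so this sketch is self-contained.
class ParseError : public std::runtime_error {
 public:
  explicit ParseError(const std::string& what) : std::runtime_error(what) {}
};

// Hypothetical helper: one place that formats the "ran out of data"
// error, parameterized by the reader and the stream it was reading.
ParseError readPastEndOfStream(const std::string& readerName,
                               const std::string& streamName) {
  return ParseError("Read past end of stream in " + readerName + " " +
                    streamName);
}
```

Both readers could then throw readPastEndOfStream("<reader name>", <stream>->getName()) and the messages would differ only in the reader and stream names.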