Posted to issues@orc.apache.org by "Norbert Luksa (Jira)" <ji...@apache.org> on 2020/02/14 14:29:00 UTC

[jira] [Assigned] (ORC-601) Add more debug info to error messages in the scanner

     [ https://issues.apache.org/jira/browse/ORC-601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Norbert Luksa reassigned ORC-601:
---------------------------------

    Assignee: Norbert Luksa

> Add more debug info to error messages in the scanner
> ----------------------------------------------------
>
>                 Key: ORC-601
>                 URL: https://issues.apache.org/jira/browse/ORC-601
>             Project: ORC
>          Issue Type: Improvement
>            Reporter: Norbert Luksa
>            Assignee: Norbert Luksa
>            Priority: Minor
>              Labels: c++
>
> There are some exceptions which would be easier to debug if we had more debug info at hand. For instance, one frequently encountered error when Impala has stale metadata of an ORC file is:
> {code:java}
> Invalid ORC postscript length
> {code}
> It would be better to also print the postscript length we read and the file size, so users can tell whether the file is corrupt (and the data needs to be regenerated) or the metadata is stale (and needs a refresh).
> Also, there are cases where the same kind of failure produces different messages. E.g., in ColumnReader.cc, [DoubleColumnReader::readByte|https://github.com/apache/orc/blob/master/c%2B%2B/src/ColumnReader.cc#L417] throws {code:c++}ParseError("bad read in DoubleColumnReader::next()");{code} on failing to read from the stream, while [Decimal64ColumnReader::readBuffer|https://github.com/apache/orc/blob/master/c%2B%2B/src/ColumnReader.cc#L1401] throws {code:c++}ParseError("Read past end of stream in Decimal64ColumnReader " + valueStream->getName());{code}
> It would be nice to unify these.
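
A minimal sketch of what the two requests in the description could look like. It is not the actual ORC change: the ParseError class below is a local stand-in for orc::ParseError so the snippet compiles on its own, and the helper names (invalidPostscriptLengthMessage, readPastEndMessage, checkPostscriptLength) and the exact message wording are assumptions made for illustration.

```cpp
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>

// Local stand-in for orc::ParseError so this sketch is self-contained.
class ParseError : public std::runtime_error {
 public:
  explicit ParseError(const std::string& what) : std::runtime_error(what) {}
};

// Hypothetical helper: include both numbers a user needs to tell a corrupt
// file from stale metadata (the length we read and the size we believed).
std::string invalidPostscriptLengthMessage(uint64_t postscriptLength,
                                           uint64_t fileLength) {
  std::ostringstream out;
  out << "Invalid ORC postscript length: " << postscriptLength
      << ", file length = " << fileLength;
  return out.str();
}

// Hypothetical helper: one message format for every "ran out of stream"
// failure, parameterized by the reader and the stream name.
std::string readPastEndMessage(const std::string& readerName,
                               const std::string& streamName) {
  return "Read past end of stream in " + readerName + " " + streamName;
}

// Hypothetical check: the last byte of an ORC file stores the postscript
// length, so the postscript must fit in the remaining fileLength - 1 bytes.
void checkPostscriptLength(uint64_t postscriptLength, uint64_t fileLength) {
  if (fileLength == 0 || postscriptLength > fileLength - 1) {
    throw ParseError(
        invalidPostscriptLengthMessage(postscriptLength, fileLength));
  }
}
```

With helpers along these lines, both column readers could throw ParseError(readPastEndMessage("DoubleColumnReader", stream->getName())) style messages, differing only in the reader name passed in.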



--
This message was sent by Atlassian Jira
(v8.3.4#803005)