Posted to dev@parquet.apache.org by "Csaba Ringhofer (JIRA)" <ji...@apache.org> on 2018/03/19 18:13:00 UTC
[jira] [Created] (PARQUET-1250) RLE decoding should treat 0 length runs as error
Csaba Ringhofer created PARQUET-1250:
----------------------------------------
Summary: RLE decoding should treat 0 length runs as error
Key: PARQUET-1250
URL: https://issues.apache.org/jira/browse/PARQUET-1250
Project: Parquet
Issue Type: Improvement
Components: parquet-mr
Reporter: Csaba Ringhofer
RunLengthBitPackingHybridDecoder accepts run headers that encode repeated (RLE) runs of length 0 and treats them as if they were runs of length 2^32, so effectively every value returned for that data page will be the same. (See https://github.com/apache/parquet-mr/blob/0a86429939075984edce5e3b8195dfb7f9e3ab6b/parquet-column/src/main/java/org/apache/parquet/column/values/rle/RunLengthBitPackingHybridDecoder.java#L66 )
Throwing an exception if count is 0 would give a proper error message for some corrupt files, and would make it clear that these are not legal values.
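A minimal sketch of the proposed check, not the actual parquet-mr code: the class name RleHeaderCheck, the method readRunCount, and the use of a plain IOException are all assumptions made for illustration. It relies only on the hybrid encoding's header layout: the header is an unsigned varint whose lowest bit selects the mode (0 = repeated run, 1 = bit-packed run) and whose remaining bits give the run length, or the number of 8-value groups for bit-packed runs.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only; not part of parquet-mr.
final class RleHeaderCheck {

    // Reads an unsigned LEB128 varint (7 payload bits per byte, low bits first).
    static int readUnsignedVarInt(InputStream in) throws IOException {
        int value = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            int b = in.read();
            if (b < 0) {
                throw new EOFException("truncated run header");
            }
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
        }
        throw new IOException("run header varint is too long");
    }

    // Reads one run header and returns the number of values in that run.
    // A repeated run with count 0 is rejected instead of being decoded.
    static int readRunCount(InputStream in) throws IOException {
        int header = readUnsignedVarInt(in);
        boolean packed = (header & 1) == 1;
        if (packed) {
            // Bit-packed run: the header holds the number of 8-value groups.
            // This sketch leaves that path unchecked.
            return (header >>> 1) * 8;
        }
        int count = header >>> 1;
        if (count == 0) {
            // Assumption: a plain IOException stands in for whatever
            // exception type the real decoder would throw.
            throw new IOException("corrupt data: RLE run with length 0");
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // 0x06 = binary 110: repeated run, count 3.
        System.out.println(readRunCount(new ByteArrayInputStream(new byte[] {0x06})));
        try {
            // 0x00: repeated run, count 0, which the check rejects.
            readRunCount(new ByteArrayInputStream(new byte[] {0x00}));
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a check like this, a corrupt page fails at the offending header rather than silently returning the same value for the rest of the page.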