Posted to dev@parquet.apache.org by "Csaba Ringhofer (JIRA)" <ji...@apache.org> on 2018/03/19 18:13:00 UTC

[jira] [Created] (PARQUET-1250) RLE decoding should treat 0 length runs as error

Csaba Ringhofer created PARQUET-1250:
----------------------------------------

             Summary: RLE decoding should treat 0 length runs as error 
                 Key: PARQUET-1250
                 URL: https://issues.apache.org/jira/browse/PARQUET-1250
             Project: Parquet
          Issue Type: Improvement
          Components: parquet-mr
            Reporter: Csaba Ringhofer


RunLengthBitPackingHybridDecoder accepts run headers that encode repeated runs of length 0 and treats them as if they were runs of length 2^32, so effectively every value returned for that data page will be the same (see https://github.com/apache/parquet-mr/blob/0a86429939075984edce5e3b8195dfb7f9e3ab6b/parquet-column/src/main/java/org/apache/parquet/column/values/rle/RunLengthBitPackingHybridDecoder.java#L66 ).

Throwing an exception when the count is 0 would produce a proper error message for some corrupt files and would make it clear that zero-length runs are not legal values.
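
A minimal sketch of the proposed check, written as standalone Java rather than as a patch to RunLengthBitPackingHybridDecoder. The class name RleHeaderCheck, the method names, and the use of IOException are assumptions made for illustration; the real decoder would presumably throw its own decoding exception type. The sketch relies only on the hybrid encoding's header layout: an unsigned varint whose lowest bit is 0 for a repeated run, with the run length in the remaining bits.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of parquet-mr.
class RleHeaderCheck {

    // Reads an unsigned LEB128 varint of at most 5 bytes.
    static int readUnsignedVarInt(InputStream in) throws IOException {
        int value = 0;
        int shift = 0;
        while (true) {
            int b = in.read();
            if (b < 0) {
                throw new EOFException("truncated run header");
            }
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
            shift += 7;
            if (shift > 28) {
                throw new IOException("run header varint is too long");
            }
        }
    }

    // Reads a run header and returns the length of a repeated run,
    // rejecting a length of 0 instead of letting the caller count down from it.
    static int readRepeatedRunLength(InputStream in) throws IOException {
        int header = readUnsignedVarInt(in);
        if ((header & 1) != 0) {
            throw new IllegalArgumentException("header describes a bit-packed run");
        }
        int count = header >>> 1;
        if (count == 0) {
            throw new IOException("corrupt RLE data: repeated run of length 0");
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Header byte 0x06 = (3 << 1) | 0: a repeated run of 3 values.
        System.out.println(readRepeatedRunLength(new ByteArrayInputStream(new byte[] {0x06})));
        try {
            // Header byte 0x00 encodes a repeated run of length 0.
            readRepeatedRunLength(new ByteArrayInputStream(new byte[] {0x00}));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a check like this, a zero-length run in a corrupt file surfaces as an error at the header rather than as a page of identical values.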



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)