Posted to dev@parquet.apache.org by "William Butler (Jira)" <ji...@apache.org> on 2021/12/17 00:25:00 UTC

[jira] [Created] (PARQUET-2109) Parquet Cpp Reader Can Loop Forever If Page Values Overstated

William Butler created PARQUET-2109:
---------------------------------------

             Summary: Parquet Cpp Reader Can Loop Forever If Page Values Overstated
                 Key: PARQUET-2109
                 URL: https://issues.apache.org/jira/browse/PARQUET-2109
             Project: Parquet
          Issue Type: Bug
          Components: parquet-cpp
            Reporter: William Butler
            Assignee: William Butler


If the page header states that there are more values than are actually present in the page, the Parquet C++ reader can loop forever. This happens because HasNext() returns true, but the subsequent ReadBatch() call has nothing to read and does not change the reader's state, so the caller never makes progress. We first noticed the bug via ScanFileContents(), but it affects any code that does not check whether ReadBatch() consumed anything.
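
As a rough illustration of the failure mode and of a caller-side guard, here is a minimal, self-contained sketch. FakeColumnReader and DrainWithProgressCheck are hypothetical names invented for this example; they only mimic the HasNext()/ReadBatch() interaction described above and are not the parquet-cpp API or the actual fix.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for a column reader (NOT the parquet-cpp class).
struct FakeColumnReader {
  int64_t stated_values;  // value count claimed by the page header
  int64_t actual_values;  // values really present in the page
  int64_t consumed;       // values handed out so far

  FakeColumnReader(int64_t stated, int64_t actual)
      : stated_values(stated), actual_values(actual), consumed(0) {}

  // Mimics the reported behaviour: decided by the header count alone.
  bool HasNext() const { return consumed < stated_values; }

  // Returns the number of values read; 0 once the real data runs out,
  // in which case the reader state is left unchanged.
  int64_t ReadBatch(int64_t batch_size) {
    assert(batch_size > 0);
    const int64_t remaining = std::max<int64_t>(actual_values - consumed, 0);
    const int64_t n = std::min(batch_size, remaining);
    consumed += n;
    return n;
  }
};

// Drains the reader, but throws instead of spinning when a batch
// makes no progress even though HasNext() claimed more data.
int64_t DrainWithProgressCheck(FakeColumnReader& reader, int64_t batch_size) {
  int64_t total = 0;
  while (reader.HasNext()) {
    const int64_t n = reader.ReadBatch(batch_size);
    if (n == 0) {
      throw std::runtime_error("page header overstates value count");
    }
    total += n;
  }
  return total;
}
```

Without the n == 0 check, a reader constructed as FakeColumnReader(10, 6) would keep the while loop spinning after the sixth value, which is the pattern the report describes.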



--
This message was sent by Atlassian Jira
(v8.20.1#820001)