Posted to dev@parquet.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/12/17 00:37:00 UTC

[jira] [Updated] (PARQUET-2109) Parquet Cpp Reader Can Loop Forever If Page Values Overstated

     [ https://issues.apache.org/jira/browse/PARQUET-2109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated PARQUET-2109:
------------------------------------
    Labels: pull-request-available  (was: )

> Parquet Cpp Reader Can Loop Forever If Page Values Overstated
> -------------------------------------------------------------
>
>                 Key: PARQUET-2109
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2109
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-cpp
>            Reporter: William Butler
>            Assignee: William Butler
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> If the page header states that there are more values than are actually present in the page, the Parquet C++ reader can loop forever. HasNext() returns true, but ReadBatch() finds nothing to read and leaves the reader state unchanged, so a caller that loops on HasNext() never terminates. We first noticed the bug via ScanFileContents(), but it affects any code that does not check whether ReadBatch() consumed anything.
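
The failure mode described above can be illustrated with a minimal sketch. MockColumnReader below is a hypothetical stand-in, not the real parquet-cpp reader: its ReadBatch() takes only a batch size and returns the number of values read, which is a simplification of the real signature. DrainColumn and DrainResult are likewise illustrative names. The sketch only shows the caller-side guard: stop when ReadBatch() consumes nothing even though HasNext() is still true.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a column reader (NOT the real parquet-cpp class).
// declared_values is what the page header claims; actual_values is what the
// page really contains.
class MockColumnReader {
 public:
  MockColumnReader(int64_t declared_values, int64_t actual_values)
      : declared_(declared_values), actual_(actual_values) {}

  // Mirrors the reported behavior: trusts the count from the page header.
  bool HasNext() const { return consumed_ < declared_; }

  // Returns the number of values read. Once the real data is exhausted it
  // returns 0 and leaves the reader state unchanged.
  int64_t ReadBatch(int64_t batch_size) {
    assert(batch_size > 0);
    const int64_t n = std::min(batch_size, actual_ - consumed_);
    consumed_ += n;
    return n;
  }

 private:
  int64_t declared_;
  int64_t actual_;
  int64_t consumed_ = 0;
};

struct DrainResult {
  int64_t values_read;  // total values actually consumed
  bool truncated;       // true if the header overstated the value count
};

// Reads until the reader is exhausted, but bails out if a batch makes no
// progress instead of spinning on HasNext() forever.
DrainResult DrainColumn(MockColumnReader& reader, int64_t batch_size) {
  DrainResult result{0, false};
  while (reader.HasNext()) {
    const int64_t n = reader.ReadBatch(batch_size);
    if (n == 0) {
      result.truncated = true;  // no progress: treat the page as malformed
      break;
    }
    result.values_read += n;
  }
  return result;
}
```

With a reader constructed as MockColumnReader(10, 4), a loop without the n == 0 check would never exit, while DrainColumn stops after the four real values and flags the mismatch. Whether a real caller should report an error or silently stop in that case is a design decision this sketch does not settle.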



--
This message was sent by Atlassian Jira
(v8.20.1#820001)