Posted to dev@parquet.apache.org by "Brian Mwambazi (Jira)" <ji...@apache.org> on 2020/01/14 22:33:00 UTC

[jira] [Resolved] (PARQUET-1768) InternalParquetRecordWriter doesn't immediately limit current row group to thres

     [ https://issues.apache.org/jira/browse/PARQUET-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brian Mwambazi resolved PARQUET-1768.
-------------------------------------
    Resolution: Duplicate

PARQUET-1767

> InternalParquetRecordWriter doesn't immediately limit current row group to thres
> --------------------------------------------------------------------------------
>
>                 Key: PARQUET-1768
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1768
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>            Reporter: Brian Mwambazi
>            Priority: Minor
>
> The MemoryManager adjusts the row group size threshold of writers when the allocated memory pool fills up.
> *Problem*: However, InternalParquetRecordWriter only re-reads the row group size threshold on its next flush, so until that flush it keeps buffering against the old, larger size.
> This opens up the possibility of an OOM error if all writers start at roughly the same time and progress in tandem (I saw this while investigating failing jobs that were writing to disk in Spark).
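The race described above can be sketched in a few lines. This is a minimal illustration, not parquet-mr source; all class and field names below are hypothetical. A "stale" writer compares its buffered size against the threshold it snapshotted at the last flush, while a "rechecking" writer reads the current threshold on every write, so it flushes as soon as the memory manager lowers the limit:

```java
// Hypothetical sketch of the stale-threshold race -- NOT parquet-mr code.
final class SketchWriter {
    long threshold;            // current limit; the memory manager may lower this at any time
    long thresholdAtLastFlush; // snapshot that the write path compares against
    long buffered;             // bytes buffered in the current row group
    final boolean recheckEveryWrite;

    SketchWriter(long initialThreshold, boolean recheckEveryWrite) {
        this.threshold = initialThreshold;
        this.thresholdAtLastFlush = initialThreshold;
        this.recheckEveryWrite = recheckEveryWrite;
    }

    /** Buffers {@code bytes}, flushes when the applicable limit is reached,
     *  and records the peak buffered size in {@code peak[0]}. */
    void write(long bytes, long[] peak) {
        buffered += bytes;
        peak[0] = Math.max(peak[0], buffered);
        long limit = recheckEveryWrite ? threshold : thresholdAtLastFlush;
        if (buffered >= limit) {
            buffered = 0;                      // flush the row group
            thresholdAtLastFlush = threshold;  // old behavior: lowered limit only takes effect here
        }
    }
}
```

With an initial threshold of 100 that the manager then lowers to 50, the stale writer still buffers up to 100 bytes (twice the new limit) before its first flush, while the rechecking writer peaks at 50; across many concurrent writers, that overshoot is the OOM window.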



--
This message was sent by Atlassian Jira
(v8.3.4#803005)